acoffey001,
Based on your description, you are not suffering from a *storage array* performance issue. You're most likely hitting a well-known, but little-understood, host-based filesystem performance issue. On every major operating system, reading large numbers of small files incurs a lot of O/S overhead, because so much time is spent at the O/S level performing seek(), open(), and close() operations on each and every file you process. These operations don't take much time for a single file, but they add up quickly across hundreds or thousands of small files. The problem shows up during activities like backups, virus scans, content indexing, etc. With really small files like 4K, you can literally spend more time finding, opening, and closing a file than you do reading its data.

For one customer with over 1 million small files, we calculated that they were losing multiple *hours* of their backup window to filesystem processing alone. The fact that this VM sits on top of VMFS probably adds even more latency, but I can't say how much.
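If you want to see this effect on your own host, here's a minimal sketch that times the full open/read/close cycle across many small files. It assumes a POSIX system, and the file count, the 4K size, and the file names are made-up examples for illustration, not numbers from your environment:

/* Minimal sketch (POSIX assumed): times the open/read/close
 * cycle across many small files to show the per-file overhead.
 * NFILES, FILESIZE, and the file names are arbitrary examples. */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <time.h>

#define NFILES   10000
#define FILESIZE 4096    /* 4K files, as in the scenario above */

static double now_sec(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void) {
    char name[64], buf[FILESIZE];
    memset(buf, 'x', sizeof buf);

    /* Create the test files first (not timed). */
    for (int i = 0; i < NFILES; i++) {
        snprintf(name, sizeof name, "small_%05d.dat", i);
        int fd = open(name, O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }
        (void)write(fd, buf, sizeof buf);
        close(fd);
    }

    /* Time the full open/read/close cycle, file by file. */
    double t0 = now_sec();
    for (int i = 0; i < NFILES; i++) {
        snprintf(name, sizeof name, "small_%05d.dat", i);
        int fd = open(name, O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }
        (void)read(fd, buf, sizeof buf);
        close(fd);
    }
    double elapsed = now_sec() - t0;

    printf("%d files in %.3f s => %.1f us of open/read/close per file\n",
           NFILES, elapsed, elapsed / NFILES * 1e6);
    return 0;
}

Multiply that per-file number by your actual file count and you can estimate how much of a backup window goes to filesystem processing rather than data transfer.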
Only after the O/S has located the file in the filesystem and opened it does it actually get around to reading the contents and talking to the storage array. That's why everything looks fine from the storage array performance perspective: the array is responding very quickly to each read() request from the O/S, especially with your high cache hit rate.
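On Linux you can confirm this split yourself by running something like strace -c against the process doing the reads (or against the demo above); the summary it prints shows how the time divides between open(), read(), and close(), and in the small-file case the metadata calls usually dominate.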
I created a document a few years ago that explained this issue, including a few examples that showed the actual O/S system calls made while processing each small file and how much time each operation took. I'll see if I can dig it up and post it here.
Reid