I've been racking my brain for two days trying to work out why we are seeing less-than-impressive FCP throughput on our FAS. A little background:
- 5 x ESX 4.x+
- All using FC SAN Protocols
- HUK installed
- ALUA Enabled
- 1 LUN (Datastore) per volume
- The LUN and VMDK I tested from are properly aligned
- FC HBA w/ Dual Channels
- About 10 VMs on SAN
- FAS2020 HA Pair
- 7.3.6
- 1st Filer (primary production)
- 1 x aggr0
- Currently 24 x FC 15k 300GB disks in two equal RAID groups (about to add another 12 disk RG)
- 3 DS14MK4 shelves on 2Gb FC-AL
- 4Gb FC to Brocade Fabric
- 2 x Brocade 200E Silkworm Fabric
I have done extensive research and found a few people reporting low throughput. My question is: is this to be expected on a FAS2020, or can I do better?
Here are my test results from IOmeter. I have also done some testing with SIO, with ultimately the same results. As you can see from the results below, IOps are not an issue (at least for this guy). My problem seems to appear as the I/O size grows: in the SQL (64k) and Throughput Stress (1MB) tests, throughput seems to be the bottleneck. All tests ran with 128 outstanding IOs in both SIO and IOmeter.
Data Store | Access Specification Name | # Managers | # Workers | # Disks | IOps | Read IOps | Write IOps | MBps | Read MBps | Write MBps |
FC | 4K Real Life Mix | 1 | 1 | 1 | 3905.133 | 2925.615695 | 979.517742 | 15.25443 | 11.428186 | 3.826241 |
FC | NetApp Recom. 100% Write 100% Random | 1 | 1 | 1 | 4912.214 | 0 | 4912.214343 | 19.18834 | 0 | 19.188337 |
FC | SQL Realworld | 1 | 1 | 1 | 697.6488 | 459.093069 | 238.555713 | 43.60305 | 28.693317 | 14.909732 |
FC | Exchange 2003 | 1 | 1 | 1 | 4291.202 | 2563.780681 | 1727.421739 | 16.76251 | 10.014768 | 6.747741 |
FC | Throughput Stress | 1 | 1 | 1 | 38.9281 | 38.928096 | 0 | 38.9281 | 38.928096 | 0 |
Sysstat, lun stats, and fc stats all match what IOmeter is seeing, so I feel confident that instrumentation is not skewing my results.
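As an extra cross-check on the table itself, the MBps column should simply be IOps times the I/O size. Here is a minimal Python sketch of that arithmetic. The 64k and 1MB sizes are the ones stated above; the 4 KiB sizes for the other three tests are an assumption inferred from the test name and from the MBps/IOps ratio, and the function name `mbps_from_iops` is just something made up for illustration.

```python
# (test name, IOps from the table, assumed block size in KiB, MBps from the table)
# 64 KiB and 1024 KiB are stated in the post; the 4 KiB entries are inferred.
results = [
    ("4K Real Life Mix", 3905.133, 4, 15.25443),
    ("NetApp Recom. 100% Write 100% Random", 4912.214, 4, 19.18834),
    ("SQL Realworld", 697.6488, 64, 43.60305),
    ("Exchange 2003", 4291.202, 4, 16.76251),
    ("Throughput Stress", 38.9281, 1024, 38.9281),
]


def mbps_from_iops(iops, block_kib):
    """Throughput in MB/s (1 MB = 1024 KiB) for a given IOps and block size."""
    return iops * block_kib / 1024.0


for name, iops, block_kib, reported in results:
    computed = mbps_from_iops(iops, block_kib)
    print(f"{name:38s} computed {computed:8.3f} MB/s, reported {reported:8.3f} MB/s")
```

If the computed and reported columns agree, the numbers are at least internally consistent: the small-block tests are IOps-bound, and the large-block tests are the ones hitting the roughly 40 MB/s ceiling.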
My reason for highlighting the 2Gb FC-AL above is that I wonder whether it might be my issue. I have yet to find a way to get any stats from an AL port. I have looked over which describes what I'm seeing but really doesn't provide any answers.
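For a rough sense of scale, here is a small Python sketch comparing the observed ceiling against the commonly quoted nominal Fibre Channel rates (about 200 MB/s per direction for 2Gb and 400 MB/s for 4Gb). These are nominal figures, not measurements from this filer; a loop is shared by every disk on it, so real usable bandwidth will be lower. The helper name `utilization` is made up for illustration.

```python
# Commonly quoted nominal per-direction rates; assumptions, not measured values.
NOMINAL_MB_PER_S = {
    "2Gb FC-AL (disk loop)": 200.0,
    "4Gb FC (fabric)": 400.0,
}


def utilization(observed_mb_s, link):
    """Fraction of the nominal link rate that the observed throughput represents."""
    return observed_mb_s / NOMINAL_MB_PER_S[link]


observed = 43.6  # best MB/s figure from the IOmeter table (SQL Realworld)
for link in NOMINAL_MB_PER_S:
    print(f"{link}: {utilization(observed, link):.0%} of nominal")
```

This does not rule the loop out, but if the nominal figures are anywhere near right, roughly 40 MB/s is well short of what a single 2Gb loop should carry, which would point toward looking at other parts of the path as well.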
I have verified alignment from VMFS to VMDK using a perfstat and SMC.
I have even tried hammering the filer from multiple ESX servers at once, and each one gets about 20 MB/s, so 40 MB/s in total seems to be about it. Does anyone have any suggestions as to where I can start looking before calling support?