2012-02-01 04:05 AM
Hi guys, we are facing a performance issue on our storage system. We have two FAS3140 controllers averaging 80% CPU usage, the user experience is too slow, and after running a few commands I'm seeing huge access times for the volumes (a short example):
CNTL-2> stats show -i 10 volume:NH2_7K_SDC_NAS:read_latency volume:NH2_7K_SDC_NAS:write_latency
Instance read_latency write_latency
NH2_7K_SDC_NAS 176746.92 255912.20
NH2_7K_SDC_NAS 197769.45 257111.93
NH2_7K_SDC_NAS 166181.92 438517.48
NH2_7K_SDC_NAS 208290.45 340686.83
NH2_7K_SDC_NAS 173304.80 237109.22
NH2_7K_SDC_NAS 210693.25 275884.64
NH2_7K_SDC_NAS 162281.29 300198.90
NH2_7K_SDC_NAS 156559.38 283601.79
The same happens on controller #1 with another volume.
I don't know if this is related to storage space and fragmentation, or if it's an IOPS problem, because I'm seeing a few volumes being accessed a lot:
CNTL-2> stats show -i 10 volume:SARM_TMG_B:read_ops volume:SARM_TMG_B:write_ops volume:SARM_TMG_B:total_ops
Instance read_ops write_ops total_ops
/s /s /s
SARM_TMG_B 50 930 1013
SARM_TMG_B 99 1140 1279
SARM_TMG_B 93 963 1092
SARM_TMG_B 89 1028 1156
SARM_TMG_B 92 1009 1138
SARM_TMG_B 86 1057 1184
Our disk infrastructure is: 2x DS4243 SAS (24*450GB) and 1x DS4243 SATA (24*1TB) per head unit (and we have two).
2012-02-01 07:09 AM
You haven't stated which drives the volumes are on, but on average you get 75 to 100 IOPS per SATA drive. Always work with the 75 mark, because that is the minimum you can expect. With 24 drives, assuming RAID-DP and two spares, you get 20*75 = 1,500 IOPS. You are averaging 1,144 IOPS, i.e. about 76% utilization.
Assuming the same layout for the 450GB drives, and that they are 15k SAS drives, you are looking at 175-200 IOPS per drive; again using 175 with RAID-DP and two spares, 175*20 = 3,500 IOPS, so you should not be disk bound there.
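The arithmetic above can be sketched quickly on any client shell (the per-drive IOPS figures are the rule-of-thumb estimates from this post, not measured values):

```shell
# Rough usable-IOPS ceiling per 24-disk shelf, assuming RAID-DP (2 parity)
# and 2 hot spares, as described above.
DATA_DISKS=$((24 - 2 - 2))            # 20 data disks
SATA_IOPS=$((DATA_DISKS * 75))        # conservative 7.2k SATA estimate -> 1500
SAS_IOPS=$((DATA_DISKS * 175))        # conservative 15k SAS estimate   -> 3500
echo "SATA shelf: ${SATA_IOPS} IOPS, SAS shelf: ${SAS_IOPS} IOPS"
```

At a measured average of ~1,144 total ops/s on the SATA shelf, that puts you around 76% of the conservative ceiling.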
What are "sysstat -c 60 -s -x 1" and "sysstat -M -c 20 -s -x 1" showing? The first command grabs 60 seconds' worth of general statistics, in one-second intervals, and then prints a summary with high/low/average values. The second shows how the individual CPUs are actually being used; the generic "sysstat -x" CPU figure should not be used to monitor actual CPU usage on multi-core systems.
Also, are you running NFS on the volume(s)? NFS mount settings are very important; options like noac and actimeo=0 in the client mount settings will bring a 3140 to its knees.
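For illustration, here is what those problematic options look like in a client's fstab versus a saner default. The paths and mount points are hypothetical examples, not taken from this thread; check your actual clients with "mount | grep nfs" or "cat /proc/mounts":

```shell
# Hypothetical /etc/fstab entries for illustration only.

# Problematic: noac / actimeo=0 disable client attribute caching,
# so every stat() and open() round-trips to the filer:
# filer:/vol/NH2_7K_SDC_NAS  /mnt/nas  nfs  rw,noac        0 0
# filer:/vol/NH2_7K_SDC_NAS  /mnt/nas  nfs  rw,actimeo=0   0 0

# Reasonable alternative: let the client cache attributes (the kernel
# default is 3-60s; actimeo=30 pins all four ac* timeouts to 30s):
# filer:/vol/NH2_7K_SDC_NAS  /mnt/nas  nfs  rw,hard,actimeo=30  0 0
```

Only use noac/actimeo=0 where an application genuinely needs cache-coherent metadata across clients; for general file serving they generate a large stream of GETATTR traffic against the controller.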