I've got an odd one and I'm hoping some of the SQL gurus can help me.
Windows Server 2008 R2
Microsoft Failover Clustering
Microsoft SQL 2008 R2
FAS 3160 w/ 7.3.6 (igroup ALUA enabled, MS DSM)
32x 600GB SAS disks in 2x 16-disk RGs: 79% utilized/provisioned.
2x Cisco DS-X9148 4Gb/s FC blades
HP BL460c G7 w/ FC expansion card and two passthrough switches.
The DBA contacted me, concerned because she's been seeing disk latency >10s (not ms, seconds!) in perfmon for the last 3m. I was a little surprised, as dfm sends me an e-mail any time there's >50ms of latency for 60s. I started digging into it and found a momentary spike up to 100ms when the performance problems started. Then things settled down to ~15ms. For the next hour perfmon consistently showed latency >10s, but ONTAP showed 15ms or lower (average of 9ms). This is a huge disconnect.
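To illustrate why I never got an e-mail for that spike, here's a minimal sketch of a "threshold sustained for a window" check. This is not how dfm actually implements it; the function name, the 10-second sampling interval, and the sample values are all made up for illustration. Only the 50ms / 60s rule and the 100ms spike come from what I described above.

```python
def sustained_breach(samples_ms, threshold_ms=50.0, window_s=60, interval_s=10):
    """Return True if latency stays above threshold_ms for a full window_s.

    samples_ms: latency readings taken every interval_s seconds (assumed).
    """
    needed = window_s // interval_s  # consecutive samples required to alert
    run = 0
    for value in samples_ms:
        # Extend the run on a breach, reset it on any sample at or below threshold
        run = run + 1 if value > threshold_ms else 0
        if run >= needed:
            return True
    return False


# Hypothetical readings: one momentary spike to 100ms, then back to ~15ms
spike = [9, 12, 100, 15, 14, 9, 8, 15]
# Hypothetical readings: latency above 50ms for more than 60s
sustained = [60, 70, 100, 80, 65, 55, 52]

print(sustained_breach(spike))      # False
print(sustained_breach(sustained))  # True
```

So a single 100ms sample is consistent with no alert firing, but it does nothing to explain an hour of >10s readings on the host side.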
My first thought was that an SFP was flaking out, but there's no sign of that in the HP, Cisco, or NetApp error logs.
I did have a number of "B" (back-to-back) CPs at the time but no "b" (deferred) ones. NetApp CPU was around 60% utilized (that's the average across the four cores, not the level of the busiest core, which for some reason sysstat still shows by default).
So, if anyone has any ideas on why MS perfmon shows >10s latency while ONTAP is recording <15ms, I'd appreciate it.
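To be clear about which CPU number I'm quoting, here's a tiny sketch of the difference between the two. The per-core values are hypothetical, picked only so that they average to the 60% I saw; I don't have the real per-core breakdown in front of me.

```python
from statistics import mean

# Hypothetical per-core utilization (%) on a four-core controller
per_core = [95, 55, 50, 40]

average_util = mean(per_core)  # the figure I'm quoting
busiest_core = max(per_core)   # the "highest core" style of figure

print(f"average across cores: {average_util}%")
print(f"busiest core: {busiest_core}%")
```

The same load can look moderate as an average and close to saturated on the busiest core, which is why I'm spelling out which number I mean.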