DataONTAP SCOM MP false readings

Stuart_Cookson

Hello! We have recently installed the DataONTAP SCOM MP to monitor our NetApp environment.  Almost immediately we received a lot of LUN and volume latency alerts. Most of them are for between 20 and 100 ms, but some of them are in the 1000s!

Our NetApp support people have checked the latency via some perfstats that I supplied, and they say there are no latency issues.  Some of the SCOM alerts triggered while the perfstats were being collected, but the high latencies did not show up in the perfstats.  Support is suggesting that SCOM is misreading the latency information, but I can't find anything about this in the MP documentation.

 

Has anyone else had similar issues with LUN and Volume latency in the SCOM MP?

 

Many thanks

 

Stuart

1 ACCEPTED SOLUTION

Stuart_Cookson

For information, I think we have figured this out.  We believe that SCOM is reading the volume latency in microseconds rather than milliseconds.  We have been looking at System Performance Monitoring in NetApp Management Console and noticed that it displays volume latency in microseconds.  There are 1000 microseconds in 1 millisecond, and when we compare Management Console to SCOM, the figures SCOM reports line up with the microsecond readings rather than with milliseconds.

 

I've created overrides of 10000 for critical instead of 10, and 5000 for warning instead of 5.
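For anyone who wants to check the arithmetic behind those overrides, here is a minimal Python sketch. It assumes, as we believe, that the value SCOM evaluates is in microseconds while the default thresholds (5 and 10) were meant as milliseconds. The function names are made up for illustration and are not part of the MP or of SCOM.

```python
US_PER_MS = 1000  # 1 millisecond = 1000 microseconds


def ms_to_us(threshold_ms):
    """Convert a threshold expressed in milliseconds to microseconds."""
    return threshold_ms * US_PER_MS


def classify(latency_us, warning_ms=5, critical_ms=10):
    """Classify a reading in microseconds against thresholds given in ms.

    The defaults 5 and 10 are the warning/critical values mentioned in
    this post; the function itself is a hypothetical helper.
    """
    if latency_us >= ms_to_us(critical_ms):
        return "critical"
    if latency_us >= ms_to_us(warning_ms):
        return "warning"
    return "ok"


# Override values to enter if the monitored value is really in microseconds.
print(ms_to_us(5), ms_to_us(10))

# A raw reading of 100 looks like 100 ms, but as microseconds it is 0.1 ms.
print(classify(100))
```

With this reading of the units, an alert showing "100" is really 0.1 ms, which would explain why nothing unusual appeared in the perfstats.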

 

To confuse matters more, the LUN latency in Management Console is reported in milliseconds!!

 

I don't know if I've read the MP documentation wrong, but hopefully this will help some people in the future.
