2016-04-11 09:28 PM
I am running OPM 2.0 monitoring two 6-node clusters running ONTAP 8.2.1. I've created a custom threshold for "node utilization" at 90% warning and 95% critical over 5 minutes. Looking back over the last 7 days, this threshold has been crossed 12 times, sometimes for an hour or more. I know for a fact that this has not been the case. Someone logged onto the cluster and checked the CPU while OPM was reporting a breach, and the high utilization was not reflected on the node itself. Has anyone else experienced this, or does anyone know of a particular bug this hits? I know some people are already getting bombarded by the system-defined policy for utilization. This is similar in that we are getting spammed with alerts that don't seem to be accurate.
2016-04-11 10:09 PM
This is due to system-defined threshold policies. https://library.netapp.com/ecmdocs/ECMP12406790/html/GUID-F76DC60C-852E-485D-91E7-A88683AF6C55.html
To stop receiving email notifications when this condition occurs, perform the following steps in the GUI:
Navigate to Configuration --> Event Handling --> System Defined Threshold Policies.
Under Warning Events, uncheck "Send alert notifications to".
2016-04-11 10:15 PM
I already know about the system-defined policy and have turned it off. Mine is a user-defined threshold. It seems like node utilization as a whole is reporting inaccurate or unrealistic values. I've attached a screenshot of a node that is much worse than the one I had looked at before posting this thread.
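For anyone who wants to cross-check the alerts against utilization samples collected directly on the cluster, here is a minimal sketch of how a "90% warning / 95% critical over 5 minutes" rule could be evaluated. This is only my assumption of what "over 5 minutes" means (every sample stays at or above the threshold for at least 300 seconds); it is not OPM's actual algorithm, and the function name and sample format are made up for illustration.

```python
from typing import List, Tuple

def sustained_breaches(
    samples: List[Tuple[int, float]],
    threshold: float,
    window_s: int = 300,
) -> List[Tuple[int, int]]:
    """Return (start, end) times of runs where utilization stayed at or
    above `threshold` for at least `window_s` seconds.

    `samples` is a time-ordered list of (seconds, utilization_percent).
    Hypothetical helper: an assumed reading of "over 5 minutes", not OPM's logic.
    """
    breaches = []
    run_start = None  # time of the first sample in the current run
    last_t = None     # time of the latest sample in the current run
    for t, util in samples:
        if util >= threshold:
            if run_start is None:
                run_start = t
            last_t = t
        else:
            # The run ended; keep it only if it lasted long enough.
            if run_start is not None and last_t - run_start >= window_s:
                breaches.append((run_start, last_t))
            run_start = None
    # Handle a run that is still open at the end of the data.
    if run_start is not None and last_t - run_start >= window_s:
        breaches.append((run_start, last_t))
    return breaches

# Made-up samples at 60-second intervals, for illustration only.
example = [(0, 50), (60, 92), (120, 93), (180, 96), (240, 97),
           (300, 91), (360, 94), (420, 40)]
warning = sustained_breaches(example, 90)
critical = sustained_breaches(example, 95)
```

If the breach intervals computed from on-cluster samples don't line up with what OPM reports for the same period, that would support the suspicion that the reported node utilization is off.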