I am running OPM 2.0 to monitor two 6-node clusters running ONTAP 8.2.1. I've created a custom threshold for "node utilization": 90% warning, 95% critical, over 5 minutes. Over the last 7 days, this threshold has been crossed 12 times, sometimes for an hour or more. I know for a fact that the nodes were not actually that busy: someone logged onto the cluster and checked the CPU while OPM was reporting high utilization, and the node itself showed nothing of the kind. Has anyone else experienced this, or does anyone know of a particular bug this hits? I know some people are already getting bombarded by the system-defined policy for utilization. This is similar in that we are getting spammed with alerts that don't seem to be accurate.
I already know about the system-defined policy and have turned it off. Mine is a user-defined threshold. It seems like node utilization as a whole is reporting inaccurate or unrealistic values. I've attached a screenshot of a node that is much worse than the one I had looked at before posting this thread.