2017-12-18 07:50 AM
I am continuously getting OCUM alerts like this one: "Perf. Capacity Used value of 212% on nodename1 has triggered a WARNING event based on threshold setting of 100%."
When I check the performance view I see things like 25K IOPS, ~500 MB/s throughput and latency under 1 ms... so, in my opinion, nothing is obviously grinding to a halt here.
The OCUM manual tries to explain what "Perf Capacity" is. But what is the community's take on these alerts and how best to address/resolve them in the real world? If this is a case of over-alerting I'd like to know that as well.
2017-12-20 09:26 AM
Performance Capacity Used reaches levels where a warning is generated only if a latency increase is observed at the node processing level. However, that increase in latency is a relative rather than an absolute measure. As a result, whether this warning is a real concern for the environment depends on the workloads that are running. Some workloads are sensitive to such changes and others are not. Hence, the warning is generated as a precaution and the admin needs to make the final judgement.
There may be cases where the latency increase is temporary, due either to a temporary load increase or to a temporary change in workload demand. If the warning persists or recurs periodically, it should be taken into consideration, because it is a proactive warning that cautions the user about performance issues if further load is added to the node.
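To make the "temporary vs. persisting or periodic" distinction concrete, here is a rough sketch of how one might triage a series of Perf. Capacity Used samples taken from the performance view. It does not use any OCUM API. The function name, the sample lists, and the cutoffs (`min_consecutive`, `min_episodes`) are all assumptions for illustration, so tune them to your own polling interval and workloads.

```python
from typing import Sequence


def classify_breaches(
    samples: Sequence[float],
    threshold: float = 100.0,   # warning threshold in percent, as in the alert text
    min_consecutive: int = 3,   # assumed cutoff: this many breaches in a row = persistent
    min_episodes: int = 2,      # assumed cutoff: this many separate breaches = periodic
) -> str:
    """Label a series of Perf. Capacity Used samples (percent).

    Hypothetical helper, not part of OCUM. Returns one of
    "ok", "transient", "periodic", or "persistent".
    """
    longest_run = 0   # longest streak of consecutive samples above threshold
    current_run = 0   # length of the streak currently in progress
    episodes = 0      # number of separate streaks above threshold

    for value in samples:
        if value > threshold:
            current_run += 1
            if current_run == 1:
                episodes += 1  # a new breach episode starts here
            longest_run = max(longest_run, current_run)
        else:
            current_run = 0

    if longest_run >= min_consecutive:
        return "persistent"
    if episodes >= min_episodes:
        return "periodic"
    if episodes == 1:
        return "transient"
    return "ok"


# Made-up sample series, purely for illustration
print(classify_breaches([80, 212, 90, 85]))       # single short spike
print(classify_breaches([120, 130, 212, 150]))    # sustained breach
print(classify_breaches([80, 120, 80, 130, 90]))  # recurring short spikes
```

A "transient" result matches the temporary case described above, while "periodic" or "persistent" is the case where the warning deserves attention before more load is added to the node.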