Active IQ Unified Manager Discussions

dfm alert - perf:system:avg_processor_busy:breached vs CPU too busy

collieritops
3,220 Views

In the past couple of weeks I have started to see perf:system:avg_processor_busy:breached errors from DFM. I never saw these before. We would often see CPU Too Busy overnight during heavy backup load, but not this other error, so I'm now getting alerts for both. What's the difference between CPU Too Busy and avg_processor_busy:breached? Why would I get only CPU Too Busy before and now start getting both?

An Error event at 13 May 00:04 Eastern Daylight Time on Active/Active Controller myfiler:

perf:system:avg_processor_busy:breached.

The following counter value(s) have breached the specified threshold(s):

Current value of counter system:avg_processor_busy is 52.6734 percent, which is higher than the threshold value of 50 percent.

A Warning event at 13 May 00:44 Eastern Daylight Time on Active/Active Controller myfiler:

CPU Too Busy, CPU utilization was 97.96%.

cpuTooBusyThreshold is 95, whereas the system:avg_processor_busy threshold is 50, so I would think that CPU Too Busy is the more concerning alert?

How concerned should I be about seeing these alerts?

Any information is greatly appreciated...

Thx

2 REPLIES

rbalaji

These two counters are different and measure different things.

Here are the definitions of the counters:

cpu_busy
Percentage of time one or more processors is busy in the system.
Note: For systems running Data ONTAP versions 7.2 or earlier, the cpu_busy counter is the amount of time that any one CPU is busy. This results in a value for cpu_busy that is inflated. For systems running Data ONTAP versions 7.2.1 or later, the cpu_busy counter is the greater of either average CPU utilization or the busiest domain.

avg_processor_busy
Average processor utilization across all processors in the system.
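
To make the difference concrete, here is a rough Python sketch of the two definitions above. It is only an illustration, not how Data ONTAP or DFM actually computes the counters: the per-CPU and per-domain numbers are made up, the function names are hypothetical, and it assumes the CPU Too Busy event is driven by cpu_busy. The thresholds (50 and 95) are the ones quoted in the original post.

```python
from statistics import mean

# Thresholds as quoted in the original post.
AVG_PROCESSOR_BUSY_THRESHOLD = 50.0   # perf:system:avg_processor_busy
CPU_TOO_BUSY_THRESHOLD = 95.0         # cpuTooBusyThreshold


def avg_processor_busy(per_cpu_busy):
    """Average utilization (percent) across all processors."""
    return mean(per_cpu_busy)


def cpu_busy(per_cpu_busy, per_domain_busy):
    """Data ONTAP 7.2.1+ definition as quoted above: the greater of
    average CPU utilization or the busiest domain."""
    return max(avg_processor_busy(per_cpu_busy), max(per_domain_busy))


# Hypothetical sample: one hot CPU, one hot domain (values are invented).
per_cpu = [90.0, 40.0, 45.0, 35.0]
per_domain = [98.0, 30.0, 20.0]

avg = avg_processor_busy(per_cpu)
busy = cpu_busy(per_cpu, per_domain)

print(f"avg_processor_busy = {avg:.1f}%")
print(f"cpu_busy           = {busy:.1f}%")
print("avg_processor_busy breached:", avg > AVG_PROCESSOR_BUSY_THRESHOLD)
print("CPU Too Busy:", busy > CPU_TOO_BUSY_THRESHOLD)
```

With these invented numbers the average across processors is only a little over 50 percent while cpu_busy is driven to 98 percent by a single busy domain, so both events fire even though they describe different conditions.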

collieritops

1. Do you think these alerts are indicative of a problem? I know there are a ton of variables and you don't know my environment. I'm just wondering whether it is fairly typical to see these, or highly unusual. They occur mainly during heavy backup load at night (NDMP, snapshots, etc.).

2. Previously I would only see CPU Too Busy at night. Recently I started to see both CPU Too Busy and avg_processor_busy. Any thoughts on why, or what that means?

3. Can you explain this part a little more? "For systems running Data ONTAP versions 7.2.1 or later, the cpu_busy counter is the greater of either average CPU utilization or the busiest domain." I'm not sure what "busiest domain" refers to.
