ONTAP Discussions

FAS3250 CPU utilization is high

DHAKSHINAMOORTHY_B
4,417 Views

Hi All,

I am managing a NetApp FAS3250 HA pair that serves block-level data to multiple customers. Recently I have been getting high CPU utilization alerts from the controllers.

DOT:  8.1.3P3 7-Mode

Deduplication disabled

No snapmirror

No NFS/CIFS services configured

Please suggest the steps I should take to identify the cause of this high CPU utilization.

3 REPLIES

ekashpureff

Dhakshinamoorthy -

There are numerous factors that affect CPU.

You can look at which domains of Data ONTAP are using CPU with the advanced 'statit' command, and 'sysstat' with the -M flag.

You can try to open a support case, and they'll probably ask you to run a 'perfstat' report for them.

I hope this response has been helpful to you.

At your service,

Eugene E. Kashpureff

Independent NetApp Consultant, K&H Research http://www.linkedin.com/in/eugenekashpureff

Senior NetApp Instructor, Unitek Education http://www.unitek.com/training/netapp/

rmatsumoto

If it's predictable, then I'd recommend you engage NetApp support; they'll have you collect a perfstat and can tell you what is going on. If you want to go the DIY route AND you have DFM/OCUM in place (I'm guessing you might, since you're getting alerts), then you may be able to match the high CPU time to some activity on your filer (someone running a full table scan on a DB sitting on one of your volumes, for example). This would show up in volume:total_ops and/or volume:throughput. You can do this using the NMC or the DFM CLI.
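As a rough illustration of matching high CPU intervals to volume activity, here is a minimal Python sketch. It assumes you have already exported the counters yourself as timestamped samples; the function name, the 80% threshold, the volume names, and all numbers are hypothetical and are not output from DFM/OCUM or the filer.

```python
from collections import defaultdict

def busiest_volumes_during_spikes(cpu_samples, volume_samples, cpu_threshold=80.0):
    """Rank volumes by total ops summed over the intervals where CPU was high.

    cpu_samples:    list of (timestamp, cpu_percent)
    volume_samples: list of (timestamp, volume_name, total_ops)
    """
    # Timestamps at which the CPU counter met or exceeded the threshold.
    spike_times = {t for t, cpu in cpu_samples if cpu >= cpu_threshold}

    # Sum each volume's ops over the spike intervals only.
    ops_by_volume = defaultdict(float)
    for t, volume, ops in volume_samples:
        if t in spike_times:
            ops_by_volume[volume] += ops

    # Busiest volume first.
    return sorted(ops_by_volume.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical samples at 5-minute intervals (made-up values).
cpu_samples = [("10:00", 35.0), ("10:05", 92.0), ("10:10", 95.0), ("10:15", 40.0)]
volume_samples = [
    ("10:00", "vol_db", 1200.0), ("10:00", "vol_vm", 900.0),
    ("10:05", "vol_db", 9800.0), ("10:05", "vol_vm", 950.0),
    ("10:10", "vol_db", 10400.0), ("10:10", "vol_vm", 1000.0),
    ("10:15", "vol_db", 1300.0), ("10:15", "vol_vm", 880.0),
]

ranking = busiest_volumes_during_spikes(cpu_samples, volume_samples)
for volume, ops in ranking:
    print(f"{volume}: {ops:.0f} ops during CPU spikes")
```

A volume that dominates this ranking is only a candidate for further investigation; it does not by itself prove that the volume's workload is causing the CPU load.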

If you go the DIY route, then I'd personally watch the latency numbers rather than the CPU number, especially if you're using DFM/OCUM. Also, considering you're on an 8.1.x release, you should make sure you're watching system:avg_processor_busy (or processor:processor_busy) and not system:cpu_busy. The latter will report higher numbers and will correlate less with latencies than avg_processor_busy does. That's just my observation, so YMMV.
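If you want to check that observation against your own data, one option is to compute how strongly each CPU counter tracks latency. The sketch below is a plain Pearson correlation in Python; the sample values are invented for illustration, and it assumes you have exported latency and both counters at the same timestamps.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 samples")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical samples taken at the same timestamps (made-up values).
latency_ms = [2.0, 2.5, 6.0, 9.0, 3.0]
avg_processor_busy = [30.0, 35.0, 70.0, 90.0, 40.0]
cpu_busy = [85.0, 95.0, 90.0, 98.0, 92.0]

r_avg = pearson(latency_ms, avg_processor_busy)
r_cpu = pearson(latency_ms, cpu_busy)
print(f"latency vs avg_processor_busy: {r_avg:.2f}")
print(f"latency vs cpu_busy:           {r_cpu:.2f}")
```

Whichever counter shows the higher correlation with latency on your system is probably the more useful one to alert on.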

ANANDONTAP

Hi Dhakshina,

Kindly look at your aggregate usage details and the number of spindles available.

I think it is overloaded by the DB servers. Is any kind of backup in progress during the CPU spikes?
