
FAS2040 CPU peak between 13:00 and 13:10

penguinssh
5,320 Views

Hello guys,

We have a FAS2040 hosting VMware LUNs and Exchange LUNs over iSCSI.  The environment is rather small and performance is good overall. However, we've noticed that every day both CPUs of the second controller peak at about 90-95%.  This causes a drop in overall IOPS and affects all the LUNs.  It occurs precisely between 13:00 and 13:10.  Not before, not after.

No specific jobs are supposed to be running at that time on the NetApp side, the Exchange side or the VMware side (including servers).  The only place I can imagine such a task running is on the NetApp controller itself.

I've attached the output of some diagnostic commands run in priv set diag mode:

sysstat -x 2

sysstat -M 1

and so on. 

I've determined that the Kahuna domain was taking about 70-80% of the CPU between 13:00 and 13:10.  It normally takes about 10-15%.  Now, I don't know where to look to determine what is running behind the Kahuna process. 
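To narrow down the exact window, the comparison above can be sketched as a small script. This is only a rough sketch: the sample list is hypothetical (it is not real sysstat output and does not parse it), and the 15% baseline and 3x factor are assumptions taken from the figures in this post.

```python
# Hypothetical samples: (time, Kahuna domain utilization in percent).
# Illustrative values only, not captured from a real controller.
samples = [
    ("12:58", 12), ("12:59", 14), ("13:00", 72), ("13:01", 78),
    ("13:05", 80), ("13:09", 75), ("13:10", 71), ("13:11", 13),
    ("13:12", 11),
]

def elevated_window(samples, baseline=15.0, factor=3.0):
    """Return (first, last) sample times where utilization exceeds
    factor * baseline, or None if no sample does."""
    threshold = baseline * factor
    hot = [ts for ts, value in samples if value > threshold]
    return (hot[0], hot[-1]) if hot else None

print(elevated_window(samples))
```

With a window like this in hand, it is easier to line it up against schedules on the hosts and the controller.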

Your help in pinpointing where the issue is, or which task is running, would be greatly appreciated.

Thanks.


radek_kubka

Hi and welcome to the Communities!

Quick stab at your problem - any chance any dedupe scans are (rather awkwardly) scheduled to start at 13:00?

Regards,
Radek

penguinssh

Thanks for the welcome.

Nope, nothing is set for deduplication at 13:00.

nastorage2*> sis status
Path                           State      Status     Progress
/vol/Vol1Vmware                Enabled    Idle       Idle for 12:34:55
/vol/Vol2Vmware                Enabled    Idle       Idle for 10:51:14

Do you know if there is a way to check if something is scheduled to run at a specific time on ONTAP?  Apart from that, we have snapshots running at 12:00 PM but they all finish around 12:02-12:05 and they don't cause this CPU peak!

radek_kubka

How about any SnapMirror updates? Nothing?

penguinssh

Nope, nothing in SnapMirror or SnapVault.

Is there a way to see what processes are behind the Kahuna domain? Or to see any scheduled processes?

radek_kubka

"Is there a way to see what processes are behind the Kahuna domain? Or to see any scheduled processes?"

Not that I know of - but we will see what other folks say.

roman_verysell

Maybe reallocation is enabled and a 'reallocate schedule' is set for 13:00 (1:00pm instead of 1:00am)?

rorzmcgauze

I don't believe there is a way to see what it's doing, but Kahuna looks after WAFL, RAID tetris, clustering and admin commands, to name the main parts, plus any code that needs to run serially.

rorzmcgauze

It looks like something is hammering the disks at that time, and the system is having trouble keeping up and is constantly doing consistency points. The disk utilisation is also very high, even for a SAS/FC system. It seems strange, though, that you only see this during that specific time.

As this is a 2000-series system, have you downloaded DFM and used Operations Manager (the license is included with the system)? It contains the Performance Advisor module, which would help with monitoring the system.

penguinssh

Found it!

I looked through the NetApp Management Console and found that one of the LUNs would bump the total throughput from ~1 000 000 to ~50 000 000 bytes per second.  That LUN is a VMware LUN hosting a SQL Server.  For some reason, the DBA scheduled transaction log backups and a database dump at that specific time.
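For anyone chasing something similar, the comparison I did by eye can be sketched roughly like this. The LUN names and most of the numbers are made up for illustration (only the Vol1Vmware/Vol2Vmware volume names and the ~1 000 000 to ~50 000 000 bytes per second jump come from this thread), and the figures would have to be exported from whatever monitoring tool you use.

```python
# Hypothetical per-LUN throughput in bytes/s: (normal, during the 13:00 window).
# LUN names and values are illustrative assumptions, not real measurements.
lun_throughput = {
    "/vol/Vol1Vmware/lun_sql": (1_000_000, 50_000_000),
    "/vol/Vol2Vmware/lun_app": (800_000, 900_000),
    "/vol/VolExchange/lun_db": (2_000_000, 2_100_000),
}

def rank_by_spike(throughput):
    """Sort LUNs by the ratio of peak-window to normal throughput,
    highest ratio first."""
    ratios = {lun: peak / normal for lun, (normal, peak) in throughput.items()}
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)

for lun, ratio in rank_by_spike(lun_throughput):
    print(f"{lun}: {ratio:.1f}x normal throughput")
```

The LUN at the top of the list is the first one worth tracing back to its host and its scheduled jobs.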

I'll have to talk to the DBA about this.  Thanks all for your help, it helped me understand more about NetApp controllers and how awesome they are.
