2010-09-06 08:46 PM
My CIFS server is having a performance issue: the daily CPU usage is 80-90%, and when the system is creating a snapshot
the CPU spikes to 100%.
cifs stat 10
GetAttr Read Write Lock Open/Cl Direct Other
113626497782 33474482130 1294930546 767394214 60353397035 9863767165 26313299
46118 13394 475 301 24960 4007 11
46332 14480 564 346 24191 3978 10
42864 11059 402 243 20880 3534 26
43499 12055 430 281 21502 3585 18
43934 11888 403 297 21675 3573 9
45586 13033 373 279 24417 3877 24
45639 12739 432 276 24351 3889 10
48137 14694 564 359 26145 4356 6
50359 15435 541 326 29034 4658 25
2010-09-07 12:12 PM
I would suggest that you set cifs.audit.enable to off. Is there a reason why it was set to on, as the default setting is off? This option logs all CIFS access from Windows clients on the controller. If you have a clustered system, check its partner and turn off cifs.audit there too if it is enabled.
Also, you should check your /etc/log directory for event logs (this is specified in the option "cifs.audit.saveas /etc/log/adtlog.evt") as you may be filling up the /etc directory and the snapshots for /etc. The File Access and Protocols Management Guide has more information about configuring auditing for both CIFS and NFS.
Start with turning off the auditing to eliminate one potential bottleneck. Hope this helps.
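For reference, a rough sketch of that check-and-disable sequence in 7-mode console syntax (the vol0 path below assumes /etc lives on the root volume vol0, which may differ on your system):

```shell
# Check the current audit setting first
options cifs.audit.enable

# Disable CIFS auditing on this controller
options cifs.audit.enable off

# On a clustered pair, log in to the partner console and
# repeat the same two commands there.

# Check free space on the root volume holding /etc
# (audit event logs under /etc/log can fill it up)
df -h /vol/vol0
```

The saved audit file itself (e.g. the path set via cifs.audit.saveas) is easiest to inspect from an NFS or CIFS client mapped to the /etc share.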
2010-09-07 10:21 PM
-I have turned off the audit option. Right now I'm monitoring the result. As for /etc space, it hasn't filled up yet, so no issue there.
-Is there any other possible cause of the high CPU?
-Is there any way to trace who is using up the CPU? As far as I can tell CIFS is causing it. How do I narrow it down?
Please help. Thanks
2010-09-08 12:03 AM
Is it high all the time, or just periodically?
What other purposes is the system used for?
Please check the syslog. The last time we had a similar issue there was an NVRAM failure; after replacement it was OK.
Another case we had was on a system serving as a destination for SnapVault and tape backups (on DOT 7.2.4 at the time): a reboot helped, so it might have been a memory leak, but we never got any confirmation from NetApp.
2010-09-08 12:40 AM
-Yes, it is high all the time: 70-80%, spiking to 100% when snapshots are created and deleted.
-The system is used mainly for CIFS.
-I couldn't find anything strange in the log.
The other option to trace it is to enable cifs.per_client_stats.enable and then use "cifs top" to track it down.
But I don't dare to do it because of the overhead associated with collecting the per-client stats.
This overhead may affect filer performance.
-Is there a way to trace it without affecting current performance? Please help..
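For anyone following the thread, the per-client tracing described above would look roughly like this in 7-mode (the sort key and count are illustrative; the stats collection does add overhead, so the usual approach is to enable it only for a short measurement window and disable it afterwards):

```shell
# Enable per-client statistics (adds collection overhead)
options cifs.per_client_stats.enable on

# Show the busiest CIFS clients, e.g. top 10 by total operations
cifs top -n 10 -s ops

# Disable again once done, to remove the overhead
options cifs.per_client_stats.enable off
```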
2010-09-10 04:55 AM
What model of filer is this? Which OnTAP version?
The best way for you would probably be to file a support request with your reseller. We debug performance problems like this quite often and there are so many factors that could be involved.
*extensive CIFS logging/auditing
*volume fill rates >80-85%. check "df -h"
*volume fragmentation. check "reallocate measure /vol/<volname>"
*maybe it's simply too much I/O for your system
*more disks/shelves could also help improve I/O performance
*SMBv2 features that have vastly improved in newer versions of OnTAP
etc. etc. etc.
There's so much to consider, which makes it very hard to debug via the community forum.
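To work through the checklist above, the corresponding 7-mode console commands would be roughly the following (vol_name is a placeholder for your volume):

```shell
# Extensive CIFS logging/auditing still on?
options cifs.audit.enable

# Volume fill rates (watch for volumes >80-85% full)
df -h

# Volume fragmentation (replace vol_name with your volume)
reallocate measure /vol/vol_name

# Filer model and ONTAP version, to judge whether the
# workload is simply too much I/O for the platform
version
sysconfig
```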
2010-09-10 07:05 AM
Run a "sysstat -x 1" and check "disk util": it shows the highest utilization of any single disk. If it's 80%+, your disks are the bottleneck. Besides that, please post "sysconfig -r" and "aggr status -v" output for us to check whether your aggregate & volume layout is correct.
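A quick sketch of those three checks on the filer console:

```shell
# One-second samples of CPU, ops, network and disk throughput;
# the last column, "Disk util", is the busiest single disk's utilization
sysstat -x 1

# RAID layout: raid groups, disk sizes, spares
sysconfig -r

# Aggregate status and options, to review the volume layout
aggr status -v
```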
2010-09-20 01:37 AM
I looked at the I/O, and it looks OK. But the cache hit is 99%. Do I need to increase the cache, or what?
BTW, the version:
version NetApp Release 7.2.4P7: Fri Apr 11 00:22:07 PDT 2008
#sysstat -s -u 1