
CIFS performance problem

Hi All,

My CIFS service is having a performance issue. Daily CPU usage is 80-90%, and when the system takes a snapshot the CPU spikes to 100%.

sysstat
CPU    NFS   CIFS   HTTP      Net kB/s     Disk kB/s      Tape kB/s    Cache
                               in   out     read  write    read write     age
82%   7127   9003      0    8177 39226    31073   2606       0     0       1
91%   6785   9091      0   12189 37056    34302  12137       0     0       1
84%   9599   9360      0    9488 32152    29497   6466       0     0       1
96%   7827   9637      0   15959 38404    33400  16630       0     0       1
91%   9361  10466      0    6490 34312    27940   3994       0     0       1
90%   8826  10128      0    4958 32662    29347   4403       0     0       1
96%   6339   9484      0   10731 34338    32835   8322       0     0       1
94%   5309   9929      0   13300 35917    35079  15151       0     0       1

cifs stat 10

     GetAttr         Read       Write       Lock      Open/Cl      Direct     Other
113626497782  33474482130  1294930546  767394214  60353397035  9863767165  26313299
       46118        13394         475        301        24960        4007        11
       46332        14480         564        346        24191        3978        10
       42864        11059         402        243        20880        3534        26
       43499        12055         430        281        21502        3585        18
       43934        11888         403        297        21675        3573         9
       45586        13033         373        279        24417        3877        24
       45639        12739         432        276        24351        3889        10
       48137        14694         564        359        26145        4356         6
       50359        15435         541        326        29034        4658        25
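
One way to read the cifs stat rows above is as an operation mix, i.e. what share of all CIFS operations each column accounts for. The snippet below is only a rough sketch of that arithmetic in Python; the column names and the sample row are copied from the output above, while the function name op_mix is made up for illustration.

```python
# Column names and one interval row copied from the cifs stat output above.
COLUMNS = ["GetAttr", "Read", "Write", "Lock", "Open/Cl", "Direct", "Other"]
ROW = [46118, 13394, 475, 301, 24960, 4007, 11]


def op_mix(columns, counts):
    """Return each operation type's share of the row total, in percent."""
    total = sum(counts)
    return {name: 100.0 * count / total for name, count in zip(columns, counts)}


mix = op_mix(COLUMNS, ROW)
# Print the largest contributors first.
for name, share in sorted(mix.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:8s} {share:5.1f}%")
```

For this row, GetAttr works out to a little over half of all operations and Open/Cl to a bit over a quarter, so the load looks dominated by attribute lookups and opens/closes rather than by reads or writes.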

I also found that cifs.audit.enable was on. Can this cause the high CPU? Here are my cifs.audit options:
options cifs.audit
cifs.audit.account_mgmt_events.enable off
cifs.audit.autosave.file.extension
cifs.audit.autosave.file.limit 0
cifs.audit.autosave.onsize.enable off
cifs.audit.autosave.onsize.threshold
cifs.audit.autosave.ontime.enable off
cifs.audit.autosave.ontime.interval
cifs.audit.enable            on
cifs.audit.file_access_events.enable on
cifs.audit.liveview.enable   off
cifs.audit.logon_events.enable on
cifs.audit.logsize           524288
cifs.audit.nfs.enable        off
cifs.audit.nfs.filter.filename
cifs.audit.saveas            /etc/log/adtlog.evt
thanks for the help.

Re: CIFS performance problem

Hi Guys,

Really need your assistance to solve this matter.

Please let me know if you need more info.

thanks

Re: CIFS performance problem

I would suggest that you set cifs.audit.enable to off. Is there a reason why it was set to on, given that the default setting is off? This option logs all CIFS access from Windows clients on the controller. If you have a clustered system, check its partner and turn off cifs.audit there too if it is enabled.

Also, you should check your /etc/log directory for event logs (this is specified in the option "cifs.audit.saveas  /etc/log/adtlog.evt") as you may be filling up the /etc directory and the snapshots for /etc. The File Access and Protocols Management Guide has more information about configuring auditing for both CIFS and NFS.

Start with turning off the auditing to eliminate one potential bottleneck. Hope this helps.

Susan

Re: CIFS performance problem

Thanks Susan

-I have turned off the audit option and am now monitoring the result. As for /etc space, it isn't full yet, so no issue there.

-Is there any other possible cause of the high CPU?

-Is there any way to trace what is using up the CPU? As far as I can tell, CIFS is causing it. How can I narrow it down?

please help, Thanks

Re: CIFS performance problem

Is it high all the time, or just periodically?

What other purposes is the system used for?

Please check the syslog. The last time we had a similar issue it turned out to be an NVRAM failure, and after the replacement everything was OK.

Another case we had was on a system serving as a destination for SnapVaults and tape backups (it was DOT 7.2.4 at the time). A reboot helped; it might have been a memory leak, but we never got any confirmation from NetApp.

Re: CIFS performance problem

-Yes, it is high all the time: 70-80%, spiking to 100% when snapshots are created and deleted.

-mainly for CIFS

-couldn't find anything strange on the log.

The other option for tracing it is to enable cifs.per_client_stats.enable and then use "cifs top".

But I don't dare do that, because collecting the per-client stats adds overhead, and this overhead may affect filer performance.

-Is there a way to trace it without affecting current performance? Please help.

Re: CIFS performance problem

What model of filer is this? Which Data ONTAP version?

The best way for you would probably be to file a support request with your reseller. We debug performance problems like this quite often and there are so many factors that could be involved.

Some examples:

*extensive CIFS logging/auditing

*volume fill rates >80-85%. check "df -h"

*volume fragmentation. check "reallocate measure /vol/<volname>"

*maybe it's simply too much I/O for your system

*more disks/shelves could also help improve I/O performance

*SMBv2 features that have vastly improved in newer versions of OnTAP

etc. etc. etc.
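
As a rough illustration of the fill-rate item in the list above, the check can be sketched in a few lines of Python. The volume names and percentages here are hypothetical, made up purely for the example; on a real filer the numbers would come from the capacity column of the df output. The thresholds are simply the two ends of the 80-85% range mentioned above.

```python
# Hypothetical volume names and fill levels, for illustration only. On a real
# filer the percentages would be read from the "capacity" column of df output.
VOLUMES = {"/vol/vol0/": 42, "/vol/users/": 83, "/vol/projects/": 91}

WARN_AT = 80  # lower end of the 80-85% range from the list above
CRIT_AT = 85  # upper end of that range


def classify(capacity_pct):
    """Map a fill percentage to 'ok', 'warn' or 'crit'."""
    if capacity_pct > CRIT_AT:
        return "crit"
    if capacity_pct > WARN_AT:
        return "warn"
    return "ok"


report = {name: classify(pct) for name, pct in VOLUMES.items()}
for name, status in report.items():
    print(f"{name:16s} {VOLUMES[name]:3d}%  {status}")
```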

There's so much to consider, which makes it very hard to debug via the community forum.

-Michael

Re: CIFS performance problem

Hi there,

Run "sysstat -x 1" and check "disk util"; it shows the highest utilization of any single disk. If it's 80% or more, your disks are the bottleneck. Besides that, please post the "sysconfig -r" and "aggr status -v" output so we can check whether your aggregate and volume layout is correct.
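
The disk util check described above can also be scripted against captured sysstat output. The following is a minimal sketch in Python, not a definitive tool: the sample rows are copied from the "sysstat -s -u 1" output posted later in this thread, and the parser assumes that layout, where disk util is the last field on each data row. Other sysstat modes may put that column elsewhere, so the field index would need adjusting. The names parse_rows and DISK_UTIL_LIMIT are made up for the example.

```python
# Sample rows copied from the "sysstat -s -u 1" output posted later in this thread.
# Assumption: the last whitespace-separated field is "Disk util", as in that layout.
SAMPLE = """\
CPU   Total    Net kB/s    Disk kB/s    Tape kB/s Cache Cache  CP  CP Disk
95%   19929  5066 42300  37052      8     0     0     3   99%   0%  -  70%
93%   17804  4127 28514  32748  16799     0     0     3   99%  84%  T  71%
94%   16874  4423 32560  27702   2630     0     0     3   99%  16%  :  61%
89%   18591  4826 38286  23155      0     0     0     3   99%   0%  -  51%
"""

DISK_UTIL_LIMIT = 80  # percent; the rule-of-thumb threshold from the post above


def parse_rows(text):
    """Yield (cpu_pct, disk_util_pct) for each data row, skipping header lines."""
    for line in text.splitlines():
        fields = line.split()
        # Data rows start and end with a percentage such as "95%" and "70%".
        if len(fields) < 2:
            continue
        if not fields[0].endswith("%") or not fields[-1].endswith("%"):
            continue
        yield int(fields[0].rstrip("%")), int(fields[-1].rstrip("%"))


rows = list(parse_rows(SAMPLE))
peak_disk = max(disk for _, disk in rows)
avg_cpu = sum(cpu for cpu, _ in rows) / len(rows)
print(f"peak disk util {peak_disk}%, average CPU {avg_cpu:.0f}%")
print("disks look like the bottleneck" if peak_disk >= DISK_UTIL_LIMIT
      else "disk util is below the limit")
```

On these four sample rows the peak disk util stays below 80% while the CPU sits around 90%, which would point away from the disks as the limiting factor, at least for this short capture.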

Kind regards

Thomas

Re: CIFS performance problem

Could you please send us the output of "options cifs"?

Furthermore, the "cifs stat" output would be nice.

try:

cifs.smb2.signing.required   off

cifs.max_mpx 50 (try increasing this to 126, 253 or 1124)

Re: CIFS performance problem

Hi All,

I looked at the I/O and it looks OK. But the cache hit rate is 99%. Do I need to increase the cache, or what?

BTW, the version

version                NetApp Release 7.2.4P7: Fri Apr 11 00:22:07 PDT 2008

Model Name:         FAS3020

#sysstat -s -u 1

CPU   Total    Net kB/s    Disk kB/s    Tape kB/s Cache Cache  CP  CP Disk
       ops/s    in   out   read  write  read write   age   hit time ty util
95%   19929  5066 42300  37052      8     0     0     3   99%   0%  -  70%
93%   17804  4127 28514  32748  16799     0     0     3   99%  84%  T  71%
94%   16874  4423 32560  27702   2630     0     0     3   99%  16%  :  61%
89%   18591  4826 38286  23155      0     0     0     3   99%   0%  -  51%
90%   20254  5134 43737  22705      0     0     0     3   99%   0%  -  54%
93%   18424  5273 52143  24155     32     0     0     3   99%   0%  -  44%
93%   18457  4644 50198  25574      0     0     0     3   99%   0%  -  45%
90%   17536  4776 49377  30262      0     0     0     3   98%   0%  -  57%
84%   20655  5729 46937  15242     24     0     0     3   99%   0%  -  62%
--
Summary Statistics (    9 samples  1 secs/sample)
CPU   Total    Net kB/s    Disk kB/s    Tape kB/s Cache Cache  CP  CP Disk
       ops/s    in   out   read  write  read write   age   hit time ty util
Min
84%   16874  4127 28514  15242      0     0     0     3   98%   0%  *  44%
Avg
91%   18724  4888 42672  26510   2165     0     0     3   99%  11%  *  57%
Max
95%   20655  5729 52143  37052  16799     0     0     3   99%  84%  *  71%
thanks for the help