I have a question about CIFS from Netapp filers. When I enable CIFS on my vfilers, I get strange low throughput from CIFS (from clients). Like for instance: copying folder containing 20k small files (for website) from CIFS to local disk (on dedicated windows 2008 r2 server) goes with speed ~1.5MB/s even less (drops to 400k/s). First I thought it's because of small files, but I did a test with the same folder, with 2 physical servers (windows 2008 r2) over unc shares - there was throughput of 15MB/s constantly (so 10x more).
Ok, disks are shared for different things as well (14x SATA disks in aggregate) used for NFS as well, but their load is avg 60% (each disk), so even though it should give much better performance...
I am confused about this speed, and what should be expected in that setup. Someone could say: change to FC disks, but still these 2 physical servers also have sata disks inside, and then copying just performs normally (as it should).
Can anyone point me to right direction of this, or even better - what should be expected from CIFS over Netapp? Maybe 1.5MB with small files is max what Netapp can push?
BTW, I did a test with bigger file as well (2GB) and copying was going 20MB/s (so better, but still didn't max 1Gbit connection)... And the same file (2gb) went between 2 servers with speed 100MB/s
I have seen some horrendous performance with CIFS and filers and am currently still trying to nail down an issue where an lacp vif has 15Mb/s to some servers while a multimode works fine, doesnt help load balancing at all. One thing to check (if you can) is change the mode if you are using lacp. SMB2 is disabled by default on filers, turning that on will give you a boost for 2008R2. increasing the cifs tcp window size can increase performance in some circumstances as well.
NetApp have told me in previous support calls regarding CIFS that NetApps implementation of CIFS doesn't match microsoft's and not to expect the same speed, but with SMB2 and a chunky R710 with a decent intel card can pull at 100MB/s (on a good day without lacp turned on, or with if I'm very lucky)
• Use "sysstat –x 1" to determine how many CIFS ops/s and how much CPU is being utilized
• Check /etc/messages for any abnormal messages, especially for oplock break timeouts
• Use "perfstat" to gather data and analyze (note information from "ifstat", "statit", "cifs stat", and "smb_hist", messages, general cifs info)
• "pktt" may be necessary to determine what is being sent/received over the network
• "sio" should / could be used to determine how fast data can be written/read from the filer
• Client troubleshooting may include review of event logs, ping of filer, test using a different filer or Windows server
• If it is a network issue, check "ifstat –a", "netstat –in" for any I/O errors or collisions
• If it is a gigabit issue check to see if the flow control is set to FULL on the filer and the switch
• On the filer if it is one volume having an issue, do "df" to see if the volume is full
• Do "df –i" to see if the filer is running out of inodes
• From "statit" output, if it is one volume that is having an issue check for disk fragmentation
• Try the "netdiag –dv" command to test filer side duplex mismatch. It is important to find out what the benchmark is and if it’s a reasonable one
• If the problem is poor performance, try a simple file copy using Explorer and compare it with the application's performance. If they both are same, the issue probably is not the application. Rule out client problems and make sure it is tested on multiple clients. If it is an application performance issue, get all the details about:
◦ The version of the application
◦ What specifics of the application are slow, if any
◦ How the application works
◦ Is this equally slow while using another Windows server over the network?
◦ The recipe for reproducing the problem in a NetApp lab
• If the slowness only happens at certain times of the day, check if the times coincide with other heavy activity like SnapMirror, SnapShots, dump, etc. on the filer. If normal file reads/writes are slow:
◦ Check duplex mismatch (both client side and filer side)
◦ Check if oplocks are used (assuming they are turned off)
◦ Check if there is an Anti-Virus application running on the client. This can cause performance issues especially when copying multiple small files
◦ Check "cifs stat" to see if the Max Multiplex value is near the cifs.max_mpx option value. Common situations where this may need to be increased are when the filer is being used by a Windows Terminal Server or any other kind of server that might have many users opening new connections to the filer. What is CIFS Max Multiplex?
◦ Check the value of OpLkBkNoBreakAck in "cifs stat". Non-zero numbers indicate oplock break timeouts, which cause performance problem
CIFS has a number of problems, but chattiness has to be the biggest issue by far. By using read ahead and write behinds along with metadata caching CIFS performance can be more than quintupled Here's a performance graph from one network I just happened to have . You'll see NetApp / CIFS see the highest reductions of any application and these numbers are probably low, actually. I've seen CIFS reduction of over 96%.