Slow NFS performance with big files on 10Gb link

Hi,
I am trying to test NFS performance on big files (between 0.5 and 4 GB) using a single server and a NetApp filer, both connected to a switch with 10Gb links. Nobody else is using the server or the filer.

Test server:
- Dell PowerEdge R320 (CPU: Xeon CPU E5-2407 0 @ 2.20GHz, RAM: 16 GB)
- CentOS 7.4.1708
- local SSD disk

NetApp filer:
- FAS 2554 (24 FSAS disks)
- CDOT 9.3RC1
- 2 aggregates of 11 disks
- 1 FlexGroup volume writing to the 2 aggregates at the same time

A very simple test (copy data from local SSD disk to filer) shows the following results:

# umount /ssd1 ; umount /mnt/filer/vol_test ; mount /ssd1 ; mount /mnt/filer/vol_test
# time cp -r /ssd1/data /mnt/filer/vol_test/data
real 2m3.690s
user 0m0.031s
sys 0m19.842s
# du -sk /ssd1/data
14186364 /ssd1/data
-> 112 MB/s
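
As a rough check of that figure, the arithmetic can be sketched as below. It assumes that `du -sk` reports 1 KiB blocks and that the `real` time covers the whole copy, so the result is strictly in MiB/s. The variable names are only illustrative.

```shell
# Values taken from the test output above
size_kib=14186364      # "du -sk /ssd1/data", in 1 KiB blocks
elapsed_s=123.690      # "real 2m3.690s" = 2*60 + 3.690 seconds

# Throughput in MiB/s (1 MiB = 1024 KiB), rounded to the nearest integer
throughput=$(awk -v kib="$size_kib" -v s="$elapsed_s" \
    'BEGIN { printf "%.0f", kib / 1024 / s }')

echo "${throughput} MiB/s"
```

That is roughly 0.94 Gbit/s, well under a tenth of the nominal 10Gb link rate.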

This seems quite slow. How can I debug this?

The default configuration is used, except for the following changes (thanks to this thread: https://community.netapp.com/t5/Network-Storage-Protocols-Discussions/NFS-performance-on-10Gb-link/td-p/26036):
- rsize & wsize mount options:
rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=x.x.x.x,mountvers=3,mountport=635,mountproto=udp,local_lock=none,addr=x.x.x.x
(The tcp-max-xfer-size parameter has been set to 1048576 on the filer instead of the default value of 65536.)

- /etc/sysctl.conf
net.core.rmem_default=524288
net.core.wmem_default=524288
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.ipv4.tcp_rmem=4096 524288 16777216
net.ipv4.tcp_wmem=4096 524288 16777216
net.ipv4.ipfrag_high_thresh=524288
net.ipv4.ipfrag_low_thresh=393216
net.ipv4.tcp_timestamps=0
net.ipv4.tcp_window_scaling=1
net.core.optmem_max=524287
net.core.netdev_max_backlog=2500
sunrpc.tcp_slot_table_entries=128
sunrpc.udp_slot_table_entries=128


Thanks,
Mathieu