
Snapvault Snapshot network throughput

Hello,

We have a pair of FAS3040 filers (ONTAP 8.0.1), one for the data and one for the snapshots. Each of the two has four GbE interfaces connected to a Cisco switch in a port channel. The filers are located in different datacenters which are connected by 10 GbE. I don't know the exact network topology, but according to the network support team there is no network performance bottleneck (they checked the stats and graphs).

What I see during the transfer of a snapshot to the backup filer is a throughput of ~30 MB/s. I've seen CIFS traffic to and from the data filer at >50 MB/s. The CPU load averages around 30%.

Any ideas what to check?

VUMEF004 is the primary system (source)

VUMEF006 the backup system (destination)

VUMEF004> ifgrp status
default: transmit 'IP Load balancing', Ifgrp Type 'multi_mode', fail 'log'
vif0: 4 links, transmit 'IP Load balancing', Ifgrp Type 'multi_mode' fail 'default'
     Ifgrp Status    Up     Addr_set
    up:
    e4b: state up, since 16Dec2010 16:49:58 (35+19:54:38)
        mediatype: auto-1000t-fd-up
        flags: enabled
        input packets 617954924, input bytes 201018021022
        output packets 18661492, output bytes 1548164538
        up indications 1, broken indications 0
        drops (if) 0, drops (link) 0
        indication: up at 16Dec2010 16:49:58
            consecutive 3095447, transitions 1
    e4a: state up, since 16Dec2010 16:49:58 (35+19:54:38)
        mediatype: auto-1000t-fd-up
        flags: enabled
        input packets 31327899, input bytes 3541567215
        output packets 1003817786, output bytes 1498665173524
        up indications 1, broken indications 0
        drops (if) 0, drops (link) 0
        indication: up at 16Dec2010 16:49:58
            consecutive 3095447, transitions 1
    e3b: state up, since 16Dec2010 16:49:58 (35+19:54:38)
        mediatype: auto-1000t-fd-up
        flags: enabled
        input packets 6763888, input bytes 709795867
        output packets 47740311, output bytes 3829076142
        up indications 1, broken indications 0
        drops (if) 0, drops (link) 0
        indication: up at 16Dec2010 16:49:58
            consecutive 3095447, transitions 1
    e3a: state up, since 16Dec2010 16:49:58 (35+19:54:38)
        mediatype: auto-1000t-fd-up
        flags: enabled
        input packets 7380650, input bytes 614070676
        output packets 5273889, output bytes 7586525498
        up indications 1, broken indications 0
        drops (if) 0, drops (link) 0
        indication: up at 16Dec2010 16:49:58
            consecutive 3095447, transitions 1

VUMEF006>  ifgrp status      
default: transmit 'IP Load balancing', Ifgrp Type 'multi_mode', fail 'log'
vif0: 4 links, transmit 'IP Load balancing', Ifgrp Type 'multi_mode' fail 'default'
     Ifgrp Status    Up     Addr_set 
    up:
    e4b: state up, since 06Jan2011 16:48:56 (14+19:56:40)
        mediatype: auto-1000t-fd-up
        flags: enabled
        input packets 668826329, input bytes 1013962060137
        output packets 5048868, output bytes 590464021
        up indications 1, broken indications 0
        drops (if) 0, drops (link) 0
        indication: up at 06Jan2011 16:48:56
            consecutive 1281303, transitions 1
    e4a: state up, since 06Jan2011 16:48:56 (14+19:56:40)
        mediatype: auto-1000t-fd-up
        flags: enabled
        input packets 2031316, input bytes 133486263
        output packets 60777, output bytes 5758807
        up indications 1, broken indications 0
        drops (if) 0, drops (link) 0
        indication: up at 06Jan2011 16:48:56
            consecutive 1281303, transitions 1
    e3b: state up, since 06Jan2011 16:48:56 (14+19:56:40)
        mediatype: auto-1000t-fd-up
        flags: enabled
        input packets 16087423, input bytes 1783457403
        output packets 224969, output bytes 16381402
        up indications 1, broken indications 0
        drops (if) 0, drops (link) 0
        indication: up at 06Jan2011 16:48:56
            consecutive 1281303, transitions 1
    e3a: state up, since 06Jan2011 16:48:56 (14+19:56:40)
        mediatype: auto-1000t-fd-up
        flags: enabled
        input packets 2303947, input bytes 769575573
        output packets 352882441, output bytes 32534561175
        up indications 1, broken indications 0
        drops (if) 0, drops (link) 0
        indication: up at 06Jan2011 16:48:56
            consecutive 1281303, transitions 1

Re: Snapvault Snapshot network throughput

Is the aggregate more than 85% full? Above that level, performance can suffer.

The command sysstat -x will tell you what is happening in your system.  Is anything maxed out?

You can use the statit command to see what the disks are doing: are they maxed out, are there any hot spots?
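Roughly, the monitoring side could look like this (the 5-second interval and the pause length are only examples; statit needs advanced privilege):

VUMEF004> sysstat -x 5
VUMEF004> priv set advanced
VUMEF004*> statit -b
(let the SnapVault transfer run for a minute or two)
VUMEF004*> statit -e
VUMEF004*> priv set admin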

You could use the pktt start / stop commands to create a packet trace on both filers. Use Wireshark or another tool to see what is happening to your packets: a TCP windowing issue, an MTU problem, retransmits due to inline compression on the WAN?
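A sketch of the trace steps (the interface name and target directory are only examples; depending on the setup you may need to trace the ifgrp member interfaces individually):

VUMEF004> pktt start vif0 -d /etc/log
(reproduce the slow transfer for a minute or so)
VUMEF004> pktt status
VUMEF004> pktt stop vif0
(repeat on VUMEF006, then open the resulting .trc files in Wireshark)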

Hope it helps

Bren

Re: Snapvault Snapshot network throughput

Hi,

The two filers are actually FAS3140s, not 3040s; I'm always mixing that up. Right now there is inbound traffic to the data filer; the Windows client that is writing to the share is connected via a different interface. Additionally, I triggered a SnapVault transfer to run right now, and it hits the same maximum throughput it did during the night (30-40 MB/s).

The attached graph shows that the throughput seems to hit some limit; the maximum throughput varies by only a few percent over the run time. I can't see much of a performance problem in the sysstat output.
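(A manual update can be triggered from the secondary roughly like this; the qtree name is anonymized, as further down in the thread:)

VUMEF006> snapvault update /vol/VUMEF006_svd_VUMEF004_nas_vol009/xxxxx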

snapvault source


CPU    NFS   CIFS   HTTP   Total     Net   kB/s    Disk   kB/s    Tape   kB/s  Cache  Cache    CP  CP  Disk   OTHER    FCP  iSCSI     FCP   kB/s   iSCSI   kB/s
                                       in    out    read  write    read  write    age    hit  time  ty  util                            in    out      in    out
21%      0    544      0     547   35962  38127   33124    340       0      0     6s    40%   11%  :    10%       3      0      0       0      0       0      0
33%      0    478      0     478   31794  38770   36564  53252       0      0     4s    78%   52%  Ff   18%       0      0      0       0      0       0      0
29%      0    518      0     518   34271  39376   40112  60880       0      0     6s    78%  100%  :f   20%       0      0      0       0      0       0      0
24%      0    498      0     498   33063  39318   45412  18284       0      0    24     57%   42%  :    14%       0      0      0       0      0       0      0
20%      0    548      0     548   36234  38758   30856      0       0      0    24     32%    0%  -     8%       0      0      0       0      0       0      0
37%      0    498      0     503   33023  39269   41329  57277       0      0    24     76%   85%  Ff   18%       5      0      0       0      0       0      0
28%      0    494      0     494   32792  39539   35468  57084       0      0     4s    77%  100%  :f   15%       0      0      0       0      0       0      0
23%      0    522      0     522   34603  40132   35548  18608       0      0    24     56%   43%  :    15%       0      0      0       0      0       0      0
30%      0    478      0     478   31638  39340   43476  18976       0      0     5s    62%   20%  Ff   13%       0      0      0       0      0       0      0
30%      0    535      0     535   35432  39820   40060  61767       0      0     5s    81%  100%  :f   19%       0      0      0       0      0       0      0
29%      0    534      0     537   35431  42297   37572  51476       0      0    24     79%   96%  :    17%       3      0      0       0      0       0      0
21%      0    546      0     546   36174  41229   35872     24       0      0    24     39%    0%  -     8%       0      0      0       0      0       0      0
39%      0    492      0     556   32603  37453   33676  57352       0      0     4s    82%   62%  Ff   18%      64      0      0       0      0       0      0
31%      0    518      0     518   34351  38045   42376  61324       0      0     4s    75%  100%  :f   19%       0      0      0       0      0       0      0
23%      0    545      0     545   35997  36496   38548  13780       0      0     4s    59%   34%  :    10%       0      0      0       0      0       0      0
28%      0    498      0     501   32909  36817   33022   3485       0      0     7s    40%   10%  Fn    9%       3      0      0       0      0       0      0
32%      0    517      0     517   34277  39103   42443  67154       0      0     4s    79%  100%  :f   20%       0      0      0       0      0       0      0
29%      0    518      0     518   34322  38960   32792  61304       0      0     9s    79%  100%  :f   17%       0      0      0       0      0       0      0



snapvault destination


CPU    NFS   CIFS   HTTP   Total     Net   kB/s    Disk   kB/s    Tape   kB/s  Cache  Cache    CP  CP  Disk   OTHER    FCP  iSCSI     FCP   kB/s   iSCSI   kB/s
                                       in    out    read  write    read  write    age    hit  time  ty  util                            in    out      in    out
26%      0      0      0       0   38315    848    3836  59486       0      0     4s   100%   63%  Ff   12%       0      0      0       0      0       0      0
18%      0      0      0       0   37695    835       8  63256       0      0     4s   100%  100%  :f   10%       0      0      0       0      0       0      0
14%      0      0      0       0   39143    866     204  12792       0      0     4s   100%   30%  :    12%       0      0      0       0      0       0      0
24%      0      0      0       3   38654    855    5348  34448       0      0     4s   100%   40%  Ff   11%       3      0      0       0      0       0      0
18%      0      0      0       0   37276    825    1800  62192       0      0     4s   100%  100%  :f   10%       0      0      0       0      0       0      0
17%      0      0      0       0   38087    843     836  38444       0      0     4s   100%   78%  :    13%       0      0      0       0      0       0      0
22%      0      0      0       0   39770    883    2412  23292       0      0     4s   100%   27%  Ff    7%       0      0      0       0      0       0      0
19%      0      0      0       0   39647    881     692  63520       0      0    26    100%  100%  :f   11%       0      0      0       0      0       0      0
17%      0      0      0       3   38304    848     808  47420       0      0    26    100%   91%  :    20%       3      0      0       0      0       0      0
19%      0      0      0       0   40060    886    1175   7936       0      0     4s   100%    9%  Fn    5%       0      0      0       0      0       0      0
24%      0      0      0       0   42355    936    6378  83141       0      0    26    100%  100%  :f   16%       0      0      0       0      0       0      0
17%      0      0      0       0   38834    859     428  44140       0      0    26    100%   82%  :    14%       0      0      0       0      0       0      0
12%      0      0      0       0   36964    818       0      8       0      0     4s   100%    0%  -     2%       0      0      0       0      0       0      0
26%      0      0      0       5   35451    785    3360  83496       0      0     4s   100%   97%  Ff   16%       5      0      0       0      0       0      0
18%      0      0      0       0   36488    808    1364  52212       0      0     4s   100%   96%  :    14%       0      0      0       0      0       0      0
12%      0      0      0       0   36823    815       0      0       0      0     4s   100%    0%  -     0%       0      0      0       0      0       0      0
26%      0      0      0       0   38536    853    3904  60112       0      0     5s   100%   59%  Ff   13%       0      0      0       0      0       0      0

Re: Snapvault Snapshot network throughput

Ah, I forgot: the volume is new and only 20% used.

source

VUMEF004> df -Vh /vol/VUMEF004_nas_vol009/
Filesystem               total       used      avail capacity  Mounted on
/vol/VUMEF004_nas_vol009/     7782GB     1533GB     6248GB      20%  /vol/VUMEF004_nas_vol009/
/vol/VUMEF004_nas_vol009/.snapshot      409GB      878MB      408GB       0%  /vol/VUMEF004_nas_vol009/.snapshot

destination

VUMEF006> df -Vh /vol/VUMEF006_svd_VUMEF004_nas_vol009/
Filesystem               total       used      avail capacity  Mounted on
/vol/VUMEF006_svd_VUMEF004_nas_vol009/     7782GB     1094GB     6688GB      14%  /vol/VUMEF006_svd_VUMEF004_nas_vol009/
/vol/VUMEF006_svd_VUMEF004_nas_vol009/.snapshot      409GB     4219MB      405GB       1%  /vol/VUMEF006_svd_VUMEF004_nas_vol009/.snapshot

Re: Snapvault Snapshot network throughput

It is fair to say the system is not maxed out on CPU, disk I/O, or network. It looks like you are using well under half of a 1 Gb pipe, so it should be able to do more.

Have you confirmed that the SnapVault relationship does not have throttling enabled? Check with: snapvault modify filer:/vol/volname/qtree
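For example (paths are placeholders; -k sets the per-relationship throttle in kB/s, and 'unlimited' clears it):

VUMEF006> snapvault status -c
VUMEF006> snapvault modify VUMEF006:/vol/volname/qtree
VUMEF006> snapvault modify -k unlimited VUMEF006:/vol/volname/qtree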

Bren

Re: Snapvault Snapshot network throughput

Also have a look at the options on both filers for a replication throttle:

options rep
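That should list the replication.* options; the throttle-related ones to check (assuming they read the same as in the output further down) are:

filer> options replication.throttle.enable
filer> options replication.throttle.incoming.max_kbs
filer> options replication.throttle.outgoing.max_kbs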

If there is no throttling, it could be a network throughput issue. Here is how to get a network trace from the filer:

https://kb.netapp.com/support/index?page=content&id=1010155

Good luck

Bren

Re: Snapvault Snapshot network throughput

What is the round trip latency between SnapVault primary and secondary systems?

Re: Snapvault Snapshot network throughput

Looks OK to me:

VUMEF006> snapvault modify VUMEF006:/vol/VUMEF006_svd_VUMEF004_nas_vol009/xxxxx
No changes in the configuration.
Configuration for qtree /vol/VUMEF006_svd_VUMEF004_nas_vol009/xxxxx is:
/vol/VUMEF006_svd_VUMEF004_nas_vol009/xxxxx source=VUMEF004:/vol/VUMEF004_nas_vol009/xxxxx kbs=unlimited tries=2 back_up_open_files=on,ignore_atime=off,utf8_primary_path=off

Re: Snapvault Snapshot network throughput

VUMEF004> options rep
replication.logical.reserved_transfers 0          (value might be overwritten in takeover)
replication.throttle.enable  off       
replication.throttle.incoming.max_kbs unlimited 
replication.throttle.outgoing.max_kbs unlimited 
replication.volume.reserved_transfers 0          (value might be overwritten in takeover)
replication.volume.use_auto_resync off        (value might be overwritten in takeover)

VUMEF006> options rep
replication.logical.reserved_transfers 0          (value might be overwritten in takeover)
replication.throttle.enable  off       
replication.throttle.incoming.max_kbs unlimited 
replication.throttle.outgoing.max_kbs unlimited 
replication.volume.reserved_transfers 0          (value might be overwritten in takeover)
replication.volume.use_auto_resync off        (value might be overwritten in takeover)

Re: Snapvault Snapshot network throughput

VUMEF004> ping -s vumef006
64 bytes from vumef006.rd.corpintra.net (xx.62.180.179): icmp_seq=0 ttl=250 time=42.132 ms
64 bytes from vumef006.rd.corpintra.net (xx.62.180.179): icmp_seq=1 ttl=250 time=56.589 ms
64 bytes from vumef006.rd.corpintra.net (xx.62.180.179): icmp_seq=2 ttl=250 time=40.971 ms
64 bytes from vumef006.rd.corpintra.net (xx.62.180.179): icmp_seq=3 ttl=250 time=54.194 ms
64 bytes from vumef006.rd.corpintra.net (xx.62.180.179): icmp_seq=4 ttl=250 time=39.315 ms

VUMEF006> ping -s vumef004
64 bytes from vumef004.rd.corpintra.net (xx.60.6.232): icmp_seq=0 ttl=250 time=46.550 ms
64 bytes from vumef004.rd.corpintra.net (xx.60.6.232): icmp_seq=1 ttl=250 time=55.573 ms
64 bytes from vumef004.rd.corpintra.net (xx.60.6.232): icmp_seq=2 ttl=250 time=43.920 ms
64 bytes from vumef004.rd.corpintra.net (xx.60.6.232): icmp_seq=3 ttl=250 time=51.817 ms
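As a rough sanity check (assuming the transfer runs over a single TCP stream, which is an assumption here), the bandwidth-delay product with this RTT is in the same ballpark as the observed ceiling:

observed throughput × RTT   ≈ 30 MB/s × 0.045 s ≈ 1.35 MB effective window
window needed for ~1 Gbit/s ≈ 125 MB/s × 0.045 s ≈ 5.6 MB

So a limited TCP window combined with ~40-55 ms of latency could by itself explain a 30-40 MB/s cap.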

I have to look into this again on Monday; then I will contact the network people again and our NetApp support. Thanks so far!