ONTAP Discussions

Intercluster XDP (SnapVault) performance over 10G

mark_schuren

Hi all,

I'm having performance issues with Snapmirror XDP relationships between two clusters (primary 4-node 3250, secondary 2-node 3220, all running cDOT 8.2P5).

The replication LIFs on both sides are pure 10G (private replication VLAN), flat network (no routing), using jumbo frames (however, I also tested without jumbo and the problem persists).

The vaulting works in general, but average throughput for a single node/relationship never goes beyond 1 Gbit/s - most of the time it is even much slower (300-500 Mbit/s or less).

I verified that neither the source node(s) nor the destination node(s) are CPU or disk bound during the transfers (at least not all of the source nodes).

I also verified the SnapMirror traffic is definitely going through the 10G interfaces.

There is no compressed volume involved, only dedupe (on source).

The dedupe schedules are not within the same time window.

Also checked the physical network interface counters on switch and netapps, no errors / drops, clean.

However, the replication is SLOW, no matter what I try.

Customer impression is that it even got slower over time, e.g. throughput of a source node was ~1 Gbit/s when the relationship(s) were initialized (not as high as expected), and dropped to ~500 Mbit/s after some months of operation / regular updates...

Meanwhile the daily update (of all relationships) sums up to ~1.4 TB per day, and it takes almost the whole night to finish 😞
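A rough back-of-envelope (my arithmetic, not measured) of what that daily volume implies at the rates we're seeing:

```python
# Back-of-envelope: hours needed to move a ~1.4 TB nightly update at
# various sustained transfer rates. My arithmetic, decimal terabytes.
TB = 10**12

def transfer_hours(payload_bytes: float, rate_mbit_s: float) -> float:
    """Hours needed to move payload_bytes at a sustained rate in Mbit/s."""
    rate_bytes_s = rate_mbit_s * 1e6 / 8
    return payload_bytes / rate_bytes_s / 3600

for rate in (500, 1000, 5000):  # Mbit/s
    print(f"{rate:>5} Mbit/s -> {transfer_hours(1.4 * TB, rate):.1f} h")
```

At the observed ~500 Mbit/s the nightly window is nearly consumed; at anything close to line rate it would finish in well under an hour.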

So the question is: how to tune that?

Is anyone having similar issues regarding Snapmirror XDP throughput over 10Gbit Ethernet?

Are there any configurable parameters (network compression? TCP win size? TCP delayed ack's? Anything I don't think of?) on source/destination side?

Thankful for all ideas / comments,

Mark

18 REPLIES

bobshouseofcards

Can't compare to doing XDP relationships over 10G, but I can share my experience running standard Snapmirror (DP) over long distance 10G (1000 mile replication, latency 24ms round trip). 

 

My sources are two 4-node clusters - "small" (2x6240,2x6220) and "big" (2x6280, 2x6290) both replicating to a single 4 node (4x8060) cluster at the target.  The "small" source cluster is mostly high capacity disks (3/4TB), the "big" source cluster is all performance disk (600/900GB).  All replications are to capacity disk (4TB) at the destination end.

 

My prep test was to run volume moves internally on both source clusters.  Volume move for all intents is an internal snapmirror (with obvious bonuses for keeping it live all the time).  I used these tests to set expectations of how fast a snapmirror might function.  At best my small cluster would go just under 1Gbps for volume moves and my big cluster would go closer to 1.5Gbps.  These set the limits for the single-replication speed I'd expect.

 

For the replications, I maxed out the WAN receive buffers ahead of time (because of the 24ms latency in the network):

 

network connections*> options buffer show
Service     Layer 4 Protocol Network Receive Buffer Size (KB) Auto-Tune?
----------- ---------------- ------- ------------------------ ----------
ctlopcp     TCP              WAN     7168                     true
ctlopcp     TCP              LAN     256                      false
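Rough arithmetic (mine, not a NetApp sizing rule) for why the WAN buffer matters at this distance: the bandwidth-delay product tells you how much data must be in flight to keep the link busy. 7168 KB is not enough for a lone session to fill 10G at 24 ms, but it comfortably covers a single transfer at the ~1.5 Gbps per-transfer ceiling seen here, so parallel transfers are what fill the pipe:

```python
# Bandwidth-delay product: the receive window needed to keep a link
# full at a given round-trip time. Numbers match the post above:
# 10 Gbit/s link, 24 ms RTT, 7168 KB WAN receive buffer.
def bdp_kb(link_gbit_s: float, rtt_ms: float) -> float:
    """Bytes in flight needed to fill the pipe, in KB."""
    bits_in_flight = link_gbit_s * 1e9 * (rtt_ms / 1e3)
    return bits_in_flight / 8 / 1024

print(f"BDP for full 10G @ 24 ms RTT:      {bdp_kb(10, 24):.0f} KB")
print(f"BDP per transfer @ 1.5 Gbit/s:     {bdp_kb(1.5, 24):.0f} KB")  # fits in 7168 KB
```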

 

I've found that transfers from capacity disk are very much limited by disk speeds.  I control all replications with scripts because the standard schedules don't work well enough to keep the pipe at maximum speed - we replicate updates for about 1800 volumes daily.  For the small cluster, I limit concurrent replications to 3 per aggregate, which seems to be as fast as the aggregate will go before disk limitations slow down individual transfers.  The control script starts updates and monitors every 5 minutes to start new replications in each aggregate.  Even so - the best the "small" cluster has ever achieved is about 5.5Gbps over time, and individual transfers are always less than 1Gbps.  The best I've ever seen on an individual transfer is about 700Mbps.

 

The large cluster just fires on usual schedules without regard for aggregates.  Nodes don't seem to sneeze at it much - the large cluster can easily fill the 10G pipe with multiple transfers, though individual transfers max out near the 1.5Gbps mark similar to the internal move.

 

 

 

We are facing a similar problem: basically SnapMirror/SnapVault is slow, even on 1 Gbps links, with speeds reaching only 10-15% of the available bandwidth. In our troubleshooting, we have found that the common ground for this problem to occur is that the intercluster LIF(s) are on etherchannel ports (ifgrp) that are using port-based load balancing. When using IP-based load balancing (which is the default when creating ifgrps), or even working active-passively, we are not seeing this performance degradation. Furthermore, if we do use port-based load balancing on the source controller, yet disable all but one physical port, SnapMirror/SnapVault throughput is very good. Note that SnapMirror/SnapVault in cDOT tends to open a multitude of TCP sessions between source and destination system.

 

So I am wondering if any of you facing SnapVault performance issues on a 10 Gbps network are using port-based load balancing, by any chance?
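To illustrate why the load-balancing mode matters for the many TCP sessions a single relationship opens between one LIF pair: with IP-based hashing all sessions share the same source/destination addresses and so land on one ifgrp member, while port-based hashing spreads them across members. A toy model (the hash functions here are simplified stand-ins, NOT ONTAP's actual algorithm; port 11104 is the intercluster ctlopcp port):

```python
# Toy model of ifgrp member selection for the parallel TCP sessions a
# single SnapMirror relationship opens between one LIF pair.
# Hash functions are simplified stand-ins, not ONTAP's algorithm.
SRC_IP, DST_IP = "10.230.4.33", "10.230.4.41"
MEMBERS = 2  # physical ports in the ifgrp

def ip_key(src_ip: str, dst_ip: str) -> int:
    # toy hash over addresses only: sum of all octets
    return sum(int(octet) for octet in (src_ip + "." + dst_ip).split("."))

def ip_based(src_ip: str, dst_ip: str) -> int:
    return ip_key(src_ip, dst_ip) % MEMBERS

def port_based(src_ip: str, dst_ip: str, sport: int, dport: int) -> int:
    return (ip_key(src_ip, dst_ip) + sport + dport) % MEMBERS

# eight parallel sessions between the same LIF pair, varying source ports
sessions = [(SRC_IP, DST_IP, 50000 + i, 11104) for i in range(8)]
ip_links = {ip_based(s, d) for s, d, _, _ in sessions}
port_links = {port_based(*sess) for sess in sessions}
print(f"IP-based:   {len(ip_links)} ifgrp member(s) carry the traffic")
print(f"port-based: {len(port_links)} ifgrp member(s) carry the traffic")
```

The spreading across members under port-based hashing is exactly where we see the degradation, which would fit segment reordering between physical links.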

NetApp_SEAL

Hi Mark,

As you saw my other posts in a similar thread...

You say it's a flat network (no routing)...Ok. But with SnapMirror, the routing groups / routes themselves on the clusters can be tricky (as I have discovered in a weird issue I was previously having).

Can you output the following?

cluster::> network routing groups show

cluster::> network routing groups route show

Can you output a trace from a source IC LIF to a destination IC LIF as well, please?

Also - check this KB (if you have not already): https://library.netapp.com/ecmdocs/ECMM1278318/html/onlinebk/protecting/task/t_oc_prot_sm-adjust-tcp-window-size.html

Hope I can attempt to help out!

Thanks!
Trey

mark_schuren

Hi, thanks for your info.

I have no gateway (no route) defined in my intercluster routing groups - because these IC lifs are on a flat subnet (low latency) and should only communicate within this local network.

The strange thing is, if I migrate my SOURCE IC LIF to a 1gig port, the traffic flows with constant 110MByte/s.

As soon as I migrate it to a 10gig port, traffic is very peaky and never constantly above 100 MByte/sec. On average it is even slower... Same network, same Nexus, same transfer...

I have played a lot with window sizes already, but I think in cDOT (at least in the latest releases) this is configured within the cluster scope, not node-scope, as stated here:

https://kb.netapp.com/support/index?page=content&id=1013603

At the moment I have:

ucnlabcm04::*> net connections options buffer show (/modify)

  (network connections options buffer show)

Service      Layer 4 Protocol  Network  Receive Buffer Size (KB) Auto-Tune?

-----------  ----------------  -------  ------------------------ ----------

ctlopcp      TCP               WAN      512                      false

ctlopcp      TCP               LAN      256                      false

The default ("WAN" = intercluster) is 2048 (2MB receive window size) with auto-tuning enabled for WAN.

I've played with different values on the destination. Changing the values may require resetting all TCP connections, e.g. by rebooting or setting the IC LIFs down for a while.
The configured window sizes are then definitely in effect according to netstat -n.

However my problem remains: throughput is flaky as soon as the source IC LIFs are on 10G...
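A quick sanity check (my arithmetic, with assumed LAN round-trip times) suggests the window size is not what caps these transfers: even the reduced 512 KB window sustains multiple Gbit/s at sub-millisecond RTT.

```python
# Window-limited TCP throughput: window / RTT. On a flat 10G segment
# RTT is typically well under 1 ms (assumed values below), so even a
# 512 KB receive window should sustain several Gbit/s per session.
def max_throughput_gbit(window_kb: float, rtt_ms: float) -> float:
    """Upper bound on a single TCP session's rate, in Gbit/s."""
    return (window_kb * 1024 * 8) / (rtt_ms / 1e3) / 1e9

for rtt in (0.2, 0.5, 1.0):  # plausible LAN round-trip times, ms
    print(f"512 KB window @ {rtt} ms RTT: {max_throughput_gbit(512, rtt):.1f} Gbit/s")
```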

Here is info about my source cluster (at the moment it consists only of nodes 03 and 04) - I am testing with a single volume from node 03 as the source node.

ucnlabcm01::> net port show -role intercluster

  (network port show)

                                      Auto-Negot  Duplex     Speed (Mbps)

Node   Port   Role         Link   MTU Admin/Oper  Admin/Oper Admin/Oper

------ ------ ------------ ---- ----- ----------- ---------- ------------

ucnlabcm01-03

       e0b    intercluster up    1500  true/true  full/full   auto/1000

       e2b    intercluster up    1500  true/true  full/full   auto/10000

ucnlabcm01-04

       e0b    intercluster up    1500  true/true  full/full   auto/1000

ucnlabcm01::> net int show -role intercluster

  (network interface show)

            Logical    Status     Network            Current       Current Is

Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home

----------- ---------- ---------- ------------------ ------------- ------- ----

ucnlabcm01-03

            repl1        up/up    10.230.4.33/16     ucnlabcm01-03 e2b     true

ucnlabcm01-04

            repl1        up/up    10.230.4.34/16     ucnlabcm01-04 e0b     true

ucnlabcm01::> net routing-groups show

Vserver   Group     Subnet          Role         Metric

--------- --------- --------------- ------------ -------

ucnlabcm01

          c10.230.0.0/16

                    10.230.0.0/16   cluster-mgmt      20

ucnlabcm01-03

          c169.254.0.0/16

                    169.254.0.0/16  cluster           30

          i10.230.0.0/16

                    10.230.0.0/16   intercluster      40

          n10.210.0.0/16

                    10.210.0.0/16   node-mgmt         10

ucnlabcm01-04

          c169.254.0.0/16

                    169.254.0.0/16  cluster           30

          i10.230.0.0/16

                    10.230.0.0/16   intercluster      40

          n10.210.0.0/16

                    10.210.0.0/16   node-mgmt         10

...

And here is my destination cluster:

ucnlabcm04::*> net port show -role intercluster

  (network port show)

                                      Auto-Negot  Duplex     Speed (Mbps)

Node   Port   Role         Link   MTU Admin/Oper  Admin/Oper Admin/Oper

------ ------ ------------ ---- ----- ----------- ---------- ------------

ucnlabcm04-01

       e1a-230

                intercluster up    1500  true/true  full/full   auto/10000

       e2d    intercluster up    1500  true/true  full/full   auto/1000

ucnlabcm04::> net int show -role intercluster

  (network interface show)

            Logical    Status     Network            Current       Current Is

Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home

----------- ---------- ---------- ------------------ ------------- ------- ----

ucnlabcm04-01

            repl1        up/up    10.230.4.41/16     ucnlabcm04-01 e1a-230 true

ucnlabcm04::> routing-groups show -vserver ucnlabcm04-01

  (network routing-groups show)

          Routing

Vserver   Group     Subnet          Role         Metric

--------- --------- --------------- ------------ -------

ucnlabcm04-01

          i10.230.0.0/16

                    10.230.0.0/16   intercluster      40

          n10.210.0.0/16

                    10.210.0.0/16   node-mgmt         10

ucnlabcm04::> routing-groups route show -vserver ucnlabcm04-01

  (network routing-groups route show)

          Routing

Vserver   Group     Destination     Gateway         Metric

--------- --------- --------------- --------------- ------

ucnlabcm04-01

          n10.210.0.0/16

                    0.0.0.0/0       10.210.254.254  10

The destination node is a dedicated replication destination

Source LIF running on 1G

ucnlabcm04::> node run -node local sysstat -x 1

CPU    NFS   CIFS   HTTP   Total     Net   kB/s    Disk   kB/s    Tape   kB/s  Cache  Cache    CP  CP  Disk   OTHER    FCP  iSCSI     FCP   kB/s   iSCSI   kB/s

                                       in    out    read  write    read  write    age    hit  time  ty  util                            in    out      in    out

59%      0      0      0      13  117390   2903    1340  84500       0      0   >60     92%   50%  :    50%      13      0      0       0      0       0      0

64%      0      0      0       0  115171   2866    5680 161280       0      0   >60     94%   85%  H    70%       0      0      0       0      0       0      0

63%      0      0      0       0  119723   2974    5099 158230       0      0   >60     94%   78%  Hf   67%       0      0      0       0      0       0      0

66%      0      0      0       1  116859   2902    5396 160052       0      0   >60     93%   69%  Hf   70%       1      0      0       0      0       0      0

81%      0      0      0    5200  113367   2806    6492 127552       0      0   >60     93%   57%  Hf   55%    5200      0      0       0      0       0      0

86%      0      0      0    7007  115025   2847    5634 107421       0      0   >60     93%   52%  Hf   49%    7007      0      0       0      0       0      0

86%      0      0      0    6533  117410   2925    2653  86653       0      0   >60     92%   70%  :    62%    6529      4      0       0      0       0      0

91%      0      0      0    6943  117531   2920    4888 160884       0      0   >60     93%   71%  H    65%    6943      0      0       0      0       0      0

71%      0      0      0    2056  109902   2863    4931 161846       0      0   >60     94%   69%  H    64%    2056      0      0       0      0       0      0

64%      0      0      0     147  115349   2855    7048 133812       0      0   >60     92%   50%  Hf   47%     147      0      0       0      0       0      0

61%      0      0      0      14  114891   2844    6315 119106       0      0   >60     92%   68%  Hf   55%      14      0      0       0      0       0      0

63%      0      0      0       0  118481   2928    3976  80359       0      0   >60     91%   64%  Hs   51%       0      0      0       0      0       0      0

61%      0      0      0       0  117321   2923    2845 149546       0      0   >60     93%   67%  :    66%       0      0      0       0      0       0      0

62%      0      0      0       0  117185   2913    3992 160355       0      0   >60     93%   69%  H    61%       0      0      0       0      0       0      0

63%      0      0      0       0  115966   2886    6422 152112       0      0   >60     93%   66%  Hf   56%       0      0      0       0      0       0      0

64%      0      0      0      13  118045   2938    7341 130641       0      0   >60     93%   63%  Hf   55%      13      0      0       0      0       0      0

62%      0      0      0       0  117927   2933    4692 136488       0      0   >60     93%   72%  Hf   60%       0      0      0       0      0       0      0

62%      0      0      0     109  119707   2967    2830  95054       0      0   >60     92%   51%  Hs   58%     109      0      0       0      0       0      0

63%      0      0      0     305  117229   3072    2245 129655       0      0   >60     94%   61%  :    62%     305      0      0       0      0       0      0

Source LIF Running on 10G

ucnlabcm04::> node run -node local sysstat -x 1

CPU    NFS   CIFS   HTTP   Total     Net   kB/s    Disk   kB/s    Tape   kB/s  Cache  Cache    CP  CP  Disk   OTHER    FCP  iSCSI     FCP   kB/s   iSCSI   kB/s

                                       in    out    read  write    read  write    age    hit  time  ty  util                            in    out      in    out

14%      0      0      0     104   22012    588     836  39936       0      0   >60     90%  100%  :f   33%     104      0      0       0      0       0      0

16%      0      0      0     166   20780    564     995  35692       0      0   >60     91%  100%  :f   34%     166      0      0       0      0       0      0

21%      0      0      0       2   38318   1021     788  29900       0      0   >60     90%   80%  :    36%       2      0      0       0      0       0      0

27%      0      0      0       0   33097    886    1476      0       0      0   >60     87%    0%  -     6%       0      0      0       0      0       0      0

19%      0      0      0     143   22712    632   10713  52947       0      0   >60     90%   84%  Hf   43%     143      0      0       0      0       0      0

14%      0      0      0      20    2645     70     964  50608       0      0   >60     87%  100%  :f   40%      20      0      0       0      0       0      0

  8%      0      0      0       1      17      1    1176  52596       0      0   >60     90%  100%  :v   50%       1      0      0       0      0       0      0

18%      0      0      0       1   30655    795     880     12       0      0   >60     88%    5%  :    11%       1      0      0       0      0       0      0

20%      0      0      0       1   23746    663    1144      0       0      0   >60     88%    0%  -     8%       1      0      0       0      0       0      0

23%      0      0      0       3   29419    762    1340      0       0      0   >60     88%    0%  -     9%       3      0      0       0      0       0      0

30%      0      0      0      16   81171   2189    8796  41272       0      0   >60     92%   36%  Hf   31%      16      0      0       0      0       0      0

19%      0      0      0       1   23010    610    1168 114004       0      0   >60     90%   83%  :    67%       1      0      0       0      0       0      0

21%      0      0      0       1   22341    630    1412      0       0      0   >60     88%    0%  -    13%       1      0      0       0      0       0      0

17%      0      0      0       0   14863    422     932      0       0      0   >60     88%    0%  -     8%       0      0      0       0      0       0      0

27%      0      0      0       0   19099    499    1492      0       0      0   >60     87%    0%  -     7%       0      0      0       0      0       0      0

21%      0      0      0      16   29642    765    6504  94544       0      0   >60     91%   90%  Hf   55%      16      0      0       0      0       0      0

16%      0      0      0       1   19368    568    3080  61488       0      0   >60     89%   96%  :    63%       1      0      0       0      0       0      0

21%      0      0      0       0    1121     29    1363      0       0      0   >60     86%    0%  -     8%       0      0      0       0      0       0      0

26%      0      0      0       0   65267   1753    1380      0       0      0   >60     89%    0%  -    13%       0      0      0       0      0       0      0

18%      0      0      0     143   17263    490    1036      0       0      0   >60     88%    0%  -     6%     143      0      0       0      0       0      0

I just don't get it.

NetApp_SEAL

And as a follow-up for additional information (after validating with another resource):

- Routing groups define rules for all traffic, not just layer 3

- Routing group routes are specific to layer 3

- The default gateway of a cluster is defined by the cluster management LIF

- The default gateway of a node is defined by the node management LIF

So here, if you have the option of moving the cluster management LIF to another VLAN (can it go on the same network as the node management LIFs?), that might be a worthy step in testing.

mark_schuren

Here is all info from my source cluster (info regarding data-vservers was stripped):

ucnlabcm01::> routing-groups show

  (network routing-groups show)

          Routing

Vserver   Group     Subnet          Role         Metric

--------- --------- --------------- ------------ -------

ucnlabcm01

          c10.230.0.0/16

                    10.230.0.0/16   cluster-mgmt      20

ucnlabcm01-03

          c169.254.0.0/16

                    169.254.0.0/16  cluster           30

          i10.230.0.0/16

                    10.230.0.0/16   intercluster      40

          n10.210.0.0/16

                    10.210.0.0/16   node-mgmt         10

ucnlabcm01-04

          c169.254.0.0/16

                    169.254.0.0/16  cluster           30

          i10.230.0.0/16

                    10.230.0.0/16   intercluster      40

          n10.210.0.0/16

                    10.210.0.0/16   node-mgmt         10

ucnlabcm01::> routing-groups route show

  (network routing-groups route show)

          Routing

Vserver   Group     Destination     Gateway         Metric

--------- --------- --------------- --------------- ------

ucnlabcm01

          c10.230.0.0/16

                    0.0.0.0/0       10.230.254.254  20

ucnlabcm01-03

          n10.210.0.0/16

                    0.0.0.0/0       10.210.254.254  10

ucnlabcm01-04

          n10.210.0.0/16

                    0.0.0.0/0       10.210.254.254  10

ucnlabcm01::> run -node ucnlabcm01-03 route -gsn

Routing tables

Routing group: __default_grp

Internet:

Destination      Gateway            Flags     Refs     Use  Interface          

127.0.0.1        127.0.0.1          UH          0        0  lo                 

127.0.10.1       127.0.20.1         UHS         4  3072302  losk               

Routing group: ucnlabcm01-03_c169.254.0.0/16

Internet:

Destination      Gateway            Flags     Refs     Use  Interface          

169.254          link#3             UC          0        0  e1a                

169.254.77.92    90:e2:ba:3d:e3:14  UHL         3  2237583  lo                 

169.254.175.117  90:e2:ba:2b:38:b8  UHL        87 56775566  e1a                

169.254.200.34   90:e2:ba:2b:36:54  UHL         0    75520  lo                 

169.254.204.211  90:e2:ba:37:51:f4  UHL        93 52139103  e1a                

Routing group: ucnlabcm01-03_n10.210.0.0/16

Internet:

Destination      Gateway            Flags     Refs     Use  Interface          

default          10.210.254.254     UGS         0   352399  e0M                

10.210/16        link#13            UC          0        0  e0M                

10.210.1.7       0:a0:98:1a:82:bb   UHL         0     6624  e0M                

10.210.4.11      link#13            UHRL        0     3983  e0M                

10.210.104.14    0:a0:98:13:d2:4    UHL         0        1  e0M                

10.210.254.1     54:7f:ee:b9:22:bc  UHL         0        0  e0M                

10.210.254.2     54:7f:ee:bb:60:3c  UHL         0        0  e0M                

10.210.254.6     0:12:1:72:6e:ff    UHL         0      984  e0M                

10.210.254.41    0:a0:98:e7:3b:bc   UHL         0  1384574  e0M                

10.210.254.42    0:a0:98:e7:39:12   UHL         0  1330125  e0M                

10.210.254.254   0:0:c:9f:f0:d2     UHL         1        0  e0M                

Routing group: ucnlabcm01-03_i10.230.0.0/16

Internet:

Destination      Gateway            Flags     Refs     Use  Interface          

10.230/16        link#6             UC          0        0  e2b                

10.230.4.41      0:c0:dd:26:3:5c    UHL        28    36779  e2b                

10.230.254.1     54:7f:ee:b9:22:bc  UHL         0        0  e2b                

10.230.254.2     54:7f:ee:bb:60:3c  UHL         0        0  e2b 

Destination (ucnlabcm04 is a single node cluster):

ucnlabcm04::> routing-groups show

  (network routing-groups show)

          Routing

Vserver   Group     Subnet          Role         Metric

--------- --------- --------------- ------------ -------

ucnlabcm04

          c10.230.0.0/16

                    10.230.0.0/16   cluster-mgmt      20

ucnlabcm04-01

          i10.230.0.0/16

                    10.230.0.0/16   intercluster      40

          n10.210.0.0/16

                    10.210.0.0/16   node-mgmt         10

ucnlabcm04::> routing-groups route show

  (network routing-groups route show)

          Routing

Vserver   Group     Destination     Gateway         Metric

--------- --------- --------------- --------------- ------

ucnlabcm04

          c10.230.0.0/16

                    0.0.0.0/0       10.230.254.254  20

ucnlabcm04-01

          n10.210.0.0/16

                    0.0.0.0/0       10.210.254.254  10

ucnlabcm04::> run -node ucnlabcm04-01 route -gsn

Routing tables

Routing group: __default_grp

Internet:

Destination      Gateway            Flags     Refs     Use  Interface          

127.0.0.1        127.0.0.1          UH          0        0  lo                 

127.0.10.1       127.0.20.1         UHS         4    20424  losk               

Routing group: ucnlabcm04-01_n10.210.0.0/16

Internet:

Destination      Gateway            Flags     Refs     Use  Interface          

default          10.210.254.254     UGS         0   119508  e0M                

10.210/16        link#11            UC          0        0  e0M                

10.210.254.1     54:7f:ee:b9:22:bc  UHL         0        0  e0M                

10.210.254.2     54:7f:ee:bb:60:3c  UHL         0        0  e0M                

10.210.254.6     0:12:1:72:6e:ff    UHL         0      116  e0M                

10.210.254.254   link#11            UHL         1        0  e0M                

Routing group: ucnlabcm04_c10.230.0.0/16

Internet:

Destination      Gateway            Flags     Refs     Use  Interface          

default          10.230.254.254     UGS         0  2011735  e1b-230            

10.230/16        link#20            UC          0        0  e1b-230            

10.230.3.21      0:50:56:b7:7e:85   UHL         0        0  e1b-230            

10.230.3.120     0:25:b5:0:1:bf     UHL         0     2984  e1b-230            

10.230.3.121     0:50:56:84:4:24    UHL         0       13  e1b-230            

10.230.3.126     0:50:56:84:5c:df   UHL         0        0  e1b-230            

10.230.3.129     0:50:56:84:1:33    UHL         0   351729  e1b-230            

10.230.3.134     0:50:56:84:72:86   UHL         0    98431  e1b-230            

10.230.3.151     0:50:56:84:24:6d   UHL         0      932  e1b-230            

10.230.3.152     0:50:56:84:24:6b   UHL         0      612  e1b-230            

10.230.81.201    0:50:56:b7:6c:a7   UHL         0        0  e1b-230            

10.230.81.202    0:50:56:b7:11:36   UHL         0        0  e1b-230            

10.230.254.1     54:7f:ee:b9:22:bc  UHL         0        0  e1b-230            

10.230.254.2     54:7f:ee:bb:60:3c  UHL         0        0  e1b-230            

10.230.254.254   0:0:c:9f:f0:e6     UHL         1        0  e1b-230            

Routing group: ucnlabclu04-01_i10.230.0.0/16

Internet:

Destination      Gateway            Flags     Refs     Use  Interface          

10.230/16        link#17            UC          0        0  e1a-230            

10.230.4.33      90:e2:ba:3d:e3:15  UHL        29 224063345  e1a-230            

10.230.4.34      0:a0:98:13:d1:ff   UHL        28    18292  e1a-230     

I don't see any gateway involved for any snapmirror transfer. The intercluster routing groups do not contain any gateway...

The cluster-management LIF is in the same "server" VLAN as the intercluster LIFs - the "device mgmt" VLAN is only for the "slow" 100MBit SP and e0M ports.

I *could* move it but I don't see a real reason behind it (according to the above outputs)...?

NetApp_SEAL

Ok, cool. So yeah, I get the gateway deal, and that's fine. Working as expected.

What PORT is the cluster management LIF currently on? I know it's on the same VLAN, but what port specifically?

Also - is there the possibility to move the IC LIFs to a 10 Gb ifgrp on the source (similar to how the destination is set up)?

mark_schuren

It is currently on port e0a. I just moved that to e2b (same port as IC lif) and started a new snapmirror relationship. Does not change anything in the behaviour.

However, when I put the IC LIF onto a 1GbE port (e0a or any other 1G port in the correct VLAN), the traffic immediately flows smoothly at 1GBit/s. As soon as I put it back on a 10G port, the traffic is peaky and on average BELOW 1 GBit/s...

We meanwhile have a second support case opened for this issue (for a customer site where management and intercluster are truly different VLANs). So I'm obviously not the only one.

However, thanks a lot for your inputs!

NetApp_SEAL

You're welcome. Indeed it seems like this is part of a larger issue that needs to be addressed.

Perhaps there are not enough folks trying to do SnapMirror / SnapVault over 10 Gb links yet?

Please keep me updated with your interactions with Support. I can assume that they're going to ask a LOT of what has already been covered in this thread, no?

I'll continue to test things on my end with another client and report back here with anything new.

mark_schuren

Oh yes they asked a LOT - I uploaded tons of traces...

But finally we narrowed it down - in my case my destination is losing ethernet frames!

There are frame drops within the destination NIC it seems - no matter which NIC I have the IC LIF sitting on...

These frames are seen on the Nexus, but not in a pktt on the destination, causing DUP ACKs and TCP retransmits (which also do never hit the destination properly)!

This is true for snapmirror traffic ONLY! Performance tests with NFS and iSCSI do not show this behaviour (no drops, no retransmits).

The issue is now at Netapp engineering for finding / fixing the root cause.

However, I have a very good workaround (which speeds up TCP retransmits massively):

1. On the SOURCE node(s) set options ip.tcp.rtt_min to a very low value (e.g. 10 or lower)

2. On the DESTINATION node(s) set options ip.tcp.delack.enable off

3. Reset all intercluster TCP connections (e.g. by down/up the destination's intercluster LIF(s), or reboot one affected node, or whatever helps resetting TCP connections)
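For reference, the three steps above in CLI form (node and LIF names are hypothetical placeholders; option names are as given above, but verify the exact syntax on your release before applying):

```shell
# Workaround sketch for the steps above. Node/LIF names are placeholders;
# these are diag-level nodeshell options -- verify on your release first.

# 1. On each SOURCE node: lower the minimum TCP retransmit timer.
cluster1::> node run -node src-node01 options ip.tcp.rtt_min 10

# 2. On each DESTINATION node: disable TCP delayed ACKs.
cluster2::> node run -node dst-node01 options ip.tcp.delack.enable off

# 3. Reset the intercluster TCP connections, e.g. by bouncing the
#    destination's intercluster LIF:
cluster2::> network interface modify -vserver dst-node01 -lif repl1 -status-admin down
cluster2::> network interface modify -vserver dst-node01 -lif repl1 -status-admin up
```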

Now I at least get constant throughput ~250MByte/sec, and am finally maxing out my disks on the destination - constantly!

Until the root cause is fixed, I can now definitely live with it

Might be interesting for others, so I'm posting my findings here. The issue has been observed when a FAS32xx is replicating to another FAS32xx cluster over 10G (low-latency network) only. Other combinations or models are probably unaffected.

MK

Thank you thank you thank you! 

 

I have been pulling my hair out for the past few weeks trying to figure this exact issue out.  We are replicating from a FAS8020 to a FAS3210 via 10G Appliance ports on a UCS FI.  Each filer is directly connected to the same FI and on a non-routed private VLAN.  Same exact issue.  The snapmirror would run like a dog on the 10G links, even though NFS was working without issue on the same ports.  Putting in your suggested tweaks has improved the replication from 50Mb/sec to 1900Mb/sec! 🙂

 

NetApp_SEAL

Lots of great info.


So - am I reading that correctly in that your Intercluster LIFs are on the same subnet as your cluster management LIF?

What interface are you using for the cluster management LIF?


Can you output the routing groups and routes for both clusters, please?

(I saw in the previous output that you included the routing groups for source cluster, but no routes. For the destination cluster, only the routing groups and routes for that one node are shown. Just trying to get a solid comparison, that's all).

Also - can you give the output for the following command?

node run -node <node> route -gsn

cDOT routing groups handle all the traffic regardless of whether you're working with layer 2 or layer 3. These rules apply based on the routing group configuration within the cluster and its nodes. It's only when you're working with layer 3 that you need a specific routing group route.

So, let's say your cluster management LIF does, in fact, happen to be on a 1 Gb port. Your cluster management LIF belongs to routing group c10.230.0.0/16. Your Intercluster LIFs belong to routing group i10.230.0.0/16. Those routing groups are most likely sharing the same default gateway, and that could lead to traffic routing over the cluster management LIF.

Also - am I reading the output correctly in that you seem to be replicating over different port configuration types? Your source shows the LIF on port e2b (assuming an access port) and your destination shows e1a-230, which is a tagged VLAN over an ifgrp (assuming a trunk port). I've seen a case before where this mismatch definitely leads to problems, but I haven't had a chance to dig back into it. The resolution then was to move the intercluster traffic back to dedicated 1 Gb ports for more consistent traffic flow. I hope to re-visit that specific issue later in the week and troubleshoot like I did with the one I posted about in the other thread.

I know that some of this might seem like grasping at straws, but can't hurt, right?

TWIELGOS2

We have been having this problem for months with plain old snapmirror.  We have a 10G connection between two different clusters, and snapmirror performance has been calculated at around 100Mb/sec - completely unacceptable.

We disabled reallocate per bug 768028, no help.  We disabled throttling, and that helped, but not enough - and as a debug flag, disabling throttling comes with a production performance cost.

We used this to disable the throttle:

    • node run local -command "priv set diag; setflag repl_throttle_enable 0;"
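For completeness, a hedged sketch of checking and reverting that flag. This assumes repl_throttle_enable behaves like other node-level diag flags (printflag to read, setflag to write) and is not persistent across reboots, so it would need re-checking after a takeover/giveback or upgrade:

```shell
# Disable the replication throttle (diag flag, per node)
node run local -command "priv set diag; setflag repl_throttle_enable 0"

# Verify the current value
node run local -command "priv set diag; printflag repl_throttle_enable"

# Restore the default behavior
node run local -command "priv set diag; setflag repl_throttle_enable 1"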

mark_schuren

Thanks for the tips.

I tried both settings (disable free-space realloc on source and destination aggrs, as well as setflag repl_throttle_enable 0 on source and destination nodes), but this did not make things better, maybe slightly but not really.

I meanwhile experimented a bit more, migrated all intercluster interfaces of my source nodes to gigabit ports (instead of vlan-tagged 10gig interfaces).

Although unexpected, this helped quite a lot: it doubled my overall throughput, even though I moved to slower NICs on the source side.
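For anyone wanting to try the same experiment, a sketch of what the LIF move might look like in 8.2 (LIF and port names are placeholders; intercluster LIFs are node-scoped, so repeat per node):

```shell
# Re-home an intercluster LIF onto a dedicated 1 Gb port
network interface modify -vserver <node_vserver> -lif <ic_lif> -home-node <node> -home-port e0a

# Send it back to its (new) home port
network interface revert -vserver <node_vserver> -lif <ic_lif>
```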

Next step is finally open a performance case

officeworks

It smells like flow control on the 10Gb links to me..

Ask the network guys if they see a lot of RX/TX pause packets on the switch ports. You may as well check for switch port errors while you're at it, or work with them to see if there are a lot of retransmits.
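On the cluster side, flow control can be checked and, if needed, disabled per port. A sketch, assuming the common recommendation of turning flow control off on 10GbE ports (coordinate with the switch side, and note the port should not be carrying a LIF while you change it):

```shell
# Show the current administrative flow control setting on the 10G port
network port show -node <node> -port e1b -fields flowcontrol-admin

# Disable flow control on the port (do the same on the switch port)
network port modify -node <node> -port e1b -flowcontrol-admin none
```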

This is the situation where I would have hoped NetApp shipped network performance tools like iperf, so we could validate infrastructure and throughput when a system is put in.

mark_schuren

Checked that already. Switch stats look clean (don't have them anymore).

Ifstat on netapp side looks good:

-- interface  e1b  (54 days, 8 hours, 12 minutes, 26 seconds) --

RECEIVE
Frames/second:    3690  | Bytes/second:    18185k | Errors/minute:       0
Discards/minute:     0  | Total frames:    30935m | Total bytes:     54604g
Total errors:        0  | Total discards:      0  | Multi/broadcast:     7
No buffers:          0  | Non-primary u/c:     0  | Tag drop:            0
Vlan tag drop:       0  | Vlan untag drop:     0  | Vlan forwards:       0
Vlan broadcasts:     0  | Vlan unicasts:       0  | CRC errors:          0
Runt frames:         0  | Fragment:            0  | Long frames:         0
Jabber:              0  | Bus overruns:        0  | Queue drop:          0
Xon:                 0  | Xoff:                0  | Jumbo:               0

TRANSMIT
Frames/second:    3200  | Bytes/second:    12580k | Errors/minute:       0
Discards/minute:     0  | Total frames:    54267m | Total bytes:       343t
Total errors:        0  | Total discards:      0  | Multi/broadcast: 78699
Queue overflows:     0  | No buffers:          0  | Xon:                 6
Xoff:              100  | Jumbo:            3285m | Pktlen:              0
Timeout:             0  | Timeout1:            0

LINK_INFO
Current state:       up | Up to downs:         2  | Speed:           10000m
Duplex:            full | Flowcontrol:       full

So there are some Xon/XOff packets, but very few.

The very same link (same node) carries NFS I/O at 500 MByte/sec (via a different LIF), so I don't think the interface itself has any problems. However, SnapMirror average throughput remains below 50 MByte/sec, no matter what I try.

officeworks

When we had vol move issues between cluster nodes (via 10Gb), we hit bug 768028 after opening a performance case. This may or may not be related to what you're seeing; it also impacted SnapMirror relationships.

http://support.netapp.com/NOW/cgi-bin/bugrellist?bugno=768028

The workaround was to stop the redirect scanners from running, or you can disable free-space reallocation completely:

storage aggregate modify -aggregate <aggr_name> -free-space-realloc no_redirect
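A quick sketch of checking the setting before and after the change, so you can confirm which aggregates still have reallocation active (aggregate names are placeholders):

```shell
# Show the current free-space reallocation mode per aggregate
storage aggregate show -fields free-space-realloc

# Disable the redirect scanner on the aggregates involved in the transfer
storage aggregate modify -aggregate <aggr_name> -free-space-realloc no_redirect
```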



mark_schuren

No one?

Is anyone using SnapMirror XDP actually? In a 10gE environment?

What throughput do you see?
