ONTAP Discussions

Intercluster XDP (SnapVault) performance over 10G

mark_schuren

Hi all,

I'm having performance issues with Snapmirror XDP relationships between two clusters (primary 4-node 3250, secondary 2-node 3220, all running cDOT 8.2P5).

The replication LIFs on both sides are pure 10G (private replication VLAN), flat network (no routing), using jumbo frames (however, I also tested without jumbo and the problem persists).

The vaulting works in general, but average throughput for a single node/relationship never goes beyond 1 Gbit/s - most of the time it is even much slower (300-500 Mbit/s or less).

I verified that neither the source node(s) nor the destination node(s) are CPU or disk bound during the transfers (at least not all of the source nodes).

I also verified the SnapMirror traffic is definitely going through the 10G interfaces.

There is no compressed volume involved, only dedupe (on source).

The dedupe schedules are not within the same time window.

Also checked the physical network interface counters on switch and netapps, no errors / drops, clean.

However, the replication is SLOW, no matter what I try.

Customer impression is that it even got slower over time, e.g. throughput of a source node was ~1 Gbit/s when the relationship(s) were initialized (not as high as expected), and dropped to ~500 Mbit/s after some months of operation / regular updates...

Meanwhile the daily update (of all relationships) sums up to ~1.4 TB per day, and it takes almost the whole night to finish 😞
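A rough back-of-envelope (my arithmetic, not measured) of what that daily volume implies at the rates we're seeing:

```python
# Back-of-envelope: hours needed to move a ~1.4 TB nightly update at
# various sustained transfer rates. My arithmetic, decimal terabytes.
TB = 10**12

def transfer_hours(payload_bytes: float, rate_mbit_s: float) -> float:
    """Hours needed to move payload_bytes at a sustained rate in Mbit/s."""
    rate_bytes_s = rate_mbit_s * 1e6 / 8
    return payload_bytes / rate_bytes_s / 3600

for rate in (500, 1000, 5000):  # Mbit/s
    print(f"{rate:>5} Mbit/s -> {transfer_hours(1.4 * TB, rate):.1f} h")
```

At the observed ~500 Mbit/s the nightly window is nearly consumed; at anything close to line rate it would finish in well under an hour.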

So the question is: how to tune that?

Is anyone having similar issues regarding Snapmirror XDP throughput over 10Gbit Ethernet?

Are there any configurable parameters (network compression? TCP win size? TCP delayed ack's? Anything I don't think of?) on source/destination side?

Thankful for all ideas / comments,

Mark

18 REPLIES

bobshouseofcards

Can't compare to doing XDP relationships over 10G, but I can share my experience running standard Snapmirror (DP) over long distance 10G (1000 mile replication, latency 24ms round trip). 

 

My sources are two 4-node clusters - "small" (2x6240,2x6220) and "big" (2x6280, 2x6290) both replicating to a single 4 node (4x8060) cluster at the target.  The "small" source cluster is mostly high capacity disks (3/4TB), the "big" source cluster is all performance disk (600/900GB).  All replications are to capacity disk (4TB) at the destination end.

 

My prep test was to run volume moves internally on both source clusters.  Volume move for all intents is an internal snapmirror (with obvious bonuses for keeping it live all the time).  I used these tests to set expectations of how fast a snapmirror might function.  At best my small cluster would go just under 1Gbps for volume moves and my big cluster would go closer to 1.5Gbps.  These set the limits for the single-replication speed I'd expect.

 

For the replications, I maxed out the WAN receive buffers ahead of time (because of the 24ms latency in the network):

 

network connections*> options buffer show
Service     Layer 4 Protocol Network Receive Buffer Size (KB) Auto-Tune?
----------- ---------------- ------- ------------------------ ----------
ctlopcp     TCP              WAN     7168                     true
ctlopcp     TCP              LAN     256                      false
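Rough arithmetic (mine, not a NetApp sizing rule) for why the WAN buffer matters at this distance: the bandwidth-delay product tells you how much data must be in flight to keep the link busy. 7168 KB is not enough for a lone session to fill 10G at 24 ms, but it comfortably covers a single transfer at the ~1.5 Gbps per-transfer ceiling seen here, so parallel transfers are what fill the pipe:

```python
# Bandwidth-delay product: the receive window needed to keep a link
# full at a given round-trip time. Numbers match the post above:
# 10 Gbit/s link, 24 ms RTT, 7168 KB WAN receive buffer.
def bdp_kb(link_gbit_s: float, rtt_ms: float) -> float:
    """Bytes in flight needed to fill the pipe, in KB."""
    bits_in_flight = link_gbit_s * 1e9 * (rtt_ms / 1e3)
    return bits_in_flight / 8 / 1024

print(f"BDP for full 10G @ 24 ms RTT:      {bdp_kb(10, 24):.0f} KB")
print(f"BDP per transfer @ 1.5 Gbit/s:     {bdp_kb(1.5, 24):.0f} KB")  # fits in 7168 KB
```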

 

I've found that transfers from capacity disk are very much limited by disk speeds.  I control all replications with scripts because the standard schedules don't work well enough to keep the pipe at maximum speed - we replicate updates for about 1800 volumes daily.  For the small cluster, I limit concurrent replications to 3 per aggregate, which seems to be as fast as the aggregate will go before disk limitations slow down individual transfers.  The control script starts updates and monitors every 5 minutes to start new replications in each aggregate.  Even so - the best the "small" cluster has ever achieved is about 5.5Gbps over time, and individual transfers are always less than 1Gbps.  The best I've ever seen on an individual transfer is about 700Mbps.

 

The large cluster just fires on usual schedules without regard for aggregates.  Nodes don't seem to sneeze at it much - the large cluster can easily fill the 10G pipe with multiple transfers, though individual transfers max out near the 1.5Gbps mark similar to the internal move.

 

 

 

We are facing a similar problem: basically SnapMirror/SnapVault is slow, even on 1 Gbps links, with speeds reaching only 10-15% of the available bandwidth. In our troubleshooting, we have found that the common ground for this problem to occur is that the intercluster LIF(s) are on etherchannel ports (ifgrp) that are using port-based load balancing. When using IP-based load balancing (which is the default when creating ifgrps), or even working active-passively, we are not seeing this performance degradation. Furthermore, if we do use port-based load balancing on the source controller, yet disable all but one physical port, SnapMirror/SnapVault throughput is very good. Note that SnapMirror/SnapVault in cDOT tends to open a multitude of TCP sessions between source and destination system.

 

So I am wondering if any of you facing SnapVault performance issues on a 10 Gbps network are using port-based load balancing, by any chance?
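To illustrate why the load-balancing mode matters for the many TCP sessions a single relationship opens between one LIF pair: with IP-based hashing all sessions share the same source/destination addresses and so land on one ifgrp member, while port-based hashing spreads them across members. A toy model (the hash functions here are simplified stand-ins, NOT ONTAP's actual algorithm; port 11104 is the intercluster ctlopcp port):

```python
# Toy model of ifgrp member selection for the parallel TCP sessions a
# single SnapMirror relationship opens between one LIF pair.
# Hash functions are simplified stand-ins, not ONTAP's algorithm.
SRC_IP, DST_IP = "10.230.4.33", "10.230.4.41"
MEMBERS = 2  # physical ports in the ifgrp

def ip_key(src_ip: str, dst_ip: str) -> int:
    # toy hash over addresses only: sum of all octets
    return sum(int(octet) for octet in (src_ip + "." + dst_ip).split("."))

def ip_based(src_ip: str, dst_ip: str) -> int:
    return ip_key(src_ip, dst_ip) % MEMBERS

def port_based(src_ip: str, dst_ip: str, sport: int, dport: int) -> int:
    return (ip_key(src_ip, dst_ip) + sport + dport) % MEMBERS

# eight parallel sessions between the same LIF pair, varying source ports
sessions = [(SRC_IP, DST_IP, 50000 + i, 11104) for i in range(8)]
ip_links = {ip_based(s, d) for s, d, _, _ in sessions}
port_links = {port_based(*sess) for sess in sessions}
print(f"IP-based:   {len(ip_links)} ifgrp member(s) carry the traffic")
print(f"port-based: {len(port_links)} ifgrp member(s) carry the traffic")
```

The spreading across members under port-based hashing is exactly where we see the degradation, which would fit segment reordering between physical links.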

NetApp_SEAL

Hi Mark,

As you saw my other posts in a similar thread...

You say it's a flat network (no routing)...Ok. But with SnapMirror, the routing groups / routes themselves on the clusters can be tricky (as I have discovered in a weird issue I was previously having).

Can you output the following?

cluster::> network routing groups show

cluster::> network routing groups route show

Can you output a trace from a source IC LIF to a destination IC LIF as well, please?

Also - check this KB (if you have not already): https://library.netapp.com/ecmdocs/ECMM1278318/html/onlinebk/protecting/task/t_oc_prot_sm-adjust-tcp-window-size.html

Hope I can attempt to help out!

Thanks!
Trey

mark_schuren

Hi, thanks for your info.

I have no gateway (no route) defined in my intercluster routing groups - because these IC lifs are on a flat subnet (low latency) and should only communicate within this local network.

The strange thing is, if I migrate my SOURCE IC LIF to a 1gig port, the traffic flows with constant 110MByte/s.

As soon as I migrate it to a 10gig port, traffic is very peaky and never constantly above 100 MByte/sec. On average it is even slower... Same network, same Nexus, same transfer...

I have played a lot with window sizes already, but I think in cDOT (at least in the latest releases) this is configured within the cluster scope, not node-scope, as stated here:

https://kb.netapp.com/support/index?page=content&id=1013603

At the moment I have:

ucnlabcm04::*> net connections options buffer show (/modify)

  (network connections options buffer show)

Service      Layer 4 Protocol  Network  Receive Buffer Size (KB) Auto-Tune?

-----------  ----------------  -------  ------------------------ ----------

ctlopcp      TCP               WAN      512                      false

ctlopcp      TCP               LAN      256                      false

The default ("WAN" = intercluster) is 2048 (2MB receive window size) with auto-tuning enabled for WAN.

I've played with different values on the destination. Changing the values may require resetting all TCP connections, e.g. by rebooting or setting the IC LIFs down for a while.
The configured window sizes are then definitely in effect according to netstat -n.

However my problem remains: throughput is flaky as soon as the source IC LIFs are on 10G...
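A quick sanity check (my arithmetic, with assumed LAN round-trip times) suggests the window size is not what caps these transfers: even the reduced 512 KB window sustains multiple Gbit/s at sub-millisecond RTT.

```python
# Window-limited TCP throughput: window / RTT. On a flat 10G segment
# RTT is typically well under 1 ms (assumed values below), so even a
# 512 KB receive window should sustain several Gbit/s per session.
def max_throughput_gbit(window_kb: float, rtt_ms: float) -> float:
    """Upper bound on a single TCP session's rate, in Gbit/s."""
    return (window_kb * 1024 * 8) / (rtt_ms / 1e3) / 1e9

for rtt in (0.2, 0.5, 1.0):  # plausible LAN round-trip times, ms
    print(f"512 KB window @ {rtt} ms RTT: {max_throughput_gbit(512, rtt):.1f} Gbit/s")
```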

Here is info about my source cluster (at the moment it consists only of nodes 03 and 04) - I am testing with a single volume from node 03 as the source node.

ucnlabcm01::> net port show -role intercluster

  (network port show)

                                      Auto-Negot  Duplex     Speed (Mbps)

Node   Port   Role         Link   MTU Admin/Oper  Admin/Oper Admin/Oper

------ ------ ------------ ---- ----- ----------- ---------- ------------

ucnlabcm01-03

       e0b    intercluster up    1500  true/true  full/full   auto/1000

       e2b    intercluster up    1500  true/true  full/full   auto/10000

ucnlabcm01-04

       e0b    intercluster up    1500  true/true  full/full   auto/1000

ucnlabcm01::> net int show -role intercluster

  (network interface show)

            Logical    Status     Network            Current       Current Is

Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home

----------- ---------- ---------- ------------------ ------------- ------- ----

ucnlabcm01-03

            repl1        up/up    10.230.4.33/16     ucnlabcm01-03 e2b     true

ucnlabcm01-04

            repl1        up/up    10.230.4.34/16     ucnlabcm01-04 e0b     true

ucnlabcm01::> net routing-groups show

Vserver   Group     Subnet          Role         Metric

--------- --------- --------------- ------------ -------

ucnlabcm01

          c10.230.0.0/16

                    10.230.0.0/16   cluster-mgmt      20

ucnlabcm01-03

          c169.254.0.0/16

                    169.254.0.0/16  cluster           30

          i10.230.0.0/16

                    10.230.0.0/16   intercluster      40

          n10.210.0.0/16

                    10.210.0.0/16   node-mgmt         10

ucnlabcm01-04

          c169.254.0.0/16

                    169.254.0.0/16  cluster           30

          i10.230.0.0/16

                    10.230.0.0/16   intercluster      40

          n10.210.0.0/16

                    10.210.0.0/16   node-mgmt         10

...

And here is my destination cluster:

ucnlabcm04::*> net port show -role intercluster

  (network port show)

                                      Auto-Negot  Duplex     Speed (Mbps)

Node   Port   Role         Link   MTU Admin/Oper  Admin/Oper Admin/Oper

------ ------ ------------ ---- ----- ----------- ---------- ------------

ucnlabcm04-01

       e1a-230

                intercluster up    1500  true/true  full/full   auto/10000

       e2d    intercluster up    1500  true/true  full/full   auto/1000

ucnlabcm04::> net int show -role intercluster

  (network interface show)

            Logical    Status     Network            Current       Current Is

Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home

----------- ---------- ---------- ------------------ ------------- ------- ----

ucnlabcm04-01

            repl1        up/up    10.230.4.41/16     ucnlabcm04-01 e1a-230 true

ucnlabcm04::> routing-groups show -vserver ucnlabcm04-01

  (network routing-groups show)

          Routing

Vserver   Group     Subnet          Role         Metric

--------- --------- --------------- ------------ -------

ucnlabcm04-01

          i10.230.0.0/16

                    10.230.0.0/16   intercluster      40

          n10.210.0.0/16

                    10.210.0.0/16   node-mgmt         10

ucnlabcm04::> routing-groups route show -vserver ucnlabcm04-01

  (network routing-groups route show)

          Routing

Vserver   Group     Destination     Gateway         Metric

--------- --------- --------------- --------------- ------

ucnlabcm04-01

          n10.210.0.0/16

                    0.0.0.0/0       10.210.254.254  10

The destination node is a dedicated replication destination

Source LIF running on 1G

ucnlabcm04::> node run -node local sysstat -x 1

CPU    NFS   CIFS   HTTP   Total     Net   kB/s    Disk   kB/s    Tape   kB/s  Cache  Cache    CP  CP  Disk   OTHER    FCP  iSCSI     FCP   kB/s   iSCSI   kB/s

                                       in    out    read  write    read  write    age    hit  time  ty  util                            in    out      in    out

59%      0      0      0      13  117390   2903    1340  84500       0      0   >60     92%   50%  :    50%      13      0      0       0      0       0      0

64%      0      0      0       0  115171   2866    5680 161280       0      0   >60     94%   85%  H    70%       0      0      0       0      0       0      0

63%      0      0      0       0  119723   2974    5099 158230       0      0   >60     94%   78%  Hf   67%       0      0      0       0      0       0      0

66%      0      0      0       1  116859   2902    5396 160052       0      0   >60     93%   69%  Hf   70%       1      0      0       0      0       0      0

81%      0      0      0    5200  113367   2806    6492 127552       0      0   >60     93%   57%  Hf   55%    5200      0      0       0      0       0      0

86%      0      0      0    7007  115025   2847    5634 107421       0      0   >60     93%   52%  Hf   49%    7007      0      0       0      0       0      0

86%      0      0      0    6533  117410   2925    2653  86653       0      0   >60     92%   70%  :    62%    6529      4      0       0      0       0      0

91%      0      0      0    6943  117531   2920    4888 160884       0      0   >60     93%   71%  H    65%    6943      0      0       0      0       0      0

71%      0      0      0    2056  109902   2863    4931 161846       0      0   >60     94%   69%  H    64%    2056      0      0       0      0       0      0

64%      0      0      0     147  115349   2855    7048 133812       0      0   >60     92%   50%  Hf   47%     147      0      0       0      0       0      0

61%      0      0      0      14  114891   2844    6315 119106       0      0   >60     92%   68%  Hf   55%      14      0      0       0      0       0      0

63%      0      0      0       0  118481   2928    3976  80359       0      0   >60     91%   64%  Hs   51%       0      0      0       0      0       0      0

61%      0      0      0       0  117321   2923    2845 149546       0      0   >60     93%   67%  :    66%       0      0      0       0      0       0      0

62%      0      0      0       0  117185   2913    3992 160355       0      0   >60     93%   69%  H    61%       0      0      0       0      0       0      0

63%      0      0      0       0  115966   2886    6422 152112       0      0   >60     93%   66%  Hf   56%       0      0      0       0      0       0      0

64%      0      0      0      13  118045   2938    7341 130641       0      0   >60     93%   63%  Hf   55%      13      0      0       0      0       0      0

62%      0      0      0       0  117927   2933    4692 136488       0      0   >60     93%   72%  Hf   60%       0      0      0       0      0       0      0

62%      0      0      0     109  119707   2967    2830  95054       0      0   >60     92%   51%  Hs   58%     109      0      0       0      0       0      0

63%      0      0      0     305  117229   3072    2245 129655       0      0   >60     94%   61%  :    62%     305      0      0       0      0       0      0

Source LIF Running on 10G

ucnlabcm04::> node run -node local sysstat -x 1

CPU    NFS   CIFS   HTTP   Total     Net   kB/s    Disk   kB/s    Tape   kB/s  Cache  Cache    CP  CP  Disk   OTHER    FCP  iSCSI     FCP   kB/s   iSCSI   kB/s

                                       in    out    read  write    read  write    age    hit  time  ty  util                            in    out      in    out

14%      0      0      0     104   22012    588     836  39936       0      0   >60     90%  100%  :f   33%     104      0      0       0      0       0      0

16%      0      0      0     166   20780    564     995  35692       0      0   >60     91%  100%  :f   34%     166      0      0       0      0       0      0

21%      0      0      0       2   38318   1021     788  29900       0      0   >60     90%   80%  :    36%       2      0      0       0      0       0      0

27%      0      0      0       0   33097    886    1476      0       0      0   >60     87%    0%  -     6%       0      0      0       0      0       0      0

19%      0      0      0     143   22712    632   10713  52947       0      0   >60     90%   84%  Hf   43%     143      0      0       0      0       0      0

14%      0      0      0      20    2645     70     964  50608       0      0   >60     87%  100%  :f   40%      20      0      0       0      0       0      0

  8%      0      0      0       1      17      1    1176  52596       0      0   >60     90%  100%  :v   50%       1      0      0       0      0       0      0

18%      0      0      0       1   30655    795     880     12       0      0   >60     88%    5%  :    11%       1      0      0       0      0       0      0

20%      0      0      0       1   23746    663    1144      0       0      0   >60     88%    0%  -     8%       1      0      0       0      0       0      0

23%      0      0      0       3   29419    762    1340      0       0      0   >60     88%    0%  -     9%       3      0      0       0      0       0      0

30%      0      0      0      16   81171   2189    8796  41272       0      0   >60     92%   36%  Hf   31%      16      0      0       0      0       0      0

19%      0      0      0       1   23010    610    1168 114004       0      0   >60     90%   83%  :    67%       1      0      0       0      0       0      0

21%      0      0      0       1   22341    630    1412      0       0      0   >60     88%    0%  -    13%       1      0      0       0      0       0      0

17%      0      0      0       0   14863    422     932      0       0      0   >60     88%    0%  -     8%       0      0      0       0      0       0      0

27%      0      0      0       0   19099    499    1492      0       0      0   >60     87%    0%  -     7%       0      0      0       0      0       0      0

21%      0      0      0      16   29642    765    6504  94544       0      0   >60     91%   90%  Hf   55%      16      0      0       0      0       0      0

16%      0      0      0       1   19368    568    3080  61488       0      0   >60     89%   96%  :    63%       1      0      0       0      0       0      0

21%      0      0      0       0    1121     29    1363      0       0      0   >60     86%    0%  -     8%       0      0      0       0      0       0      0

26%      0      0      0       0   65267   1753    1380      0       0      0   >60     89%    0%  -    13%       0      0      0       0      0       0      0

18%      0      0      0     143   17263    490    1036      0       0      0   >60     88%    0%  -     6%     143      0      0       0      0       0      0

I just don't get it.

NetApp_SEAL

And as a follow-up for additional information (after validating with another resource):

- Routing groups define rules for all traffic, not just layer 3

- Routing group routes are specific to layer 3

- The default gateway of a cluster is defined by the cluster management LIF

- The default gateway of a node is defined by the node management LIF

So here, if you have the option of moving the cluster management LIF to another VLAN (can it go on the same network as the node management LIFs?), that might be a worthy step in testing.

mark_schuren

Here is all info from my source cluster (info regarding data-vservers was stripped):

ucnlabcm01::> routing-groups show

  (network routing-groups show)

          Routing

Vserver   Group     Subnet          Role         Metric

--------- --------- --------------- ------------ -------

ucnlabcm01

          c10.230.0.0/16

                    10.230.0.0/16   cluster-mgmt      20

ucnlabcm01-03

          c169.254.0.0/16

                    169.254.0.0/16  cluster           30

          i10.230.0.0/16

                    10.230.0.0/16   intercluster      40

          n10.210.0.0/16

                    10.210.0.0/16   node-mgmt         10

ucnlabcm01-04

          c169.254.0.0/16

                    169.254.0.0/16  cluster           30

          i10.230.0.0/16

                    10.230.0.0/16   intercluster      40

          n10.210.0.0/16

                    10.210.0.0/16   node-mgmt         10

ucnlabcm01::> routing-groups route show

  (network routing-groups route show)

          Routing

Vserver   Group     Destination     Gateway         Metric

--------- --------- --------------- --------------- ------

ucnlabcm01

          c10.230.0.0/16

                    0.0.0.0/0       10.230.254.254  20

ucnlabcm01-03

          n10.210.0.0/16

                    0.0.0.0/0       10.210.254.254  10

ucnlabcm01-04

          n10.210.0.0/16

                    0.0.0.0/0       10.210.254.254  10

ucnlabcm01::> run -node ucnlabcm01-03 route -gsn

Routing tables

Routing group: __default_grp

Internet:

Destination      Gateway            Flags     Refs     Use  Interface          

127.0.0.1        127.0.0.1          UH          0        0  lo                 

127.0.10.1       127.0.20.1         UHS         4  3072302  losk               

Routing group: ucnlabcm01-03_c169.254.0.0/16

Internet:

Destination      Gateway            Flags     Refs     Use  Interface          

169.254          link#3             UC          0        0  e1a                

169.254.77.92    90:e2:ba:3d:e3:14  UHL         3  2237583  lo                 

169.254.175.117  90:e2:ba:2b:38:b8  UHL        87 56775566  e1a                

169.254.200.34   90:e2:ba:2b:36:54  UHL         0    75520  lo                 

169.254.204.211  90:e2:ba:37:51:f4  UHL        93 52139103  e1a                

Routing group: ucnlabcm01-03_n10.210.0.0/16

Internet:

Destination      Gateway            Flags     Refs     Use  Interface          

default          10.210.254.254     UGS         0   352399  e0M                

10.210/16        link#13            UC          0        0  e0M                

10.210.1.7       0:a0:98:1a:82:bb   UHL         0     6624  e0M                

10.210.4.11      link#13            UHRL        0     3983  e0M                

10.210.104.14    0:a0:98:13:d2:4    UHL         0        1  e0M                

10.210.254.1     54:7f:ee:b9:22:bc  UHL         0        0  e0M                

10.210.254.2     54:7f:ee:bb:60:3c  UHL         0        0  e0M                

10.210.254.6     0:12:1:72:6e:ff    UHL         0      984  e0M                

10.210.254.41    0:a0:98:e7:3b:bc   UHL         0  1384574  e0M                

10.210.254.42    0:a0:98:e7:39:12   UHL         0  1330125  e0M                

10.210.254.254   0:0:c:9f:f0:d2     UHL         1        0  e0M                

Routing group: ucnlabcm01-03_i10.230.0.0/16

Internet:

Destination      Gateway            Flags     Refs     Use  Interface          

10.230/16        link#6             UC          0        0  e2b                

10.230.4.41      0:c0:dd:26:3:5c    UHL        28    36779  e2b                

10.230.254.1     54:7f:ee:b9:22:bc  UHL         0        0  e2b                

10.230.254.2     54:7f:ee:bb:60:3c  UHL         0        0  e2b 

Destination (ucnlabcm04 is a single node cluster):

ucnlabcm04::> routing-groups show

  (network routing-groups show)

          Routing

Vserver   Group     Subnet          Role         Metric

--------- --------- --------------- ------------ -------

ucnlabcm04

          c10.230.0.0/16

                    10.230.0.0/16   cluster-mgmt      20

ucnlabcm04-01

          i10.230.0.0/16

                    10.230.0.0/16   intercluster      40

          n10.210.0.0/16

                    10.210.0.0/16   node-mgmt         10

ucnlabcm04::> routing-groups route show

  (network routing-groups route show)

          Routing

Vserver   Group     Destination     Gateway         Metric

--------- --------- --------------- --------------- ------

ucnlabcm04

          c10.230.0.0/16

                    0.0.0.0/0       10.230.254.254  20

ucnlabcm04-01

          n10.210.0.0/16

                    0.0.0.0/0       10.210.254.254  10

ucnlabcm04::> run -node ucnlabcm04-01 route -gsn

Routing tables

Routing group: __default_grp

Internet:

Destination      Gateway            Flags     Refs     Use  Interface          

127.0.0.1        127.0.0.1          UH          0        0  lo                 

127.0.10.1       127.0.20.1         UHS         4    20424  losk               

Routing group: ucnlabcm04-01_n10.210.0.0/16

Internet:

Destination      Gateway            Flags     Refs     Use  Interface          

default          10.210.254.254     UGS         0   119508  e0M                

10.210/16        link#11            UC          0        0  e0M                

10.210.254.1     54:7f:ee:b9:22:bc  UHL         0        0  e0M                

10.210.254.2     54:7f:ee:bb:60:3c  UHL         0        0  e0M                

10.210.254.6     0:12:1:72:6e:ff    UHL         0      116  e0M                

10.210.254.254   link#11            UHL         1        0  e0M                

Routing group: ucnlabcm04_c10.230.0.0/16

Internet:

Destination      Gateway            Flags     Refs     Use  Interface          

default          10.230.254.254     UGS         0  2011735  e1b-230            

10.230/16        link#20            UC          0        0  e1b-230            

10.230.3.21      0:50:56:b7:7e:85   UHL         0        0  e1b-230            

10.230.3.120     0:25:b5:0:1:bf     UHL         0     2984  e1b-230            

10.230.3.121     0:50:56:84:4:24    UHL         0       13  e1b-230            

10.230.3.126     0:50:56:84:5c:df   UHL         0        0  e1b-230            

10.230.3.129     0:50:56:84:1:33    UHL         0   351729  e1b-230            

10.230.3.134     0:50:56:84:72:86   UHL         0    98431  e1b-230            

10.230.3.151     0:50:56:84:24:6d   UHL         0      932  e1b-230            

10.230.3.152     0:50:56:84:24:6b   UHL         0      612  e1b-230            

10.230.81.201    0:50:56:b7:6c:a7   UHL         0        0  e1b-230            

10.230.81.202    0:50:56:b7:11:36   UHL         0        0  e1b-230            

10.230.254.1     54:7f:ee:b9:22:bc  UHL         0        0  e1b-230            

10.230.254.2     54:7f:ee:bb:60:3c  UHL         0        0  e1b-230            

10.230.254.254   0:0:c:9f:f0:e6     UHL         1        0  e1b-230            

Routing group: ucnlabclu04-01_i10.230.0.0/16

Internet:

Destination      Gateway            Flags     Refs     Use  Interface          

10.230/16        link#17            UC          0        0  e1a-230            

10.230.4.33      90:e2:ba:3d:e3:15  UHL        29 224063345  e1a-230            

10.230.4.34      0:a0:98:13:d1:ff   UHL        28    18292  e1a-230     

I don't see any gateway involved for any snapmirror transfer. The intercluster routing groups do not contain any gateway...

The cluster-management LIF is in the same "server" VLAN as the intercluster LIFs - the "device mgmt" VLAN is only for the "slow" 100MBit SP and e0M ports.

I *could* move it but I don't see a real reason behind it (according to the above outputs)...?

NetApp_SEAL

Ok, cool. So yeah, I get the gateway deal, and that's fine. Working as expected.

What PORT is the cluster management LIF currently on? I know it's on the same VLAN, but what port specifically?

Also - is there the possibility to move the IC LIFs to a 10 Gb ifgrp on the source (similar to how the destination is set up)?

mark_schuren

It is currently on port e0a. I just moved that to e2b (same port as IC lif) and started a new snapmirror relationship. Does not change anything in the behaviour.

However, when I put the IC LIF onto a 1GbE port (e0a or any other 1G port in the correct VLAN), the traffic immediately flows smoothly at 1GBit/s. As soon as I put it back on a 10G port, the traffic is peaky and on average BELOW 1 GBit/s...

We meanwhile have a second support case opened for this issue (for a customer site where management and intercluster are truly different VLANs). So I'm obviously not the only one.

However, thanks a lot for your inputs!

NetApp_SEAL

You're welcome. Indeed it seems like this is part of a larger issue that needs to be addressed.

Perhaps there are not enough folks trying to do SnapMirror / SnapVault over 10 Gb links yet?

Please keep me updated with your interactions with Support. I can assume that they're going to ask a LOT of what has already been covered in this thread, no?

I'll continue to test things on my end with another client and report back here with anything new.

mark_schuren

Oh yes they asked a LOT - I uploaded tons of traces...

But finally we narrowed it down - in my case my destination is losing ethernet frames!

There are frame drops within the destination NIC it seems - no matter which NIC I have the IC LIF sitting on...

These frames are seen on the Nexus, but not in a pktt on the destination, causing DUP ACKs and TCP retransmits (which also do never hit the destination properly)!

This is true for snapmirror traffic ONLY! Performance tests with NFS and iSCSI do not show this behaviour (no drops, no retransmits).

The issue is now at Netapp engineering for finding / fixing the root cause.

However, I have a very good workaround (which speeds up TCP retransmits massively):

1. On the SOURCE node(s) set options ip.tcp.rtt_min to a very low value (e.g. 10 or lower)

2. On the DESTINATION node(s) set options ip.tcp.delack.enable off

3. Reset all intercluster TCP connections (e.g. by down/up the destination's intercluster LIF(s), or reboot one affected node, or whatever helps resetting TCP connections)
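For reference, the three steps above in CLI form (node and LIF names are hypothetical placeholders; option names are as given above, but verify the exact syntax on your release before applying):

```shell
# Workaround sketch for the steps above. Node/LIF names are placeholders;
# these are diag-level nodeshell options -- verify on your release first.

# 1. On each SOURCE node: lower the minimum TCP retransmit timer.
cluster1::> node run -node src-node01 options ip.tcp.rtt_min 10

# 2. On each DESTINATION node: disable TCP delayed ACKs.
cluster2::> node run -node dst-node01 options ip.tcp.delack.enable off

# 3. Reset the intercluster TCP connections, e.g. by bouncing the
#    destination's intercluster LIF:
cluster2::> network interface modify -vserver dst-node01 -lif repl1 -status-admin down
cluster2::> network interface modify -vserver dst-node01 -lif repl1 -status-admin up
```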

Now I at least get constant throughput ~250MByte/sec, and am finally maxing out my disks on the destination - constantly!

Until the root cause is fixed, I can now definitely live with it

Might be interesting for others, so I'm posting my findings here. The issue has been observed when a FAS32xx is replicating to another FAS32xx cluster over 10G (low-latency network) only. Other combinations or models are probably unaffected.

MK

Thank you thank you thank you! 

 

I have been pulling my hair out for the past few weeks trying to figure this exact issue out.  We are replicating from a FAS8020 to a FAS3210 via 10G Appliance ports on a UCS FI.  Each filer is directly connected to the same FI and on a non-routed private VLAN.  Same exact issue.  The snapmirror would run like a dog on the 10G links, even though NFS was working without issue on the same ports.  Putting in your suggested tweaks has improved the replication from 50Mb/sec to 1900Mb/sec! 🙂

 

NetApp_SEAL

Lots of great info.


So - am I reading that correctly in that your Intercluster LIFs are on the same subnet as your cluster management LIF?

What interface are you using for the cluster management LIF?


Can you output the routing groups and routes for both clusters, please?

(I saw in the previous output that you included the routing groups for source cluster, but no routes. For the destination cluster, only the routing groups and routes for that one node are shown. Just trying to get a solid comparison, that's all).

Also - can you give the output for the following command?

node run -node <node> route -gsn

cDOT routing groups handle all the traffic regardless of whether you're working with layer 2 or layer 3. These rules apply based on the routing group configuration within the cluster and its nodes. It's only when you're working with layer 3 that you need a specific routing group route.

So, let's say your cluster management LIF does, in fact, happen to be on a 1 Gb port. Your cluster management LIF belongs to routing group c10.230.0.0/16. Your Intercluster LIFs belong to routing group i10.230.0.0/16. Those routing groups are most likely sharing the same default gateway, and that could lead to traffic routing over the cluster management LIF.

Also - am I reading the output correctly in that you seem to be replicating over different port configuration types? Your source shows the LIF on port e2b (assuming an access port) and your destination shows e1a-230, which is a tagged VLAN over an ifgrp (assuming a trunk port). I've seen a case before where this mismatch definitely leads to problems, but I haven't had a chance to dig back into it. The resolution then was to move the intercluster traffic back to dedicated 1 Gb ports for more consistent traffic flow. I hope to re-visit that specific issue later in the week and troubleshoot like I did with the one I posted about in the other thread.

I know that some of this might seem like grasping at straws, but can't hurt, right?

TWIELGOS2

We have been having this problem for months with plain old snapmirror.  We have a 10G connection between two different clusters, and snapmirror performance has been calculated at around 100Mb/sec - completely unacceptable.

We disabled reallocate per bug 768028, no help.  We disabled throttling, and that helped, but not enough - and as a debug flag, disabling throttling comes with a production performance cost.

We used this to disable the throttle:

    • node run local -command "priv set diag; setflag repl_throttle_enable 0;"
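For completeness, a hedged sketch of checking and reverting that flag. This assumes repl_throttle_enable behaves like other node-level diag flags (printflag to read, setflag to write) and is not persistent across reboots, so it would need re-checking after a takeover/giveback or upgrade:

```shell
# Disable the replication throttle (diag flag, per node)
node run local -command "priv set diag; setflag repl_throttle_enable 0"

# Verify the current value
node run local -command "priv set diag; printflag repl_throttle_enable"

# Restore the default behavior
node run local -command "priv set diag; setflag repl_throttle_enable 1"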

mark_schuren

Thanks for the tips.

I tried both settings (disable free-space realloc on source and destination aggrs, as well as setflag repl_throttle_enable 0 on source and destination nodes), but this did not make things better, maybe slightly but not really.

I meanwhile experimented a bit more, migrated all intercluster interfaces of my source nodes to gigabit ports (instead of vlan-tagged 10gig interfaces).

Although unexpected, this helped quite a lot: it doubled my overall throughput, even though I moved to slower NICs on the source side.
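For anyone wanting to try the same experiment, a sketch of what the LIF move might look like in 8.2 (LIF and port names are placeholders; intercluster LIFs are node-scoped, so repeat per node):

```shell
# Re-home an intercluster LIF onto a dedicated 1 Gb port
network interface modify -vserver <node_vserver> -lif <ic_lif> -home-node <node> -home-port e0a

# Send it back to its (new) home port
network interface revert -vserver <node_vserver> -lif <ic_lif>
```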

Next step is finally open a performance case

officeworks

It smells like flow control on the 10Gb links to me..

Ask the network guys if they see a lot of RX/TX pause packets on the switch ports. You may as well check for switch port errors while you're at it, or work with them to see if there are a lot of retransmits.
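On the cluster side, flow control can be checked and, if needed, disabled per port. A sketch, assuming the common recommendation of turning flow control off on 10GbE ports (coordinate with the switch side, and note the port should not be carrying a LIF while you change it):

```shell
# Show the current administrative flow control setting on the 10G port
network port show -node <node> -port e1b -fields flowcontrol-admin

# Disable flow control on the port (do the same on the switch port)
network port modify -node <node> -port e1b -flowcontrol-admin none
```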

This is the situation where I would have hoped NetApp shipped network performance tools like iperf, so we could validate infrastructure and throughput when a system is put in.

mark_schuren

Checked that already. Switch stats look clean (don't have them anymore).

Ifstat on netapp side looks good:

-- interface  e1b  (54 days, 8 hours, 12 minutes, 26 seconds) --

RECEIVE
Frames/second:    3690  | Bytes/second:    18185k | Errors/minute:       0
Discards/minute:     0  | Total frames:    30935m | Total bytes:     54604g
Total errors:        0  | Total discards:      0  | Multi/broadcast:     7
No buffers:          0  | Non-primary u/c:     0  | Tag drop:            0
Vlan tag drop:       0  | Vlan untag drop:     0  | Vlan forwards:       0
Vlan broadcasts:     0  | Vlan unicasts:       0  | CRC errors:          0
Runt frames:         0  | Fragment:            0  | Long frames:         0
Jabber:              0  | Bus overruns:        0  | Queue drop:          0
Xon:                 0  | Xoff:                0  | Jumbo:               0

TRANSMIT
Frames/second:    3200  | Bytes/second:    12580k | Errors/minute:       0
Discards/minute:     0  | Total frames:    54267m | Total bytes:       343t
Total errors:        0  | Total discards:      0  | Multi/broadcast: 78699
Queue overflows:     0  | No buffers:          0  | Xon:                 6
Xoff:              100  | Jumbo:            3285m | Pktlen:              0
Timeout:             0  | Timeout1:            0

LINK_INFO
Current state:       up | Up to downs:         2  | Speed:           10000m
Duplex:            full | Flowcontrol:       full

So there are some Xon/XOff packets, but very few.

The very same link (same node) carries NFS I/O at 500 MByte/sec (via a different LIF), so I don't think the interface itself has any problems. However, SnapMirror average throughput remains below 50 MByte/sec, no matter what I try.

officeworks

When we had vol move issues between cluster nodes (via 10Gb), we hit bug 768028 after opening a performance case. This may or may not be related to what you're seeing; it also impacted SnapMirror relationships.

http://support.netapp.com/NOW/cgi-bin/bugrellist?bugno=768028

The workaround was to stop the redirect scanners from running, or you can disable free-space reallocation completely:

storage aggregate modify -aggregate <aggr_name> -free-space-realloc no_redirect
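A quick sketch of checking the setting before and after the change, so you can confirm which aggregates still have reallocation active (aggregate names are placeholders):

```shell
# Show the current free-space reallocation mode per aggregate
storage aggregate show -fields free-space-realloc

# Disable the redirect scanner on the aggregates involved in the transfer
storage aggregate modify -aggregate <aggr_name> -free-space-realloc no_redirect
```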



mark_schuren

No one?

Is anyone using SnapMirror XDP actually? In a 10gE environment?

What throughput do you see?
