ONTAP Discussions

Time required for ping to resume after node takeover on FAS2520, cDOT 8.3.2P9

tee

Hi, I have a 2-node FAS2520 cluster running cDOT 8.3.2P9, with a cluster management LIF.

When I performed a node takeover, it took one and a half to two minutes for ping replies to resume.

That seems long to me. Is it typical?

Also, is there a published document on the time required for ping to resume during node takeover and giveback?

Thanks.

1 ACCEPTED SOLUTION

AlexDawson

In this instance, when the controller did the takeover and the cluster-mgmt LIF moved, the time it takes for connectivity to come back is determined by a forwarding timeout on your switches and routers.

For example, assume that node 1's e0M is plugged into switch port 0/17 and node 2's e0M into port 0/18.

Your router knows which MAC address is associated with the cluster-mgmt LIF (from its ARP cache), and your switch knows that this MAC address is on port 0/17. If you are on the same subnet as the LIF, your host (e.g., laptop, server, whatever) holds the ARP cache entry instead.

 

On failover, the NetApp system moves the LIF to node 2's e0M, which means the MAC address behind the LIF's IP changes. The router has to learn that the LIF is no longer at the old MAC, find the new MAC, and start sending traffic to it. Once the node now hosting the LIF sends out packets from the cluster-mgmt IP, the router learns the new MAC and traffic starts flowing again.
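To make the stale-entry behaviour concrete, here is a minimal toy model in Python. It is only a sketch: the `ArpCache` class, the IP and MAC values, and the 120-second timeout are all made up for illustration and are not taken from ONTAP or any particular router. It shows the two cases described above: the cache is corrected as soon as the new node is heard from, or the old entry lingers until it ages out.

```python
from dataclasses import dataclass, field


@dataclass
class ArpCache:
    """Toy model of a router's (or host's) ARP cache: IP -> (MAC, learned_at)."""
    timeout_s: float
    entries: dict = field(default_factory=dict)

    def learn(self, ip: str, mac: str, now: float) -> None:
        # Called when the device sees a packet or ARP announcement from `ip`.
        self.entries[ip] = (mac, now)

    def lookup(self, ip: str, now: float):
        # Return the cached MAC, or None if the entry is missing or has aged out.
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, learned_at = entry
        if now - learned_at >= self.timeout_s:
            del self.entries[ip]
            return None
        return mac


# All values below are hypothetical, chosen only for the illustration.
LIF_IP = "192.0.2.10"            # stand-in for the cluster-mgmt address
NODE1_MAC = "02:00:00:00:00:01"  # stand-in for node 1 e0M
NODE2_MAC = "02:00:00:00:00:02"  # stand-in for node 2 e0M

cache = ArpCache(timeout_s=120.0)
cache.learn(LIF_IP, NODE1_MAC, now=0.0)

# Takeover at t=10: the LIF is now on node 2, but the cache still points at node 1.
stale = cache.lookup(LIF_IP, now=10.0)

# Case A: node 2 is heard from at t=11, so the entry is corrected immediately.
cache_a = ArpCache(timeout_s=120.0, entries=dict(cache.entries))
cache_a.learn(LIF_IP, NODE2_MAC, now=11.0)
after_announce = cache_a.lookup(LIF_IP, now=11.0)

# Case B: no update is seen, so the stale entry survives until it ages out.
still_stale = cache.lookup(LIF_IP, now=119.0)
aged_out = cache.lookup(LIF_IP, now=120.0)
```

In case B, traffic keeps going to the old MAC for as long as the entry lives, which is why the timers on the network side end up driving the outage you see.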

 

So the short answer is, unfortunately, no: we don't have a document outlining this, because the time required depends on factors outside NetApp's control. Hope this helps all the same.
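Since there is no published figure, one option is to measure the gap in your own environment. The Python sketch below assumes you have already collected ping results as `(timestamp_seconds, reply_received)` pairs by whatever means you like; the function name `longest_outage` and the sample data are hypothetical, not output from a real system.

```python
def longest_outage(samples):
    """Return the longest gap, in seconds, between the last good reply
    before a run of losses and the first good reply after it.

    `samples` is a time-ordered list of (timestamp_seconds, reply_received).
    Losses before the first reply, or after the last one, are ignored
    because they have no good reply on both sides to measure against.
    """
    longest = 0.0
    last_ok = None
    in_outage = False
    for ts, ok in samples:
        if ok:
            if in_outage and last_ok is not None:
                longest = max(longest, ts - last_ok)
            last_ok = ts
            in_outage = False
        else:
            in_outage = True
    return longest


# Made-up example: one ping per second, replies lost from t=5 through t=99.
samples = [(t, True) for t in range(0, 5)]
samples += [(t, False) for t in range(5, 100)]
samples += [(100, True)]

outage_s = longest_outage(samples)  # 100 - 4 = 96 seconds
```

Running the same measurement from a host on the LIF's subnet and from one behind the router would help show which device is holding on to the old entry.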
