Active IQ Unified Manager Discussions

Correct Failover-Policy for a 4-node cluster

ubimet

Hello,

 

I need some help from you guys...

 

We currently have a larger FAS8000-series system in a two-node cluster setup with several SAS and SSD shelves.

At the moment, all data LIFs use the broadcast-domain-wide failover policy, with one port channel from each node in the failover group.

Failover works fine, e.g. during updates.

 

Now we want to add a two-node AFF HA pair to that cluster, so four nodes will be available.

 

What is the best failover policy for such a setup?

In my opinion, we should change from broadcast-domain-wide to sfo-partner-only, because during a failure or upgrade I don't want all traffic going over the cluster interconnect.

 

The NetApp default would be system-defined, so a LIF on FAS node A could fail over to a port on the AFF pair, right?

Is this really the right way? Would it add CPU usage or latency when data goes over the cluster interconnect? The FAS in particular is already under heavy CPU load...

Would sfo-partner-only have any drawbacks regarding system upgrades or failover?

 

Each node is connected with 4 ports to a Cisco Nexus vPC-enabled switch pair: 2 ports to switch A and 2 ports to switch B, with all 4 ports in one port channel (vPC).
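(For context, a layout like this can be inspected on the cluster with commands along these lines; the node name here is a placeholder:)

```
::> network port ifgrp show -node node-01
::> network port broadcast-domain show
```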

 

Thank you very much!

 

BR Florian

1 ACCEPTED SOLUTION

SpindleNinja

Are you able to give the AFF 4x10GbE for the cluster interconnect? (I didn't see a model mentioned; some just don't have the ports for it.)

 

 

 

I personally would leave them at broadcast-domain-wide. One downside of sfo-partner-only is that the SFO partner could already be pushing high throughput through its own ports, and LIFs failing over to the SFO partner's ports could push them over 100%, whereas with broadcast-domain-wide they'd fail over to whatever port the system considers best.

 

Also, do you mean batch upgrades, not rolling? There I could potentially see an issue, but a rolling upgrade only does one HA pair at a time before moving on to the next, whereas in a batch upgrade (8 nodes or more) you'll have more than one node failing over at once.

 


8 REPLIES

AlexSun0302

local-only is for cluster interconnect LIFs.

For normal data LIFs you need to set broadcast-domain-wide; if you set system-defined, failover will sometimes fail.

You can check with the command "network interface show -role data -failover".

Normally with broadcast-domain-wide the failover targets include ports on every node, while with system-defined only a few nodes are listed.
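The policy and failover group can also be checked per LIF (the vserver and LIF names below are placeholders):

```
::> network interface show -role data -failover
::> network interface show -vserver svm1 -lif nfs_lif1 -fields failover-policy,failover-group
```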

SpindleNinja

You could set the failover policy to "sfo-partner-only" if you wanted to keep failover within the same HA pair.
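A change like that would be a per-LIF modification along these lines (the vserver and LIF names are placeholders):

```
::> network interface modify -vserver svm1 -lif nfs_lif1 -failover-policy sfo-partner-only
```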

 

However, I've found that the cluster does a pretty good job of failing LIFs over to the best available port when set to broadcast-domain-wide.

ubimet

Hi,

 

thank you very much for the answers.

 

The cluster interconnects on the FAS nodes are 4x 10Gbit and on the AFF 2x 10Gbit.

The workload on the AFF will be NFS only, and all the load from the FAS will be moved over to the new AFF system; the older FAS will then be used for less performance-intensive tasks like backups.

On the AFF we will host our VMs, and there will be a lot of them.

 

That's why I'm asking whether it would be better to stay local on an HA pair when LIFs fail over. I don't know if the FAS will be able to take the extra load from the AFF in the future, or at least to forward the data through its network stack and the interconnect.

At the moment the FAS is at about 70% CPU on both nodes, sometimes more.

 

@SpindleNinja

Will there be any drawbacks to using "sfo-partner-only" when I do updates?

I'm asking because I read somewhere that with anything other than "system-wide" I could run into trouble doing a rolling upgrade or a non-disruptive upgrade.

 

BR Florian


ubimet

Hi,

 

thank you for your answer!

 

Unfortunately, I can't add more ports to the cluster interconnect.

 

Actually, network throughput was never an issue with the old FAS, so you would recommend staying at broadcast-domain-wide?

Most issues were CPU- or disk-related.

 

No, it was really a rolling upgrade; that's why I was so confused...

We don't have enough nodes in the cluster to do a batch upgrade.

 

BR Florian

SpindleNinja

That's usually what I leave them at. Or the system default.

ubimet

Hi,

 

thank you very much! You helped me a lot.

 

BR Florian

JGPSHNTAP

It's hard to give a right answer to this because you don't tell us what type of workload you are putting on the SSDs.

 

The cluster backbone should be 10GbE, so unless you are going to crush your AFFs, I wouldn't worry about leaving it at broadcast-domain-wide.
