2017-08-03 01:42 PM
Scenario: we have a 4-node cluster. We are temporarily adding 2 additional nodes, migrating content, and decommissioning 2 nodes.
The System Admin Guide says the following about removing a node:
If the node you want to remove is the current master node, reboot the node by using the system node reboot command to enable another node in the cluster to be elected as the master node. The master node is the node that holds processes such as mgmt, vldb, vifmgr, bcomd, and crs.
Node 2 is the master node and is one of the nodes I'm removing. Of course, I don't want to reboot it and have node 1 take over as master, since node 1 is also being removed. Should I remove node 1, reboot node 2, then remove node 2? Why does the guide recommend rebooting rather than failover / giveback? Any insights will be appreciated!
2017-08-03 10:00 PM
You could try using cluster modify -node node1 -eligibility false at the advanced privilege level. Please note that the node will not serve data after this. This will then change the master.
Ensure epsilon is on node3 or node4. If it is not, run cluster modify -node Node1 -epsilon false, then cluster modify -node Node3 -epsilon true, both in advanced mode.
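A rough sketch of checking and moving epsilon as described above. It assumes epsilon currently sits on node1 and should move to node3; both node names are placeholders, and the lines starting with # are annotations rather than input. Confirm the syntax against the command reference for your ONTAP release.

```
# Assumption: epsilon is on node1 and should end up on node3 (placeholder names).
set -privilege advanced
# Show which node currently holds epsilon.
cluster show -epsilon *
# Clear epsilon on the old holder, then set it on a node that is staying.
cluster modify -node node1 -epsilon false
cluster modify -node node3 -epsilon true
# Verify the result.
cluster show -epsilon *
```

Only one node should report epsilon as true after the change.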
2017-08-03 11:51 PM
Why does it recommend rebooting rather than failover / giveback? Any insights will be appreciated!
I don't see a clear answer for that. In my opinion, under normal circumstances a reboot would cause the node's resources to be taken over by the partner or by other nodes in the cluster, depending on the resource type and failover settings. On the other hand, a takeover will do the same thing and cause the taken-over node to reboot itself. However, if you use the reboot command, you have the liberty of specifying the "-inhibit-takeover" option if you decide to do so.
2017-08-04 06:00 AM
Thank you for your comments. I was planning to move epsilon in the way you mentioned, but it's the "master node" functions that have me stymied.
Your idea to change eligibility to false on node 1 is interesting. I was actually planning to fully remove node 1 from the cluster using "cluster unjoin". I suppose if I do that and THEN reboot node 2, the "master node" functions wouldn't migrate to node 1 since node 1 would be out of the cluster.
As usual, I'm just seeing a lot of gaps in the NetApp documentation that leave a lot of questions. Thank you for your ideas!
Thank you for your thoughts. Using inhibit-takeover is a good idea; I hadn't thought of that.
I'm starting to think removing node 1, rebooting node 2 with inhibit-takeover, and then removing node 2 is the way to go. Would love to hear any additional feedback!
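A rough sketch of that sequence, not a verified procedure. It assumes node1 and node2 (placeholder names) have already been emptied of data and LIFs, that epsilon is already on a node that is staying, and that the release in use still provides cluster unjoin. Lines starting with # are annotations rather than input.

```
# Assumption: node1 and node2 are placeholder names for the nodes being retired.
set -privilege advanced
# See which node is currently master for the mgmt, vldb, vifmgr, bcomd, and crs rings.
cluster ring show
# Remove node 1 first so it cannot be elected master.
cluster unjoin -node node1
# Reboot node 2 without takeover so another node is elected master.
system node reboot -node node2 -inhibit-takeover true
# Confirm the master has moved to a node that is staying before continuing.
cluster ring show
cluster unjoin -node node2
```

Checking cluster ring show before the final unjoin is the step that confirms the reboot actually moved the master role.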