Moving epsilon for certain manually initiated takeovers
Note: Although cluster formation voting can be modified by using the cluster modify -eligibility false command, you should avoid this except for situations such as restoring the node configuration or prolonged node maintenance. If you set a node to be ineligible, it stops serving SAN data until the node is reset to eligible and rebooted. NAS data access to the node might also be affected when the node is ineligible.
And what does "might be" mean? I read that as "nobody knows, try..."
Now the most important question (we still have to migrate three other nodes!) is this:
Assuming we have understood correctly that the order should be 1. migrate the LIFs and only then 2. set epsilon to false, is there an official answer or document with updated information confirming that this is the right procedure to also avoid interrupting NAS protocols?
As you've experienced service disruption at a client site, I would suggest you log a support case with our support centre and ask for clarification of the documentation and the correct steps to avoid an outage in the future.
I've brought this issue and this thread to the attention of the writer of that KB article.
Hi, thank you for the quick answer. That is what I should do, also because the customer needs an official position before letting us run the steps in the inverted order. Otherwise he will plan an outage.
Another experienced SE told me that the sequence should be inverted, also because, if you look at the sequence after the reboot in that KB, epsilon/eligibility is the first thing restored, so it should be the last thing changed, after aggregate relocation and the LIF move.
Just for clarification, what Giacomo is referring to is this:
Going through the steps in the KB, I would have done the epsilon and eligibility steps (Step 1 in the KB) right before the reboot (Step 8), *after* moving the aggregate and the LIFs away from the node to be worked on.
At that point in time it shouldn't disturb anything, since no user traffic should pass through this node's interfaces or disks.
What do you think?
(I'm a little unclear about the meaning of "NFS was restarted", but I have a feeling the above change in sequence should help)
Also, if you look at the revert steps, the KB first restores eligibility and HA failover and only at the end reverts aggregates and LIFs.
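To make the proposed ordering concrete, here is a rough Python sketch. The step labels are my own descriptions, not ONTAP commands and not the KB's wording, and the relative order of the aggregate and LIF steps is only an assumption. It just illustrates the idea that the epsilon/eligibility change comes last before the reboot, and is therefore undone first afterwards.

```python
# Hypothetical step labels for illustration only; these are NOT ONTAP commands.
# The relative order of the first two steps is an assumption.
PROPOSED_ORDER = [
    "relocate aggregates away from the node",
    "migrate LIFs away from the node",
    "move epsilon and set eligibility false",
    "reboot the node",
]

REBOOT = "reboot the node"
ELIGIBILITY = "move epsilon and set eligibility false"


def revert_order(steps):
    """Undo the steps in reverse order, skipping the reboot (nothing to undo)."""
    return [f"undo: {s}" for s in reversed(steps) if s != REBOOT]


def eligibility_changed_last(steps):
    """True if the epsilon/eligibility step sits directly before the reboot."""
    return steps.index(ELIGIBILITY) == steps.index(REBOOT) - 1


if __name__ == "__main__":
    for step in PROPOSED_ORDER:
        print(step)
    for step in revert_order(PROPOSED_ORDER):
        print(step)
```

With this ordering the revert list starts with the epsilon/eligibility step and ends with the aggregates, which mirrors the way the KB's revert steps restore eligibility first and move aggregates and LIFs back only at the end.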