ONTAP Hardware

Powering up a failed controller on its partner under hardware failure

lmunro_hug

Hi,

Can anyone confirm how you would bring up a controller on its partner node if a graceful takeover did not take place? The scenario is as follows.

You have two controllers in an HA pair, controller_A and controller_B. You need to do some maintenance or a system move that requires both controllers to be shut down. While working on controller_B you damage its hardware, and it no longer gets through POST when powered on. Controller_A works fine and boots successfully, so the question is: how do you bring up controller_A's failed partner (controller_B) on controller_A? Does this happen automatically once controller_A stops receiving a heartbeat from controller_B?

If not, can you run partner and then boot_ontap from controller_A, or something similar?

Many Thanks
Luke

1 ACCEPTED SOLUTION

scottgelb

I was going to try this on the FAS3240AE in our lab, but it has UCS boot LUNs on both nodes and our other SEs wouldn't like me halting a node. Instead I got hold of an old FAS2020A running ONTAP 7.3 and tried it there. Andrey was right: cf forcetakeover does bring the partner node up. See the console output below. I did a halt -f on the partner node (node2) to simulate a node that didn't come up. cf takeover and cf takeover -f were both refused, but cf forcetakeover worked. I then booted node2, it came up waiting for giveback, and I was able to cf giveback.

I wasn't sure about this until I tested it. Glad we have this community to learn and relearn what we forget.

node1> cf status
node2 may be down, takeover disabled because of reason (partner halted in notakeover mode)
node1 has disabled takeover by node2 (interconnect error)
VIA Interconnect is down (link down).

node1> cf takeover
cf: takeover cannot be performed because of reason (partner halted in notakeover mode)

node1> cf takeover -f
cf: takeover cannot be performed because of reason (partner halted in notakeover mode)

node1> cf forcetakeover
cf forcetakeover may lead to data corruption; really force a takeover? y
cf: forcetakeover initiated by operator

node1(takeover)>
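
The capture above ends at the takeover prompt. The remaining steps described in the post (booting node2, then giving back) might look roughly like the sketch below. Command output is omitted because it was not captured, and the lines starting with # are annotations, not console text. Note the warning in the forcetakeover prompt itself: the command may lead to data corruption.

```text
# On node2's console, once the hardware is usable again: boot as usual
# (for example with boot_ontap from the boot loader prompt) and let it
# come up waiting for giveback.

# On node1, still in takeover mode: check state, then give back.
node1(takeover)> cf status
node1(takeover)> cf giveback
```

Running cf status before the giveback is an assumed extra check, not something shown in the original test.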

