2013-07-18 05:04 PM
I need assistance from your experts. We have a FAS2040 set up in an active/active configuration, and each controller has its own disks and storage assigned. This is strictly a CIFS/NFS environment without block or FC. Last week controller 2 died on us, and we are currently in takeover mode on controller 1. I purchased a refurbished controller this week to replace controller 2. I took controller 2 out of the chassis, removed the CF boot card, NVRAM battery, and SFP module, and put them in the new controller that I purchased. I then installed the new controller in the storage system. I interrupted the controller with Ctrl+C and got into the BMC shell via the console. I issued BMC config and noticed that the BMC does not have the settings of my old controller, so I couldn't connect to the BMC shell via SSH. So I went ahead and updated the BMC IP, gateway, etc., but was unable to update the controller name. I can now connect to the BMC interface, but the password I have on my system doesn't work with the new controller via the BMC shell.
I thought it was supposed to boot from the CF card and load all the configuration from my dead controller onto the new controller, and that all I would have to do is assign the disks to the new system ID, but apparently that is not the case. So right now the new controller is in the chassis and powered up but doesn't have the correct configuration. I have the controller sitting at the boot loader.
What is the correct step-by-step procedure to get the new controller up and running with the configuration from the dead controller, without wiping out the existing data and configuration on my NetApp, and to take ownership of the disks that belonged to the dead controller?
So here is a quick snapshot of my system state. Controller 1 is currently in takeover mode. Controller 2 is in the system with the CF card from my old controller, but it has not booted up and does not have the correct configuration. Controller 2 is at the LOADER-B prompt.
Please help, as NetApp's instructions for replacing a FAS20xx controller module are either outdated or incorrect.
2013-07-19 01:04 AM
The BMC is synchronized by Data ONTAP when it boots, so you need to complete the controller replacement and perform a giveback to allow Data ONTAP to boot on the replacement controller. I reviewed the controller replacement instructions and personally found them quite accurate. Did you try to follow them before stating that they are outdated?
2013-07-19 05:56 AM
OK, I tried the following.
Please choose one of the following:
(1) Normal Boot.
(2) Boot without /etc/rc.
(3) Change password.
(4) Clean configuration and initialize all disks.
(5) Maintenance mode boot.
(6) Update flash from backup config.
(7) Install new software first.
(8) Reboot node.
In a High Availablity configuration, you MUST ensure that the partner node is (and remains) down, or that takeover is manually disabled on the partner node, because High Availability software is not started or fully enabled in Maintenance mode.
FAILURE TO DO SO CAN RESULT IN YOUR FILESYSTEMS BEING DESTROYED
NOTE: It is okay to use 'show/status' sub-commands such as 'disk show or aggr status' in Maintenance mode while the partner is up.
Jul 18 13:43:19 [localhost:shelf.config.spha:info]: System is using single path HA attached storage only.
Please answer yes or no.
Continue with boot? no
Any input on how to get this working again would be greatly appreciated.
2013-07-20 01:28 AM
"To be on the safer side, it is better to disable the CF & do this"
NO! You should never do that when the system is in takeover mode, and of course never for a controller replacement.
2013-07-20 01:38 AM
It is safe to boot into maintenance mode just to record the system ID. The prompt itself says so: "It is okay to use 'show/status' sub-commands such as 'disk show or aggr status' in Maintenance mode while the partner is up."
If you already know the new system ID, you can skip that step and proceed with disk reassignment.
2013-07-20 05:24 AM
I can look up the system ID in the BMC without entering maintenance mode, but I need to get into maintenance mode to reassign the disks. When I select option 5, I get the prompt "In a High Availablity configuration, you MUST ensure that the partner node is (and remains) down, or that takeover is manually disabled on the partner node, because High Availability software is not started or fully enabled in Maintenance mode. FAILURE TO DO SO CAN RESULT IN YOUR FILESYSTEMS BEING DESTROYED". Nowhere does the document tell me what to do or how to make sure "that takeover is manually disabled on the partner node". I wish NetApp would produce clearer and better documentation and, moreover, have NetApp engineers monitor the forum and help us out. As of right now, none of us agree on the correct way of doing it...