ONTAP Discussions
I need assistance from your experts. We have a FAS2040 set up in an active/active configuration, and each controller has its own disks and storage assigned. This is strictly a CIFS/NFS environment with no block or FC. Last week controller 2 died on us, and we are currently in takeover mode on controller 1. I purchased a refurbished controller this week to replace controller 2. I took controller 2 out of the chassis, removed the CF boot card, NVRAM battery, and SFP module, and put them in the new controller that I purchased. I then put the new controller back in the storage system. I interrupted the boot with Ctrl+C and got into the BMC shell via the console. I ran BMC config and noticed that the BMC does not have the settings of my old controller, so I couldn't reach the BMC shell over SSH. So I went ahead and updated the BMC IP, gateway, etc., but was unable to update the controller name. I can now connect to the BMC interface, but the password I have on my system doesn't work with the new controller via the BMC shell.
I thought it was supposed to boot from the CF card and load all the configuration from my dead controller onto the new controller, and that all I would have to do is assign the disks to the new system ID, but apparently that is not the case. So right now the new controller is in the chassis and powered up, but doesn't have the correct configuration. I have the controller sitting at the boot loader.
What is the correct step-by-step way to get the new controller up and running with the configuration from the dead controller, without wiping out the existing data and configuration on my NetApp, and to have it own the disks that belonged to the dead controller?
So here is a quick summary of my system state. Controller 1 is currently in takeover mode. Controller 2 is in the system with the CF card from my old controller, but it has not booted up and does not have the correct configuration. Controller 2 is at the LOADER-B prompt.
Please help, as NetApp's instructions for replacing a FAS20xx controller module seem outdated or incorrect.
Thanks!
J
Our ONTAP version is 8.1 7-Mode, by the way...
Which boot options have you tried?
Saran
The BMC is synchronized by Data ONTAP when it boots, so you need to complete the controller replacement and perform a giveback to allow Data ONTAP to boot on the replacement controller. I reviewed the controller replacement instructions and personally found them pretty much accurate. Did you try to follow them before stating that they are outdated?
OK, here is what I tried:
Please choose one of the following:
(1) Normal Boot.
(2) Boot without /etc/rc.
(3) Change password.
(4) Clean configuration and initialize all disks.
(5) Maintenance mode boot.
(6) Update flash from backup config.
(7) Install new software first.
(8) Reboot node.
Selection (1-8)?
In a High Availablity configuration, you MUST ensure that the partner node is (and remains) down, or that takeover is manually disabled on the partner node, because High Availability software is not started or fully enabled in Maintenance mode.
FAILURE TO DO SO CAN RESULT IN YOUR FILESYSTEMS BEING DESTROYED
NOTE: It is okay to use 'show/status' sub-commands such as 'disk show or aggr status' in Maintenance mode while the partner is up.
Jul 18 13:43:19 [localhost:shelf.config.spha:info]: System is using single path HA attached storage only.
Please answer yes or no.
Continue with boot? no
Any input on how to get this working again would be greatly appreciated...
You can answer "yes" at the "Continue with boot?" prompt.
Saran
To be on the safe side, it is better to disable CF first and then do this.
To be on the safe side, it is better to disable CF first and then do this.
NO! You should never do that when the system is in takeover mode, and of course never for a controller replacement.
It is safe to boot into maintenance mode just to record the system ID. The prompt says so itself: "It is okay to use 'show/status' sub-commands such as 'disk show or aggr status' in Maintenance mode while the partner is up."
If you already know the new system ID, you can simply skip this step and proceed with disk reassignment.
I can look up the system ID in the BMC without entering maintenance mode, but need to get into maintenance mode to reassign the disk. When I select option 5, I get the prompt "In a High Availablity configuration, you MUST ensure that the partner node is (and remains) down, or that takeover is manually disabled on the partner node, because High Availability software is not started or fully enabled in Maintenance mode. FAILURE TO DO SO CAN RESULT IN YOUR FILESYSTEMS BEING DESTROYED". Nowhere does the document tell me what to do or how to make sure "that takeover is manually disabled on the partner node". I wish that NetApp can produce a clear and better document, and moreover that they would have NetApp engineers monitor the forum and help us out. As it stands, none of us agrees on the correct way of doing it...
need to get into maintenance mode to reassign the disk
You do NOT need to go into maintenance mode for that. What gave you that idea? The documentation quite clearly states that it is done from the partner controller.
I wish that NetApp can produce a clear and better document
There is a feedback button. But again I have the feeling that you did not even read the documentation.
Below is text straight from the NetApp document. Read step 1 on page 11 of "Replacing the controller module in a FAS20xx system". Did I misread that, or did you?
Reassigning disks on a system operating in 7-Mode
You must reassign disks before you boot the software. Some of the steps are different depending on whether the system is stand-alone or in an HA pair.
About this task
• You must apply the commands in these steps on the correct systems:
  • The target node is the node on which you are performing maintenance.
  • The partner node is the HA partner of the target node.
• Do not issue any commands relating to aggregates until the entire procedure is completed.
Steps
1. If you have not already done so, reboot the target node, interrupt the boot process by entering Ctrl-C, and then select the option to boot to Maintenance mode from the displayed menu.
You must enter y when prompted to override the system ID due to a system ID mismatch.
2. View the new system IDs by entering the following command
And where, pray, do you see that you need to perform the reassignment in maintenance mode?
It explains how to look up the system ID.
Didn't I cut and paste that section from the document, put it in bold, and also tell you where to find it? I have a feeling that you are not reading at all and are just jumping to conclusions about what you think it should be. Thanks for all the comments. I was hoping someone could help with a how-to and be more constructive than this. I'm not sure which document you're reading, but what you stated is certainly nowhere to be found in the document below.
Here is the link to the document: https://library.netapp.com/ecm/ecm_download_file/ECMM1280334
Thank you and have a nice day, aborzenkov.
Anyone besides ABORZENKOV who has experience replacing a FAS2040 controller, do feel free to help me out. So far this is going nowhere...
I have performed controller replacement more than once, which is why I have reason to state that the replacement procedure is correct. If you have reason to believe the procedure is incorrect, you need to open a support case and discuss it with a NetApp engineer.
For the last time: you do not perform disk reassignment from maintenance mode. You halt the controller after confirming the system ID and reassign the disks from the partner node.
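For readers following along, the sequence described above can be sketched roughly as follows. This is only an outline under assumptions: the command names are recalled from the Data ONTAP 7-Mode procedure and are not quoted anywhere in this thread, the prompts and the host name ctrl1 are illustrative, and both system IDs are placeholders. Check every step against the linked NetApp document before running anything on a system that is in takeover.

```
# On the replacement (target) node, from the LOADER prompt:
# boot, press Ctrl-C for the boot menu, and select (5) Maintenance mode boot.
LOADER-B> boot_ontap
*> disk show -v        # assumed command: note the new system ID
*> halt                # return to the LOADER prompt; reassign nothing here

# On the partner node, which is still in takeover:
ctrl1(takeover)> priv set advanced
ctrl1(takeover)*> disk reassign -s <old_system_id> -d <new_system_id>
ctrl1(takeover)*> priv set admin

# Boot the target node, wait until it reports that it is waiting for
# giveback, then on the partner node:
ctrl1(takeover)> cf giveback
```

The point of the ordering is that maintenance mode on the target is used only for read-only status commands, while the ownership change is issued from the node that currently serves the disks.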