ONTAP Hardware
We have two controllers in a NetApp FAS8020 running ONTAP 7-Mode. HA is enabled and working fine.
All SATA disks are assigned to controller 1, and controller 2 has the SAS disks.
I would like to restart the controllers one at a time and check the redundancy.
Since I am new to NetApp storage, could anyone tell me what needs to be checked on the storage before restarting the controllers?
How will the restart affect the ESXi host servers?
Please help.
Regards
Anu
Hi there!
The first step is to log in to each controller/node. From here, you want to verify that the storage is visible to both nodes, and that the network interfaces will fail over appropriately.
To check the health of the backend storage connections, run the command: "sysconfig -a" to show the system status. You are looking for a line that says:
System Storage Configuration: Multi-Path HA
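If you save the "sysconfig -a" output from each controller to a text file, the check for this line can be scripted. The following is only a minimal sketch in Python: the function name and the sample text are hypothetical and not part of any NetApp tool, and it assumes the line appears exactly as quoted above.

```python
def has_multipath_ha(sysconfig_output: str) -> bool:
    """Return True if saved `sysconfig -a` text reports Multi-Path HA."""
    for line in sysconfig_output.splitlines():
        # Split on the first colon only; the line may be indented or padded.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "System Storage Configuration":
            return value.strip() == "Multi-Path HA"
    return False  # the line was not found at all


# Hypothetical one-line excerpt; real output contains many more lines.
sample = "        System Storage Configuration: Multi-Path HA\n"
print(has_multipath_ha(sample))
```

Run it against the saved output from both controllers; a False result on either one is a reason to stop and look at the cabling before testing takeover.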
The next step is to view the running status and failover status of the network interfaces. There are two commands for this: "rdfile /etc/rc" and "ifconfig -a".
From the output of the /etc/rc file, see which IPs fail over to which ports on the other controller (the partner statement). On the partner system, check that those ports exist, and vice versa. You are welcome to post the output of those files here for us to review, but I encourage you to redact any information that may expose external IPs or company names.
This partner mechanism is not that friendly, and we have updated it significantly in newer versions of ONTAP.
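The partner cross-check can also be done from saved copies of the files. The sketch below is a hedged illustration in Python, not a NetApp utility: the function names, sample addresses, and interface names are all hypothetical. It assumes each relevant /etc/rc line has the form "ifconfig <interface> ... partner <target>" and that the target is an interface name; a partner given as an IP address would need a different comparison.

```python
def partner_map(rc_text: str) -> dict[str, str]:
    """Map each local interface to the name that follows `partner` in /etc/rc."""
    mapping = {}
    for line in rc_text.splitlines():
        tokens = line.split()
        # Only `ifconfig <interface> ... partner <target>` lines are relevant.
        if len(tokens) >= 4 and tokens[0] == "ifconfig" and "partner" in tokens[2:]:
            idx = tokens.index("partner", 2)
            if idx + 1 < len(tokens):
                mapping[tokens[1]] = tokens[idx + 1]
    return mapping


def missing_on_partner(mapping: dict[str, str], partner_interfaces: set[str]) -> list[str]:
    """Return local interfaces whose partner target is absent on the partner node."""
    return sorted(local for local, target in mapping.items()
                  if target not in partner_interfaces)


# Hypothetical /etc/rc excerpt using documentation-range addresses.
rc_sample = """\
hostname netapp01
ifconfig e0a 192.0.2.10 netmask 255.255.255.0 partner e0a
ifconfig e0b 192.0.2.11 netmask 255.255.255.0 partner e0c
"""
# Interface names as read by hand from the partner's `ifconfig -a` output.
partner_ports = {"e0a", "e0b", "e0M"}
print(missing_on_partner(partner_map(rc_sample), partner_ports))
```

Any interface it lists has a partner statement pointing at a port the other controller does not have, so its IP would have nowhere to go during takeover.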
Then you want to check that the hosts using the storage will tolerate the up-to-30-second blip during failover (usually much shorter, but plan for the worst case). If running iSCSI or FC, ensure ALUA is set up and that hosts can see targets on both controllers.
Assuming all is configured correctly, it should all go smoothly.
Hope this helps!
Good info from Alex.
It's also a great idea to check system configuration/health with Active IQ Config Advisor (a downloadable tool from the NetApp Support site).
See: https://mysupport.netapp.com/tools/info/ECMS1357843I.html?productID=61923&pcfContentID=ECMS1357843
Hi Alex,
Thanks for your information. Appreciated.
Please see my comments.
1. I have checked the hosts, and the LUNs are visible from both adapters.
2. I ran the command sysconfig -a on both controllers, and Multi-Path HA is configured.
3. I am attaching the rc file and the ifconfig output. Please find the attachment.
I recently added the partner IP information to the netapp02 rc file. The partner IP address was not showing earlier.
Regards
Anu
Hi there! Can you also please post the output of "rdfile /etc/hosts" from both controllers?
Hi there! A couple of things spring out
Can you also run "sp status" on both systems to ensure you have out of band access?
Hi Alex,
Thanks for your help.
I have added the e0M entry to the hosts file and the error is gone.
netapp02> wrfile /etc/hosts
127.0.0.1 localhost localhost-stack
127.0.10.1 localhost-10 localhost-bsd
127.0.20.1 localhost-20 localhost-sk
10.240.10.62 netapp02 netapp02-e0M
10.240.30.221 netapp02-e0e
130.100.100.3 netapp02-e0f
10.241.10.62 drnetapp02
10.241.10.61 drnetapp01
Can you also run "sp status" on both systems to ensure you have out of band access?
[Anu] please see the result
netapp01> sp status
The "sp" command is deprecated. Please use "system node service-processor" instead.
Service Processor Status: Online
Firmware Version: 3.0.2
Mgmt MAC Address: 11:11:11:11:11:11
Ethernet Link: up, 100Mb, full duplex, auto-neg complete
Using DHCP: no
IPv4 configuration:
IP Address: 10.240.10.68
Netmask: 255.255.255.0
Gateway: 10.240.10.1
IPv6 configuration: Disabled
netapp01>
netapp02> sp status
The "sp" command is deprecated. Please use "system node service-processor" instead.
Service Processor Status: Online
Firmware Version: 3.0.2
Mgmt MAC Address: 11:11:11:11:11:11
Ethernet Link: up, 100Mb, full duplex, auto-neg complete
Using DHCP: no
IPv4 configuration:
IP Address: 10.240.10.69
Netmask: 255.255.255.0
Gateway: 10.240.10.1
IPv6 configuration: Disabled
netapp02>
Regards
Anu
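For anyone repeating this check on several systems, the "sp status" output above is a simple list of "Key: value" lines, so it can be parsed from a saved copy. This is a rough Python sketch under that assumption; the function names are hypothetical, and the sample is trimmed from the netapp01 output in this thread.

```python
def parse_sp_status(text: str) -> dict[str, str]:
    """Collect `Key: value` lines from saved `sp status` output."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon so values such as MAC addresses stay whole.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def sp_looks_ready(fields: dict[str, str]) -> bool:
    """True if the SP reports Online and its Ethernet link is up."""
    return (fields.get("Service Processor Status") == "Online"
            and fields.get("Ethernet Link", "").startswith("up"))


sample = """\
Service Processor Status: Online
Mgmt MAC Address: 11:11:11:11:11:11
Ethernet Link: up, 100Mb, full duplex, auto-neg complete
IPv4 configuration:
IP Address: 10.240.10.68
"""
fields = parse_sp_status(sample)
print(fields["IP Address"], sp_looks_ready(fields))
```

The IP address it extracts is the one to SSH to for out-of-band console access during the takeover test.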
OK! So @andris's suggestion of running Config Advisor is also a good one to be extra sure - download it from https://mysupport.netapp.com/tools/info/ECMS1357843I.html
For a manual review, I suggest logging in, running sysconfig, and confirming it says "System Storage Configuration: Multi-Path HA".
Once you're satisfied that you're ready to proceed:
1. SSH to both systems via their SP management IP (shown by "sp status") - this gives you a virtual serial interface.
2. On the node you want to survive the first takeover, run "cf takeover". The other node will then reboot and eventually display "Waiting for giveback". At this point you are running on one node only, but with the disks and config from both systems.
3. When you're comfortable that it works, run "cf giveback" on the surviving node. A giveback successful message will eventually display.
4. After that, test takeover in the other direction by repeating the same steps with the roles of the two nodes swapped.
If you have a support contract, you can call our support center in case of problems, but hopefully it all goes well - from what we've gone through here it looks like you're in a good position to start, pending Config Advisor review.
Thanks Alex for your support.
I have restarted the controllers and confirmed everything is working fine.
Once again, thanks for your help.
Regards
Anu
Thanks for the follow up, glad to hear it all went ok!