Solved: Re: Restart the Controllers 7 mode HA

Anu61 · ‎2019-07-31

We have two controllers in a NetApp FAS8020 running ONTAP 7-Mode. HA is enabled and working fine.

All SATA disks are assigned to controller 1, and controller 2 has the SAS disks.

I would like to restart controllers one by one and check the redundancy.

Since I am new to NetApp storage, could anyone tell me what needs to be checked on the storage before restarting the controllers?

How will the restart affect the ESXi host servers?

Please help.

Regards

Anu

AlexDawson · ‎2019-07-31

Hi there!

The first step is to log in to each controller/node. From there, verify that the storage is visible to both nodes and that the network interfaces will fail over appropriately.

To check the health of the backend storage connections, run the command: "sysconfig -a" to show the system status. You are looking for a line that says:

System Storage Configuration: Multi-Path HA
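If the "sysconfig -a" output has been saved to a text file, this check can be scripted. The following is a minimal Python sketch, not a NetApp tool: the function names and the sample text are illustrative assumptions, and only the "System Storage Configuration" line itself comes from this thread.

```python
import re


def storage_config(sysconfig_output):
    """Return the value of the 'System Storage Configuration' line, or None."""
    for line in sysconfig_output.splitlines():
        match = re.match(r"\s*System Storage Configuration:\s*(.+?)\s*$", line)
        if match:
            return match.group(1)
    return None


def is_multipath_ha(sysconfig_output):
    """True only if the output reports exactly 'Multi-Path HA'."""
    return storage_config(sysconfig_output) == "Multi-Path HA"


# Hypothetical capture: everything except the quoted line is a placeholder.
sample = "(other sysconfig lines)\n\tSystem Storage Configuration: Multi-Path HA\n"
print(is_multipath_ha(sample))
```

Run it against the saved output of each controller; anything other than True is a reason to stop and investigate before testing takeover.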

The next step is to check the network interfaces' running status and failover configuration. Two commands for this: "rdfile /etc/rc" (the configuration applied at boot) and "ifconfig -a" (the running configuration).

From the contents of the /etc/rc file, see which IPs fail over to which ports on the other controller (the partner statement). On the partner system, check that those ports exist, and vice versa. You are welcome to post the output of those commands here for us to review, but I encourage you to redact any information that may expose external IPs or company names.

This mechanism is not that friendly, and we have updated it significantly in newer versions of ONTAP.
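To make the partner check less error-prone, the /etc/rc contents can be scanned for ifconfig lines that lack a partner statement. This is a rough Python sketch under stated assumptions: it only looks at plain "ifconfig <interface> ..." lines, the function name is made up for illustration, and the e0f line in the sample is hypothetical. An interface may legitimately have no partner (an unused port, for example), so treat the result as a list of things to review rather than a list of errors.

```python
def interfaces_without_partner(rc_text):
    """List interfaces whose ifconfig line in /etc/rc has no 'partner' keyword."""
    missing = []
    for raw in rc_text.splitlines():
        # Drop trailing comments, then split into whitespace-separated tokens.
        tokens = raw.split("#", 1)[0].split()
        if len(tokens) < 2 or tokens[0] != "ifconfig":
            continue
        if "partner" not in tokens[2:]:
            missing.append(tokens[1])
    return missing


# Hypothetical rc excerpt; the e0f line is invented for illustration.
rc_sample = """\
hostname netapp02
ifconfig e0e `hostname`-e0e mediatype auto flowcontrol full netmask 255.255.255.0 partner e0e mtusize 1500 trusted wins up
ifconfig e0f `hostname`-e0f netmask 255.255.255.0 mtusize 1500
"""
print(interfaces_without_partner(rc_sample))  # ['e0f']
```

The same function can be pointed at both controllers' rc files, so that each partner target named on one side can be compared by hand with the ports that exist on the other.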

Then check that the hosts using the storage will tolerate the up-to-30-second blip during failover (usually much shorter, but plan for the worst case). If running iSCSI or FC, ensure ALUA is set up and that hosts can see targets on both controllers.

Assuming all is configured correctly, it should all go smoothly.

Hope this helps!

andris · ‎2019-07-31

Good info from Alex.

It's also a great idea to check system configuration/health with Active IQ Config Advisor (a downloadable tool from the NetApp Support site).

See: https://mysupport.netapp.com/tools/info/ECMS1357843I.html?productID=61923&pcfContentID=ECMS1357843

Anu61 · ‎2019-08-01

Hi Alex,

Thanks for the information. Appreciated.

Please see my comments.

1. I have checked the hosts, and the LUNs are visible from both adapters.

2. I ran sysconfig -a on both controllers, and Multi-Path HA is configured.

3. I am attaching the rc file and the ifconfig output. Please see the attachment.

I recently added the partner IP information to the netapp02 rc file. The partner IP address was not showing earlier.

Regards

Anu

AlexDawson · ‎2019-08-04

Hi there! Can you also please post the output of "rdfile /etc/hosts" from both controllers?

Anu61 · ‎2019-08-06

Hi Alex,

I am attaching the hosts file details from both the controllers.

Regards

Anu

AlexDawson · ‎2019-08-07

Hi there! A couple of things stand out:

1. e0f on each controller does not have a partner statement in either the running configuration or /etc/rc, meaning it will not fail over to the other controller. Any systems accessing the storage through this IP will lose access during takeover. However, e0f is down on both controllers anyway, so this is probably not a big deal.
2. netapp02's /etc/hosts does not define netapp02-e0M, meaning the management IP will not come up at boot. Run "wrfile /etc/hosts" and then paste in the full corrected contents of the file, including that address (wrfile replaces the whole file).
3. netapp02's e0e does not have a partner assigned in the running configuration (ifconfig -a), meaning it will not automatically pick up netapp01's e0e IP address during takeover. This can be fixed by running "ifconfig e0e `hostname`-e0e mediatype auto flowcontrol full netmask 255.255.255.0 partner e0e mtusize 1500 trusted wins up"
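The /etc/hosts problem can also be caught mechanically: every `hostname`-<interface> name used in /etc/rc needs a matching entry in /etc/hosts. Here is a small Python sketch of that cross-check. The function names are illustrative, and the sample addresses are placeholders from the 192.0.2.0/24 documentation range, not the real values from the attachments.

```python
def defined_names(hosts_text):
    """Collect every hostname and alias defined in an /etc/hosts-style file."""
    names = set()
    for raw in hosts_text.splitlines():
        fields = raw.split("#", 1)[0].split()
        names.update(fields[1:])  # the first field is the IP address
    return names


def missing_interface_names(hosts_text, hostname, interfaces):
    """Return the '<hostname>-<interface>' names that /etc/hosts does not define."""
    names = defined_names(hosts_text)
    wanted = [f"{hostname}-{iface}" for iface in interfaces]
    return [name for name in wanted if name not in names]


# Hypothetical hosts file with placeholder addresses and no e0M entry.
hosts_sample = """\
127.0.0.1 localhost
192.0.2.62 netapp02
192.0.2.221 netapp02-e0e
"""
print(missing_interface_names(hosts_sample, "netapp02", ["e0M", "e0e"]))
# ['netapp02-e0M']
```

The interface list is passed in by hand here; it would normally be taken from the ifconfig lines in the same controller's /etc/rc.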

Can you also run "sp status" on both systems to ensure you have out of band access?

Anu61 · ‎2019-08-08

Hi Alex,

Thanks for your help.

I have added the e0M entry to the hosts file, and the error is gone.

netapp02> wrfile /etc/hosts
127.0.0.1 localhost localhost-stack
127.0.10.1 localhost-10 localhost-bsd
127.0.20.1 localhost-20 localhost-sk
10.240.10.62 netapp02 netapp02-e0M
10.240.30.221 netapp02-e0e
130.100.100.3 netapp02-e0f
10.241.10.62 drnetapp02
10.241.10.61 drnetapp01

Can you also run "sp status" on both systems to ensure you have out of band access?

[Anu] Please see the results:

netapp01> sp status
The "sp" command is deprecated. Please use "system node service-processor" instead.
Service Processor Status: Online
Firmware Version: 3.0.2
Mgmt MAC Address: 11:11:11:11:11:11
Ethernet Link: up, 100Mb, full duplex, auto-neg complete
Using DHCP: no
IPv4 configuration:
IP Address: 10.240.10.68
Netmask: 255.255.255.0
Gateway: 10.240.10.1
IPv6 configuration: Disabled
netapp01>

netapp02> sp status
The "sp" command is deprecated. Please use "system node service-processor" instead.
Service Processor Status: Online
Firmware Version: 3.0.2
Mgmt MAC Address: 11:11:11:11:11:11
Ethernet Link: up, 100Mb, full duplex, auto-neg complete
Using DHCP: no
IPv4 configuration:
IP Address: 10.240.10.69
Netmask: 255.255.255.0
Gateway: 10.240.10.1
IPv6 configuration: Disabled
netapp02>

Regards

Anu

AlexDawson · ‎2019-08-08

OK! @andris's suggestion of running Config Advisor is also a good one, to be extra sure. Download it from https://mysupport.netapp.com/tools/info/ECMS1357843I.html

As a manual check, I suggest logging in, running sysconfig, and confirming it says "System Storage Configuration: Multi-Path HA".

Once you're satisfied that you're ready to proceed, SSH to both systems via their SP management IPs (shown in the "sp status" output); this gives you a virtual serial console. On the node you want to survive the first takeover, run "cf takeover". The other node will then reboot and eventually display "Waiting for giveback". At this point you are running on one node only, but with the disks and configuration of both systems. When you're comfortable that it works, run "cf giveback" on the surviving node. A giveback-successful message will eventually display. You can then test takeover of the first node by repeating the same steps from the other node.
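Because the order matters and the test is run twice with the roles swapped, it can help to write the steps out as a checklist before starting. The Python sketch below only prints such a checklist; it does not connect to or run anything on the controllers. The function name and the wording of the verification step are assumptions added for illustration.

```python
def takeover_test_plan(nodes):
    """Return the ordered manual steps for testing takeover of each HA node."""
    if len(nodes) != 2 or nodes[0] == nodes[1]:
        raise ValueError("expected the two distinct nodes of one HA pair")
    steps = []
    # First pass: nodes[0] survives. Second pass: the roles are swapped.
    for survivor, target in (nodes, nodes[::-1]):
        steps += [
            f'{survivor}: run "cf takeover"',
            f'{target}: wait for "Waiting for giveback" on the console',
            f"{survivor}: confirm hosts still reach the storage of both nodes",
            f'{survivor}: run "cf giveback" and wait for the success message',
        ]
    return steps


for number, step in enumerate(takeover_test_plan(["netapp01", "netapp02"]), 1):
    print(f"{number}. {step}")
```

Completing the full giveback before starting the second pass keeps both nodes from being down for maintenance at the same time.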

If you have a support contract, you can call our support center in case of problems, but hopefully it all goes well - from what we've gone through here it looks like you're in a good position to start, pending Config Advisor review.

Anu61 · ‎2019-08-24

Thanks Alex for your support.

I have restarted the Controllers and confirmed everything is working fine.

Once again thanks for your help.

Regards

Anu

AlexDawson · ‎2019-08-25

Thanks for the follow-up, glad to hear it all went OK!