HA failover configuration

Hi All,

When installed, our NetApp was set up with e0a and e0b in a virtual interface called Internal_VIF for CIFS traffic, mirrored for HA failover on the other controller. e0c and e0d were set up similarly in a virtual interface called Vmware_VIF for VMware traffic over NFS.

In order to use iSCSI as well, VLANs were created on e0a and e0b so that CIFS and iSCSI traffic could coexist and route.

The following configuration works.

#Controller A

ifgrp create multi Int1_VIF e0a e0b
vlan create Int1_VIF 20
ifconfig Int1_VIF 192.168.1.196 netmask 255.255.255.0 partner Int2_VIF mtusize 1500 trusted wins up nfo
ifconfig Int1_VIF-20 192.168.20.10 netmask 255.255.255.0 partner Int2_VIF-20 mtusize 1500 trusted wins up nfo

#Controller B

ifgrp create multi Int2_VIF e0a e0b
vlan create Int2_VIF 20
ifconfig Int2_VIF 192.168.1.197 netmask 255.255.255.0 partner Int1_VIF mtusize 1500 trusted wins up nfo
ifconfig Int2_VIF-20 192.168.20.11 netmask 255.255.255.0 partner Int1_VIF-20 mtusize 1500 trusted wins up nfo

Int1_VIF and Int2_VIF carry the CIFS traffic and are connected to a Cisco switch.

Int1_VIF-20 and Int2_VIF-20 are on a VLAN on the Cisco switch that connects to the VMware hosts for iSCSI traffic.

e0c and e0d remain for VMware NFS traffic using storage on controller A.

The above is required for accessing storage on controller B.

Both the VMware NFS and the Int1_VIF-20/Int2_VIF-20 iSCSI interfaces on both controllers display correctly in OnCommand Manager, with HA failover mode shown as Shared and the correct partner interface (see attached).

However, Int1_VIF and Int2_VIF on each controller are greyed out in properties, and HA failover shows 'Dedicated' rather than 'Shared'. The partner interface displays the correct interface, but without the IP in brackets.

Is the configuration above correct? Is OnCommand Manager just displaying it incorrectly, or does the configuration need to change? How can I confirm HA failover is working correctly?

Cheers,

Neil

Re: HA failover configuration

Until Data ONTAP 8.x, configuring both base and VLAN interfaces at the same time was not supported. Maybe OCSM still does not expect it. Which Data ONTAP version do you have? Make sure to install the latest OCSM, 2.0.1; if you can still reproduce it, post on the System Manager community.

Re: HA failover configuration

I am running ONTAP 8.0.2 7-Mode. I had OCSM 2.0.0.1017 installed. I just upgraded to the latest, 2.0.1.1401, and the problem remains. However, the interface is a lot faster :) I will post on the System Manager community.

Thanks.

Re: HA failover configuration

Yes, I noticed too that 2.0.1 became much more responsive. Good to see it is improving ☺

Re: HA failover configuration

OK, I've managed to resolve this via the command line, and failover now occurs correctly. I also switched to LACP, but the config works just fine for multi as well; just replace the word lacp with multi.

 

This is the running config for filer A:

 

ifgrp create lacp Internal_VIF e0a e0b

vlan create Internal_VIF 20

ifgrp create lacp Vmware_VIF

ifconfig Internal_VIF 192.168.1.197 netmask 255.255.255.0 partner Internal_VIF mtusize 1500 trusted wins up nfo

ifconfig Internal_VIF-20 192.168.20.11 netmask 255.255.255.0 partner Internal_VIF-20 mtusize 1500 trusted wins up nfo

ifconfig Vmware_VIF 192.168.10.11 netmask 255.255.255.0 partner Vmware_VIF mtusize 1500 trusted wins up nfo

 

This is the running config for filer B:

 

ifgrp create lacp Internal_VIF e0a e0b

vlan create Internal_VIF 20

ifgrp create lacp Vmware_VIF

ifconfig Internal_VIF 192.168.1.196 netmask 255.255.255.0 partner Internal_VIF mtusize 1500 trusted wins up nfo

ifconfig Internal_VIF-20 192.168.20.10 netmask 255.255.255.0 partner Internal_VIF-20 mtusize 1500 trusted wins up nfo

ifconfig Vmware_VIF 192.168.10.10 netmask 255.255.255.0 partner Vmware_VIF mtusize 1500 trusted wins up nfo
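
As a rough cross-check of ifconfig lines like the ones above, the following Python sketch flags any line that lacks the partner keyword or that names a partner interface the other filer does not configure. It is only an illustration: the function names and the trimmed sample lines are made up for this example, and it assumes the partner is given as an interface name rather than an IP address.

```python
def ifconfig_partners(config_text):
    """Map each interface set up with ifconfig to its partner name (or None)."""
    partners = {}
    for line in config_text.splitlines():
        tokens = line.split()
        if len(tokens) < 2 or tokens[0] != "ifconfig":
            continue  # skip ifgrp, vlan, blank lines, and anything else
        iface = tokens[1]
        partner = None
        if "partner" in tokens:
            idx = tokens.index("partner")
            if idx + 1 < len(tokens):
                partner = tokens[idx + 1]
        partners[iface] = partner
    return partners


def check_partner_pairs(local_text, remote_text):
    """List problems with the partner settings in local_text."""
    local = ifconfig_partners(local_text)
    remote = ifconfig_partners(remote_text)
    problems = []
    for iface, partner in local.items():
        if partner is None:
            problems.append(f"{iface}: no partner keyword")
        elif partner not in remote:  # exact, case-sensitive comparison
            problems.append(f"{iface}: partner {partner} not configured on the other filer")
    return problems


# Hypothetical, shortened sample lines; the second line on filer A
# deliberately omits the partner keyword.
FILER_A = """\
ifconfig Internal_VIF 192.168.1.196 netmask 255.255.255.0 partner Internal_VIF
ifconfig Internal_VIF-20 192.168.20.10 netmask 255.255.255.0 Internal_VIF-20
"""
FILER_B = """\
ifconfig Internal_VIF 192.168.1.197 netmask 255.255.255.0 partner Internal_VIF
ifconfig Internal_VIF-20 192.168.20.11 netmask 255.255.255.0 partner Internal_VIF-20
"""

for problem in check_partner_pairs(FILER_A, FILER_B):
    print(problem)
```

With this sample data the only line reported is the Internal_VIF-20 one on filer A, which is missing the partner keyword. Running the check in both directions (A against B, then B against A) covers both filers.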

 

For successful takeover, the hosts and rc files on each filer need to be edited as follows.

 

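Use wrfile /etc/rc and wrfile /etc/hosts to write the files.

Just copy and paste the contents into the session and press CTRL+C to finish editing. Note that wrfile replaces the whole file, so paste the complete contents.

Use rdfile /etc/rc and rdfile /etc/hosts to read them afterwards to confirm.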

 

Hosts file on filer A:

 

127.0.0.1 localhost localhost-stack

127.0.10.1 localhost-10 localhost-bsd

127.0.20.1 localhost-20 localhost-sk

192.168.1.196 MANF01a MANF01a-Internal_VIF

192.168.20.10 MANF01a-Internal_VIF-20

192.168.10.10 MANF01a-Vmware_VIF

192.168.1.193 mailhost

 

rc file on filer A:

 

ifgrp create lacp Vmware_VIF -b ip e0d e0c

ifgrp create lacp Internal_VIF -b ip e0b e0a

vlan create Internal_VIF 20

ifconfig Internal_VIF `hostname`-Internal_VIF netmask 255.255.255.0 partner Internal_VIF mtusize 1500 trusted wins up nfo

ifconfig Internal_VIF-20 `hostname`-Internal_VIF-20 netmask 255.255.255.0 partner Internal_VIF-20 mtusize 1500 trusted wins up nfo

ifconfig Vmware_VIF `hostname`-Vmware_VIF netmask 255.255.255.0 partner Vmware_VIF mtusize 1500 trusted wins up nfo

route add default 192.168.1.150 1

routed on

options dns.enable on

options dns.domainname YOURDOMAIN.COM

options nis.enable off

savecore

 

Hosts file on filer B:

 

127.0.0.1 localhost localhost-stack

127.0.10.1 localhost-10 localhost-bsd

127.0.20.1 localhost-20 localhost-sk

192.168.1.197 MANF01b MANF01b-Internal_VIF

192.168.20.11 MANF01b-Internal_VIF-20

192.168.10.11 MANF01b-Vmware_VIF

192.168.1.193 mailhost

 

rc file on filer B:

 

hostname MANF01b

ifgrp create lacp Vmware_VIF -b ip e0c e0d

ifgrp create lacp Internal_VIF -b ip e0b e0a

vlan create Internal_VIF 20

ifconfig Internal_VIF `hostname`-Internal_VIF netmask 255.255.255.0 partner Internal_VIF mtusize 1500 trusted wins up nfo

ifconfig Internal_VIF-20 `hostname`-Internal_VIF-20 netmask 255.255.255.0 partner Internal_VIF-20 mtusize 1500 trusted wins up nfo

ifconfig Vmware_VIF `hostname`-Vmware_VIF netmask 255.255.255.0 partner Vmware_VIF mtusize 1500 trusted wins up nfo

route add default 192.168.1.150 1

routed on

options dns.enable on

options dns.domainname YOURDOMAIN.COM

options nis.enable off

savecore

 

The following options also need to be set for failover to work correctly:

 

options cf.takeover.on_network_interface_failure on

options cf.takeover.on_network_interface_failure.policy any_nic

 

Also make sure the clocks are near identical on each filer (use the date command to verify); otherwise takeover can mess up and copy the config of the other partner instead, causing a duplication of interfaces and IPs.

 

options timed.enable on

options timed.proto ntp

options timed.servers pool.ntp.org  <- or other time server, but has to be identical on both filers.

 

Check that the NTP port (UDP 123) is actually open for both filers on your firewall.
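
To put a number on "near identical", the output of the date command from each filer can be compared. The Python sketch below is a rough illustration only: it assumes the output looks like a Unix-style date line (for example "Thu Mar 15 10:12:33 GMT 2012"), that both filers report the same time zone, and that both commands were run at nearly the same moment. The function names and sample strings are made up for this example.

```python
from datetime import datetime


def parse_filer_date(text):
    """Parse a 'Thu Mar 15 10:12:33 GMT 2012' style line, ignoring the zone."""
    parts = text.split()
    if len(parts) != 6:
        raise ValueError(f"unexpected date format: {text!r}")
    # Fields: weekday, month, day, time, zone, year.
    _weekday, month, day, clock, _zone, year = parts
    return datetime.strptime(f"{month} {day} {clock} {year}", "%b %d %H:%M:%S %Y")


def clock_skew_seconds(date_a, date_b):
    """Absolute difference in seconds between two date outputs."""
    delta = parse_filer_date(date_a) - parse_filer_date(date_b)
    return abs(delta.total_seconds())


# Hypothetical output pasted from each filer.
filer_a = "Thu Mar 15 10:12:33 GMT 2012"
filer_b = "Thu Mar 15 10:14:03 GMT 2012"
print(clock_skew_seconds(filer_a, filer_b), "seconds apart")
```

The delay between running date on the two filers adds to the measured difference, so treat the result as an upper bound rather than an exact skew.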

 

Lastly, use the excellent HA config checker tool. If you have even the smallest typo in your rc and hosts files, failover fails. Even uppercase and lowercase mismatches will cause it to fail or cause strange results.

 

http://support.netapp.com/NOW/download/tools/cf_config_check/

 

I found the Windows exe a waste of time and couldn't get it to run. However, the CGI file ran successfully with the following command:

 

perl ha-config.check.cgi -s root@192.168.1.196 root@192.168.1.197

To run this I had to use a Linux box (Ubuntu).
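
As a lightweight complement to the checker (not a replacement for it), the typo and case-mismatch point can be illustrated with a short Python sketch. It expands `hostname` in each ifconfig line of an rc file and looks the resulting name up in the hosts file with an exact, case-sensitive match. The function names and the sample data are made up for this example, and the hosts sample contains a deliberate lowercase entry and a missing entry.

```python
import re


def parse_hosts(hosts_text):
    """Return every name and alias found in an /etc/hosts style file."""
    names = set()
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments
        if len(fields) >= 2:
            names.update(fields[1:])  # fields[0] is the IP address
    return names


def unresolved_rc_names(rc_text, hosts_text, hostname):
    """List ifconfig addresses in rc_text that hosts_text cannot resolve."""
    names = parse_hosts(hosts_text)
    lowered = {name.lower(): name for name in names}
    problems = []
    for line in rc_text.splitlines():
        tokens = line.split()
        if len(tokens) < 3 or tokens[0] != "ifconfig":
            continue
        address = tokens[2].replace("`hostname`", hostname)
        if re.fullmatch(r"\d+\.\d+\.\d+\.\d+", address):
            continue  # literal IP address, nothing to resolve
        if address in names:
            continue  # exact match
        if address.lower() in lowered:
            problems.append(f"{address}: case mismatch with hosts entry {lowered[address.lower()]}")
        else:
            problems.append(f"{address}: no hosts entry")
    return problems


# Hypothetical sample data with two deliberate mistakes in the hosts file.
HOSTS = """\
127.0.0.1 localhost localhost-stack
192.168.1.196 MANF01a MANF01a-Internal_VIF
192.168.20.10 MANF01a-internal_vif-20
"""
RC = """\
ifconfig Internal_VIF `hostname`-Internal_VIF netmask 255.255.255.0 partner Internal_VIF
ifconfig Internal_VIF-20 `hostname`-Internal_VIF-20 netmask 255.255.255.0 partner Internal_VIF-20
ifconfig Vmware_VIF `hostname`-Vmware_VIF netmask 255.255.255.0 partner Vmware_VIF
"""

for problem in unresolved_rc_names(RC, HOSTS, "MANF01a"):
    print(problem)
```

With this sample data the sketch reports a case mismatch for MANF01a-Internal_VIF-20 and a missing hosts entry for MANF01a-Vmware_VIF. It only covers the name lookup, so the real checker is still worth running.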

 

Please note the above can't be done in OnCommand Manager at the time of writing, and once it is configured this way, Internal_VIF displays incorrectly in the manager as dedicated rather than shared and is greyed out. This is a bug in the manager which I am currently following up with NetApp.

 

 

 

Cheers,

 

Neil

Re: HA failover configuration

"takeover can mess up and copy the config of the other partner instead"

Takeover does not copy any configuration file. Could you explain what you mean?

Any reason you insist on having both a base vif and a VLAN? As I understand it, you are setting up a new configuration, so you could just as well use two VLANs. This is something that has been working for years.

Re: HA failover configuration

Because I need VLAN 1 untagged, and as soon as VLAN 1 is created on the interface, it tags the traffic. Cisco switches default to VLAN 1 untagged.

With the time being out of sync, whether it copied it or not, the result was that both filers had the same config that was on filer B. Very weird.