ONTAP Discussions

Failover network interface issue

EricGK

New AFF A250 running ONTAP 9.8P1 with dual controllers.

Port channel (e2a and e2c): VLANs for SVM management, CIFS, and NFS. Failover enabled.

Access port e2b: VLAN iSCSI-A. No failover; uses MPIO for HA.

Access port e2d: VLAN iSCSI-B. No failover; uses MPIO for HA.

 

In this configuration, the iSCSI LIFs for node na2-01 are unreachable. If I migrate any LIF (CIFS/NFS/SVM) over to na2-01, the iSCSI LIFs are reachable again. Why would the location of the port-channel LIFs have an effect on the reachability of the non-port-channel LIFs?

 

na2::> network interface show
            Logical           Status     Network            Current Current  Is
Vserver     Interface         Admin/Oper Address/Mask       Node    Port     Home
----------- ----------------- ---------- ------------------ ------- -------- -----
Cluster
            na2-01_clus1      up/up      169.254.247.99/16  na2-01  e0c      true
            na2-01_clus2      up/up      169.254.233.167/16 na2-01  e0d      true
            na2-02_clus1      up/up      169.254.207.68/16  na2-02  e0c      true
            na2-02_clus2      up/up      169.254.190.207/16 na2-02  e0d      true
na2
            cluster_mgmt      up/up      10.254.0.210/24    na2-01  e0M      true
            na2-01_mgmt1      up/up      10.254.0.211/24    na2-01  e0M      true
            na2-02_mgmt1      up/up      10.254.0.212/24    na2-02  e0M      true
na2-svm1
            lif_na2-01-cifs   up/up      10.131.20.211/24   na2-02  a0a-202  false
            lif_na2-01-iscsia up/up      10.131.3.211/24    na2-01  e2b      true
            lif_na2-01-iscsib up/up      10.131.4.211/24    na2-01  e2d      true
            lif_na2-01-nfs    up/up      10.131.5.211/24    na2-02  a0a-1315 false
            lif_na2-02-cifs   up/up      10.131.20.212/24   na2-02  a0a-202  true
            lif_na2-02-iscsia up/up      10.131.3.212/24    na2-02  e2b      true
            lif_na2-02-iscsib up/up      10.131.4.212/24    na2-02  e2d      true
            lif_na2-02-nfs    up/up      10.131.5.212/24    na2-02  a0a-1315 true
            lif_na2-svm1_mgmt up/up      10.131.1.211/24    na2-02  a0a-1311 false
16 entries were displayed.
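
To make the failing state easier to see, here is a minimal Python sketch that parses the na2-svm1 rows from the output above and lists which LIFs are away from home and which LIFs are currently on na2-01. The helper names (`parse_lifs`, `not_home`, `on_node`) are hypothetical and only for illustration; the rows themselves are copied from the thread.

```python
# Rows for SVM na2-svm1, copied from the `network interface show` output above.
ROWS = """\
lif_na2-01-cifs up/up 10.131.20.211/24 na2-02 a0a-202 false
lif_na2-01-iscsia up/up 10.131.3.211/24 na2-01 e2b true
lif_na2-01-iscsib up/up 10.131.4.211/24 na2-01 e2d true
lif_na2-01-nfs up/up 10.131.5.211/24 na2-02 a0a-1315 false
lif_na2-02-cifs up/up 10.131.20.212/24 na2-02 a0a-202 true
lif_na2-02-iscsia up/up 10.131.3.212/24 na2-02 e2b true
lif_na2-02-iscsib up/up 10.131.4.212/24 na2-02 e2d true
lif_na2-02-nfs up/up 10.131.5.212/24 na2-02 a0a-1315 true
lif_na2-svm1_mgmt up/up 10.131.1.211/24 na2-02 a0a-1311 false
"""


def parse_lifs(text):
    """Turn whitespace-separated LIF rows into a list of dicts."""
    lifs = []
    for line in text.splitlines():
        parts = line.split()
        # Keep only lines shaped like a LIF row: six fields, an
        # admin/oper status, and a true/false "Is Home" value.
        if len(parts) != 6 or "/" not in parts[1]:
            continue
        if parts[5] not in ("true", "false"):
            continue
        name, status, address, node, port, is_home = parts
        lifs.append({
            "name": name,
            "status": status,
            "address": address,
            "node": node,
            "port": port,
            "is_home": is_home == "true",
        })
    return lifs


def not_home(lifs):
    """Names of LIFs that are currently away from their home port."""
    return [lif["name"] for lif in lifs if not lif["is_home"]]


def on_node(lifs, node):
    """Names of LIFs currently hosted on the given node."""
    return [lif["name"] for lif in lifs if lif["node"] == node]


if __name__ == "__main__":
    lifs = parse_lifs(ROWS)
    print("Not home:", not_home(lifs))
    print("On na2-01:", on_node(lifs, "na2-01"))
```

In the state shown above, the only na2-svm1 LIFs left on na2-01 are the two iSCSI LIFs, which is exactly the condition under which they become unreachable.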


SpindleNinja

Are all ports healthy?

Are all switch-side ports configured the same (e.g., tagging, trunks, native VLAN)?

TMACMD

I would put e2a/e2b/e2c/e2d into a port channel (multimode_lacp) and tag all your VLANs, including iSCSI-A and iSCSI-B.

Configure the switch to use an ACTIVE port-channel -> this indicates LACP (do NOT use mode on!).

Also, if possible, on the port channel at the switch, be sure to set spanning-tree to port type edge trunk (if Cisco).

Your data LIFs should work as long as the switch is configured.

It might be helpful to post the output of the following:

system node run -node * options cdpd.enable on
system node run -node * options lldp.enable on
(wait 3 full minutes)
set diag
network device-discovery show -port e2* -sort-by protocol
net int show -vserver na2-svm1 -fields failover-policy,broadcast-domain

EricGK

LLDP is probably not enabled on the switch, and I am not sure CDP is on for the access ports (iSCSI). I will get the network team to look at that in the morning. All IPs are pingable when every LIF is on its home node (steady state). The problem appears only when the port-channel LIFs are not at home. As long as I move one back to node 1, the iSCSI IPs become reachable. It does not seem logical that one would affect the other. The network is Cisco ACI. All ports look healthy in the unreachable state. Best practice for iSCSI is not to use a port-channel.

 

na2::*> network device-discovery show -port e2* -sort-by protocol
Node/       Local  Discovered
Protocol    Port   Device (LLDP: ChassisID)  Interface        Platform
----------- ------ ------------------------- ---------------- ----------------
na2-02/cdp
            e2a    PCT-MAIN-LEAF101(FDO210314Q5)
                                             Ethernet1/13     N9K-C93180YC-EX
            e2c    PCT-MAIN-LEAF102(FDO21021SWF)
                                             Ethernet1/13     N9K-C93180YC-EX
na2-01/cdp
            e2a    PCT-MAIN-LEAF101(FDO210314Q5)
                                             Ethernet1/11     N9K-C93180YC-EX
            e2c    PCT-MAIN-LEAF102(FDO21021SWF)
                                             Ethernet1/11     N9K-C93180YC-EX

 

na2::*> network interface show -vserver na2-svm1 -fields failover-policy,broadcast-domain
vserver  lif               failover-policy broadcast-domain
-------- ----------------- --------------- ----------------
na2-svm1 lif_na2-01-cifs   system-defined  BD-202
na2-svm1 lif_na2-01-iscsia disabled        BD-1313
na2-svm1 lif_na2-01-iscsib disabled        BD-1314
na2-svm1 lif_na2-01-nfs    system-defined  BD-1315
na2-svm1 lif_na2-02-cifs   system-defined  BD-202
na2-svm1 lif_na2-02-iscsia disabled        BD-1313
na2-svm1 lif_na2-02-iscsib disabled        BD-1314
na2-svm1 lif_na2-02-nfs    system-defined  BD-1315
na2-svm1 lif_na2-svm1_mgmt system-defined  BD-1311
9 entries were displayed.

 

na2::*> network port show

Node: na2-01
                                                                       Ignore
                                                  Speed(Mbps) Health   Health
Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status   Status
--------- ------------ ---------------- ---- ---- ----------- -------- ------
a0a       Default      BD-Default       up   1500 -/-         healthy  false
a0a-1311  Default      BD-1311          up   1500 -/-         healthy  false
a0a-1315  Default      BD-1315          up   1500 -/-         healthy  false
a0a-202   Default      BD-202           up   1500 -/-         healthy  false
e0M       Default      BD-254           up   1500 auto/1000   healthy  false
e0a       Default      -                down 1500 auto/-      -        false
e0b       Default      -                down 1500 auto/-      -        false
e0c       Cluster      Cluster          up   9000 auto/10000  healthy  false
e0d       Cluster      Cluster          up   9000 auto/10000  healthy  false
e2a       Default      -                up   1500 auto/10000  healthy  false
e2b       Default      BD-1313          up   1500 auto/10000  healthy  false
e2c       Default      -                up   1500 auto/10000  healthy  false
e2d       Default      BD-1314          up   1500 auto/10000  healthy  false

Node: na2-02
                                                                       Ignore
                                                  Speed(Mbps) Health   Health
Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status   Status
--------- ------------ ---------------- ---- ---- ----------- -------- ------
a0a       Default      BD-Default       up   1500 -/-         healthy  false
a0a-1311  Default      BD-1311          up   1500 -/-         healthy  false
a0a-1315  Default      BD-1315          up   1500 -/-         healthy  false
a0a-202   Default      BD-202           up   1500 -/-         healthy  false
e0M       Default      BD-254           up   1500 auto/1000   healthy  false
e0a       Default      -                down 1500 auto/-      -        false
e0b       Default      -                down 1500 auto/-      -        false
e0c       Cluster      Cluster          up   9000 auto/10000  healthy  false
e0d       Cluster      Cluster          up   9000 auto/10000  healthy  false
e2a       Default      -                up   1500 auto/10000  healthy  false
e2b       Default      BD-1313          up   1500 auto/10000  healthy  false
e2c       Default      -                up   1500 auto/10000  healthy  false
e2d       Default      BD-1314          up   1500 auto/10000  healthy  false
26 entries were displayed.

na2::*>

TMACMD

Using iSCSI on a port-channel is not against best practices on NetApp storage.

It is well documented (by Microsoft) that using iSCSI from a Windows platform over a port-channel is not supported.

Go look at any NetApp FlexPod CVD: just about every one of them puts iSCSI on a tagged VLAN on a port-channel.

What does your route table look like?

"route show"

EricGK

More info now. 

 

NetApp did not have a route for the iSCSI networks, so it sent that traffic out one of the existing default routes, which was over the port channel. The Cisco ACI fabric was willing to route any subnet no matter which VLAN it came in on. This looks like an issue with both ACI and NetApp.

 

Original route table:

na2::*> route show-lifs

Vserver: na2
Destination            Gateway                Logical Interfaces
---------------------- ---------------------- ------------------------------
0.0.0.0/0              10.131.1.1             -
0.0.0.0/0              10.254.0.1             cluster_mgmt, na2-01_mgmt1, na2-02_mgmt1

Vserver: na2-svm1
Destination            Gateway                Logical Interfaces
---------------------- ---------------------- ------------------------------
0.0.0.0/0              10.131.1.1             lif_na2-svm1_mgmt
0.0.0.0/0              10.131.5.1             lif_na2-01-nfs, lif_na2-02-nfs
0.0.0.0/0              10.131.20.1            lif_na2-01-cifs, lif_na2-02-cifs

 

Route table after adding routes for the iSCSI subnets:

na2::*> route show-lifs

Vserver: na2
Destination            Gateway                Logical Interfaces
---------------------- ---------------------- ------------------------------
0.0.0.0/0              10.131.1.1             -
0.0.0.0/0              10.254.0.1             cluster_mgmt, na2-01_mgmt1, na2-02_mgmt1

Vserver: na2-svm1
Destination            Gateway                Logical Interfaces
---------------------- ---------------------- ------------------------------
0.0.0.0/0              10.131.1.1             lif_na2-svm1_mgmt
0.0.0.0/0              10.131.3.1             lif_na2-01-iscsia, lif_na2-02-iscsia
0.0.0.0/0              10.131.4.1             lif_na2-01-iscsib, lif_na2-02-iscsib
0.0.0.0/0              10.131.5.1             lif_na2-01-nfs, lif_na2-02-nfs
0.0.0.0/0              10.131.20.1            lif_na2-01-cifs, lif_na2-02-cifs
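
The behavior described in this post can be illustrated with a deliberately simplified Python model. It is not ONTAP's actual route selection logic. It assumes (1) replies always follow the routing table rather than the interface the request arrived on, (2) a default route is usable on a node only while some LIF in the gateway's subnet is hosted on that node, and (3) the initiator sits on a routed subnet; the address 10.131.50.10 is hypothetical. The function name `egress` is likewise made up for this sketch.

```python
import ipaddress

# LIF placement from the thread: name -> (address/mask, current node).
LIFS = {
    "lif_na2-01-cifs": ("10.131.20.211/24", "na2-02"),
    "lif_na2-01-iscsia": ("10.131.3.211/24", "na2-01"),
    "lif_na2-01-iscsib": ("10.131.4.211/24", "na2-01"),
    "lif_na2-01-nfs": ("10.131.5.211/24", "na2-02"),
    "lif_na2-02-cifs": ("10.131.20.212/24", "na2-02"),
    "lif_na2-02-iscsia": ("10.131.3.212/24", "na2-02"),
    "lif_na2-02-iscsib": ("10.131.4.212/24", "na2-02"),
    "lif_na2-02-nfs": ("10.131.5.212/24", "na2-02"),
    "lif_na2-svm1_mgmt": ("10.131.1.211/24", "na2-02"),
}

# Default-route gateways for na2-svm1, before and after the fix.
ORIGINAL_GATEWAYS = ["10.131.1.1", "10.131.5.1", "10.131.20.1"]
FIXED_GATEWAYS = ORIGINAL_GATEWAYS + ["10.131.3.1", "10.131.4.1"]


def egress(node, dest, gateways, lifs=LIFS):
    """Return (how, lif_name) for traffic from `node` to `dest`, or None.

    Toy model only: prefer a directly connected subnet, otherwise use the
    first default route whose gateway shares a subnet with a LIF on `node`.
    """
    dest_ip = ipaddress.ip_address(dest)
    local = {
        name: ipaddress.ip_interface(addr)
        for name, (addr, current_node) in lifs.items()
        if current_node == node
    }
    for name, iface in local.items():
        if dest_ip in iface.network:
            return ("connected", name)
    for gw in gateways:
        gw_ip = ipaddress.ip_address(gw)
        for name, iface in local.items():
            if gw_ip in iface.network:
                return ("via " + gw, name)
    return None


if __name__ == "__main__":
    host = "10.131.50.10"  # hypothetical routed initiator

    # All port-channel LIFs on na2-02: no usable route on na2-01.
    print(egress("na2-01", host, ORIGINAL_GATEWAYS))

    # Migrate the NFS LIF home: replies now leave via the port channel.
    moved = dict(LIFS)
    moved["lif_na2-01-nfs"] = ("10.131.5.211/24", "na2-01")
    print(egress("na2-01", host, ORIGINAL_GATEWAYS, moved))

    # With iSCSI routes added, replies leave via an iSCSI LIF.
    print(egress("na2-01", host, FIXED_GATEWAYS))
```

Under these assumptions the model reproduces the three states reported in the thread: no path from na2-01 while the port-channel LIFs are away, a path out the port channel once one of them is migrated back, and a path out the iSCSI port after the extra routes are added.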

TMACMD

For what it is worth, an iSCSI best practice is to use a non-routed, isolated VLAN (no gateway, whenever possible).

If the host is Windows, make the access ports the native VLAN as needed (so no tag is needed).

On the NetApp side, you are already tagging iSCSI.

I am also (still) curious about the output of the "route show" command, which includes the metrics.

 

EricGK (accepted solution)

I agree that iSCSI should not be routed. However, if I don't put a route in NetApp, it finds a default route and sends the traffic out another interface, which is bad.

Here is an article I found that seems to explain it: "Network traffic not sent or sent out of an unexpected interface after upgrade to 9.2 due to elimination of IP Fastpath".

 

 

TMACMD

Are your iSCSI hosts using the same networks as your iSCSI targets?

lif_na2-01-iscsia -> 10.131.3.211/24. Are hosts on 10.131.3.0/24?
lif_na2-01-iscsib -> 10.131.4.211/24. Are hosts on 10.131.4.0/24?

If the hosts are on the same network/VLAN, then a route should not be needed at all. If it is not working, it sounds like a vPC-ish setup issue on ACI.
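
The same-subnet question can be checked mechanically. A minimal sketch using Python's standard `ipaddress` module follows; the LIF addresses come from the thread, while the host addresses and the helper name `needs_route` are hypothetical.

```python
import ipaddress

# iSCSI target LIFs on na2-01, as listed in the thread.
ISCSI_LIFS = {
    "lif_na2-01-iscsia": "10.131.3.211/24",
    "lif_na2-01-iscsib": "10.131.4.211/24",
}


def needs_route(host_ip, lif_cidr):
    """True if the host is outside the LIF's subnet, so traffic must be routed."""
    lif = ipaddress.ip_interface(lif_cidr)
    return ipaddress.ip_address(host_ip) not in lif.network


if __name__ == "__main__":
    # Hypothetical initiator addresses, for illustration only.
    for host in ("10.131.3.50", "10.131.50.10"):
        for name, cidr in ISCSI_LIFS.items():
            print(host, name, "needs route:", needs_route(host, cidr))
```

If every initiator returns False against the LIF on its own fabric, the iSCSI default routes should be unnecessary and the problem is more likely on the switch side.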

 

With regular NX-OS (which I know you are not using), there is a peer-gateway feature that is used to compensate for the loss of IP Fastpath. I am not sure whether this exists in ACI.
