Hi Eugene and thanks for your reply.
I am using the NetApp DSM and HUK, and ALUA is configured. There are 4 sessions to the SVM and HA is set up properly on the cluster. I have no concerns about that, as I have tested failover on both nodes and switch reboots with no issues. There is also an NFS SVM on this cluster, and failover is 100% from the ESX hosts as we are using LACP. I'm very certain the configuration on the Windows hosts is correct as well, if I can just figure out this one kink, which is not related to the storage at all.
The problem is that ping drops for 30 seconds to both IP targets of the SVM (one LACP team for each node in the 2-node cluster, dispersed between 2 Nexus 3548 switches using vPC), so I wouldn't think it's an iSCSI issue. If the network connection isn't communicating, no multipath configuration in the world is going to work. The problem is that Windows waits 30 seconds, I assume to make sure the network is really down, before flipping to another NIC on the system. So for those 30 seconds iSCSI has no path to the storage. It seems like Windows sees 192.168.110.180 and 181 as reachable through NIC1. Then, when the port drops, Windows waits 30 seconds to make sure the network is down before giving up and finding another path.
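To put a number on the drop rather than eyeballing a ping window, here is a minimal Python sketch that finds the longest gap in replies from a list of timestamped probe results. The sample data is made up for illustration (one probe per second, replies lost for 30 seconds). In practice the list would come from whatever ping logging you have on the host.

```python
from datetime import datetime, timedelta

def longest_outage(samples):
    """Return the longest outage as a timedelta.

    samples: list of (timestamp, ok) tuples sorted by time.
    An outage runs from the first failed probe to the next successful one.
    An outage still in progress at the end of the list is not counted.
    """
    longest = timedelta(0)
    outage_start = None
    for ts, ok in samples:
        if not ok and outage_start is None:
            outage_start = ts          # first lost reply of a new outage
        elif ok and outage_start is not None:
            longest = max(longest, ts - outage_start)
            outage_start = None        # connectivity is back
    return longest

# Hypothetical data: one probe per second, replies lost from t=5s up to t=35s.
start = datetime(2015, 1, 1, 12, 0, 0)
samples = [(start + timedelta(seconds=i), not (5 <= i < 35)) for i in range(60)]

print(longest_outage(samples).total_seconds())  # 30.0
```

Running the same measurement against each target IP while pulling one cable at a time would show whether the 30 seconds is consistent per NIC or per target.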
TR-3441 Windows Multipathing Options with Data ONTAP: Fibre Channel and iSCSI
Since Server 2012 R2, when using LBFO and not 3rd party teaming software, LACP is supported for iSCSI connectivity (Sec. 5). I have tested this in pre-prod and it works 100%.
The above document also states that when using iSCSI MPIO, NetApp recommends using 2 separate subnets (Sec. 7.1).
TR-4080, which you pointed me to, does not specifically talk about a Windows host with multiple NICs; it only talks about a NetApp iSCSI target with multiple IPs on the same subnet. The screenshots don't show any source info.
TR-3441 is the only document which discusses the issue I am facing, indicating it's not a recommended configuration.
Following the logic in this MS KB 175767, having 2 adapters on the same subnet in any situation will not load balance and may cause issues.
http://support.microsoft.com/en-us/kb/175767
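As a quick way to spot the situation the KB describes, here is a small Python sketch that groups adapters by the subnet their address falls in and reports any subnet with more than one adapter. The adapter names, host addresses and /24 masks are assumptions for illustration only, not my actual host configuration.

```python
import ipaddress
from collections import defaultdict

def adapters_sharing_subnet(adapters):
    """Map each subnet to the adapters on it, keeping only subnets
    that have more than one adapter.

    adapters: dict of adapter name -> address in CIDR form.
    """
    by_network = defaultdict(list)
    for name, cidr in adapters.items():
        # ip_interface keeps the host address; .network gives its subnet
        network = ipaddress.ip_interface(cidr).network
        by_network[network].append(name)
    return {str(net): names for net, names in by_network.items() if len(names) > 1}

# Hypothetical host: two iSCSI NICs on one subnet, a third NIC on another.
adapters = {
    "NIC1": "192.168.110.10/24",
    "NIC2": "192.168.110.11/24",
    "NIC3": "192.168.120.10/24",
}

print(adapters_sharing_subnet(adapters))
# {'192.168.110.0/24': ['NIC1', 'NIC2']}
```

An empty result would correspond to the two-subnet layout that TR-3441 recommends for iSCSI MPIO.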
The SAN configuration guide gives no indication of the number of subnets to use, just a general design of how the network should be cabled. Mine is the fully redundant model on page 10. My interpretation of "multiple IP networks" is just that: 2 separate networks, which would require 2 separate subnets to work properly.
I can't find a version of the Clustered Data ONTAP SAN Express Setup Guide that pertains to the 8020 or 8040, just the 32xx series. I wouldn't expect it to be any different, since this document doesn't discuss the Windows side in any way other than installing the DSM.
Clustered Data ONTAP iSCSI Configuration for Windows Express Guide: I could find no real value in this document. It covers very basic iSCSI configuration and also makes no mention of the number of subnets to use when you have a fully redundant model.
Overall, very general documentation is all that I can find.
If this configuration should work, what's missing if the DSM and HUK are installed? To me this just seems like an unresolvable issue, due to how basic Windows networking works.
Thank you