What kind of failover times should we expect when using a Solaris 11 host with iSCSI against a cluster? During a test we had two active online paths and filtered out all traffic to and from one ONTAP node at a time; writes then stalled for over a minute before the other online path was used. Is MPxIO expected to handle the failure on Solaris, or should the initiator fail over before that when we have more than one session?
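For anyone reproducing this, the path and session state during such a test can be inspected from the Solaris side with the standard MPxIO and iSCSI tools. A minimal sketch (the LUN device path below is a placeholder; substitute the one your system reports):

```shell
# List all multipathed logical units and their path counts (MPxIO / scsi_vhci)
mpathadm list lu

# Show detailed per-path state for one LUN; the device name here is a
# placeholder, use one reported by the command above
mpathadm show lu /dev/rdsk/c0t600A098000000000000000000000001Ad0s2

# Show iSCSI session and connection state per target
iscsiadm list target -S
```

Watching `mpathadm show lu` while traffic to one node is blocked should show when the path actually transitions from OK to failed, which narrows down where the minute-long stall is being spent.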
Also, any host-side software (Host Utilities Kit, DSM, etc.) usually just sets timeouts; it really depends on your specific OS and what it is running. I would get to a supported configuration first, and if you really want to dig in it's best to open a support case. iSCSI failover should not take that long, which suggests the host may not be configured right, rather than the storage.
Thanks for the replies. I tweaked most settings on the Solaris side: applied the Host Utilities settings, changed the initiator timeouts, etc., but there are not that many things that can be changed. I tested both with a ZFS pool and with a plain format inquiry, which also hung for a long time while MPxIO was figuring things out.
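For reference, these are the kind of initiator-side knobs available on Solaris 11 via `iscsiadm` tunable parameters. The values below are purely illustrative, not a recommendation; check the man page and your support matrix before changing them:

```shell
# Show current initiator-node settings, including tunable parameters
iscsiadm list initiator-node

# Shorten how long the initiator waits for a login response on a
# failed connection before retrying (seconds; illustrative value)
iscsiadm modify initiator-node -T recv-login-rsp-timeout=10

# Cap the total time spent retrying logins to an unreachable portal
iscsiadm modify initiator-node -T conn-login-max=30

# Delay between login retry attempts
iscsiadm modify initiator-node -T polling-login-delay=5
```

Together these bound how long a dead session is retried before the I/O is failed back up the stack, which is what MPxIO needs to see before it will switch paths.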
But it should be faster than this; how fast should it be? Should the initiator handle the failure first, with MPxIO then taking the path offline once it has determined it is non-functional?
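That is roughly the layering, as I understand it: the iSCSI initiator retries the connection first, the sd driver's command timeout and retries run on top of that, and only when the command finally fails does scsi_vhci (MPxIO) mark the path offline and retry on the other path, so the delays stack. The sd per-command timeout can be lowered in /etc/system; a sketch with an illustrative, unverified value:

```shell
# Append to /etc/system and reboot for it to take effect.
# sd_io_time is the per-command timeout (seconds) the sd driver waits
# before retrying; the effective stall is roughly this timeout times
# the retry count, stacked on the iSCSI login retry window.
# Illustrative value only; follow the Host Utilities recommendation.
echo 'set sd:sd_io_time=30' >> /etc/system
```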
We cannot comment on specifics, but I'd imagine it should be more robust than that.
Like I said before, a case might be better; at this point it would be good to have both vendors, Oracle (Solaris) and NetApp, engaged in a formal case. We are 95% sure this is all on the Solaris side, but it would be good to review the storage logs and rule out any storage issues.