FlexPod Discussions
Hi,
we're currently building a 2-node stretched MetroCluster based on AFF A300 controllers.
The customer needs an FC front end, so Cisco MDS 9148S switches were brought in.
We are having serious trouble with this setup: after a MetroCluster switchover or switchback, the FC LIFs don't come up. They get stuck with the status:
FDISC error - ID could not be acquired for this virtual port.
A manual interface reset on the MDS, or disabling and re-enabling the affected LIFs, brings the LIFs up, but that's not an option.
I've raised a ticket with Cisco about this, and it looks like we've hit a couple of bugs. For example, this one: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCve72490
I know there are many MetroClusters with an MDS SAN front end out there. Could you please share the versions you are running (NX-OS and ONTAP)?
We're currently running ONTAP 9.1P2 and NX-OS 8.1.1 (recommended by Cisco).
Thanks!
Regards from Austria.
Falk
Hi,
Please refer to this KB article: https://kb.netapp.com/app/answers/answer_view/a_id/1005173
hi,
Thanks for the link.
Unfortunately, I don't have permission to access the article.
I've found a solution myself and written a blog article about it (it's in German, sorry): https://www.heiterbiswolkig.eu/?p=540
Regards
Falk
Hi,
It is a public article.
There was an issue with the earlier link. Try https://kb.netapp.com/app/answers/answer_view/a_id/1005173
hi,
I've had a look at the article now. Thanks for sharing.
The problem we had in this particular case isn't described in the article you linked.
The cause was a new default setting in version 8 of the MDS firmware: the FLOGI quiesce timeout.
In a 2-node MetroCluster setup you have to set it to zero (flogi quiesce timeout 0). If not, the LIFs will not become operational.
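As a rough illustration of how one might verify this on a saved copy of the switch configuration, here is a minimal Python sketch. The helper names are hypothetical and not part of any Cisco or NetApp tool, and it assumes the setting appears in the configuration text as a line of the form "flogi quiesce timeout <ms>". If the line is absent, the effective value depends on the release default, so the sketch reports "unknown" rather than guessing.

```python
import re

# Hypothetical helpers for illustration only.
# Assumption: the setting shows up in the config text as
# "flogi quiesce timeout <ms>" on a line of its own.
_PATTERN = re.compile(r"^\s*flogi quiesce timeout\s+(\d+)\s*$", re.MULTILINE)


def flogi_quiesce_timeout(config_text):
    """Return the configured timeout in ms, or None if no such line exists."""
    match = _PATTERN.search(config_text)
    return int(match.group(1)) if match else None


def check_quiesce(config_text):
    """Classify the config for a 2-node MetroCluster front end."""
    timeout = flogi_quiesce_timeout(config_text)
    if timeout is None:
        # Not stated explicitly; the release default applies.
        return "unknown"
    # Zero disables the feature, which is what this setup needs.
    return "ok" if timeout == 0 else "change"


if __name__ == "__main__":
    sample = "interface fc1/1\nflogi quiesce timeout 2000\n"
    print(check_quiesce(sample))
```

A result of "change" would mean the timeout still has to be set to zero on that switch.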
Regards
Falk
Hello Falk,
The article is indeed very well summarized.
The FLOGI quiesce timeout defaults to 2000 ms in Cisco MDS NX-OS 8.1 and 8.2.
In 8.3(1) (possibly all of 8.3) the default is 0, which disables the feature.
However, the upgrade/downgrade process follows the rule that once a value is set, it is not changed.
Hence, if the MDS ever had 8.1 or 8.2 installed, it will keep the 2000 ms value.
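The "first value wins" behaviour can be sketched in a few lines of Python. This is only a simplified model of the rule described here, not Cisco's actual upgrade logic: the table holds just the defaults mentioned in this thread, the function name is hypothetical, and releases not in the table are assumed not to set a value at all.

```python
# Defaults in ms as stated in this thread; other releases are not covered.
DEFAULTS_MS = {"8.1": 2000, "8.2": 2000, "8.3(1)": 0}


def effective_timeout(install_history):
    """Return the timeout (ms) a switch ends up with, or None if unknown.

    install_history lists release strings in the order they were installed.
    Assumption: the first release with a known default sets the value, and
    later upgrades or downgrades leave it untouched.
    """
    for release in install_history:
        for prefix, default in DEFAULTS_MS.items():
            if release.startswith(prefix):
                return default
    return None


if __name__ == "__main__":
    # Switch that started on 8.1.1 and was later upgraded to 8.3(1).
    print(effective_timeout(["8.1.1", "8.3(1)"]))
    # Switch installed fresh with 8.3(1).
    print(effective_timeout(["8.3(1)"]))
```

Under these assumptions, a switch upgraded from 8.1.1 still carries the 2000 ms value, while a fresh 8.3(1) install has the feature disabled.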
In the MDS guide under Managing FLOGI it is noted:
"This feature must be disabled by setting the timeout value to zero when there are devices in the fabric that can share a pWWN at different times by logging into different switches within the fabric during failover situations."
To summarize why this value caused issues:
- We preserve the path to the LIFs during switchover/switchback, hence we use pWWN sharing.
- "The FLOGI Quiesce Timeout feature causes the FLOGI process to delay notifications to the other Fibre Channel services such as Routing Information and Fibre Channel Name Server when a device logs out from a fabric or when an interface goes down."
If you have further questions, let me know.
Kind regards,
Patrick von Bredow