2015-12-06 02:53 AM
2015-12-06 09:03 AM
In 7-mode, FCP traffic to the HA pair can route one of two ways. The data request can come in on an FC port directly to the node which owns the aggregate and volume, or it can come in on an FC port on the partner node. A LUN can be presented via both nodes to switches and client servers and zoned that way so there are data paths through both nodes. This is required to ensure data availability to a LUN in case the owning node fails. If there are no paths through the partner node, access is lost when the owning node fails.
When a request comes in via the partner node, the partner node hands it to the owning node over the node interconnect bridge connection (vtic). The owning node processes the request, then hands any needed data/response back to the partner node through the bridge. The partner node then responds to the original request.
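To make the two paths concrete, here is a toy sketch in Python of the hops described above. It is only an illustration of the flow, not how Data ONTAP implements it; the names `Lun`, `route_request`, `nodeA`, and `nodeB` are made up for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Lun:
    path: str   # hypothetical LUN path, e.g. /vol/vol1/lun0
    owner: str  # node that owns the aggregate and volume holding the LUN


def route_request(lun: Lun, receiving_node: str) -> List[str]:
    """Return the hops a request takes, depending on which node received it."""
    if receiving_node == lun.owner:
        # Direct path: the owning node received the request on its own FC port.
        return [f"{receiving_node}: serve request locally"]
    # Partner path: the request crosses the interconnect twice.
    return [
        f"{receiving_node}: forward request to {lun.owner} over interconnect (vtic)",
        f"{lun.owner}: process request",
        f"{lun.owner}: return data/response to {receiving_node} over interconnect (vtic)",
        f"{receiving_node}: respond to host",
    ]


lun = Lun(path="/vol/vol1/lun0", owner="nodeA")
for hop in route_request(lun, "nodeB"):
    print(hop)
```

The partner path takes four steps where the direct path takes one, which is where the extra latency comes from.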
The message you are getting indicates that a client server is accessing a LUN by sending the request to the partner node and the interconnect bridge is being used to fulfill the request.
There are two issues associated with the message. First, the interconnect bridge is designed for high speed transfer of data between the NVRAM of the two nodes. While it can also carry this type of end user data access, it is not optimized for doing so. Each request that takes the partner path picks up extra latency, and if a large share of the I/O goes that way, the write performance of the HA pair can be affected.
Second, the message tells you there is something wrong in your environment. Even the most basic configuration should have a minimum of two SAN paths to storage, one to each node, so that redundancy extends all the way to the client server. With one working path per node, there should be no reason to use the node interconnect bridge. Since Data ONTAP is detecting that a host is using it, one of the following conditions has occurred:
1. Physical failure in the path - FC port on the client server, SAN switch, or NetApp node, or the cabling.
2. Zoning failure - Not all WWNs are zoned correctly in the switch, so a physical path to the LUN isn't seen logically.
3. Igroup failure - A WWN from the client server isn't listed properly in the igroup to which the LUN is mapped.
4. Portset failure - If using portsets, these limit which igroups see which LUNs on which ports. A portset might keep a LUN from being seen down the expected physical path.
5. Client MPIO failure - Ensure the client has the proper multi-pathing software installed and configured. Proper use of NetApp host connection software is advised.
Any of those items could cause the host to use the partner path and thus trigger the message.
A place to start might be to display the igroup of the affected LUN (or all igroups) and see where WWNs are listed as "not logged in". That may help you determine whether you have a single host-based issue or a larger design issue in your configuration.
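If you have many igroups, you could save the output of `igroup show` to a text file and scan it with a small script instead of reading it by eye. The sketch below assumes the output has one header line per igroup and one `Member:` line per WWN with the login state in parentheses. The sample text, the igroup names, and the WWNs are made up for illustration, so check the patterns against what your system actually prints.

```python
import re
from collections import defaultdict

# Hypothetical sample output; the exact layout on your system may differ.
SAMPLE = """\
    host1_ig (FCP):
        OS Type: windows
        Member: 10:00:00:00:c9:2b:cc:39 (logged in on: 0c, 0d)
        Member: 10:00:00:00:c9:2b:cc:3a (not logged in)
    host2_ig (FCP):
        OS Type: linux
        Member: 21:00:00:e0:8b:05:05:04 (logged in on: 0c, 0d)
"""

# Assumed line shapes: "<igroup name> (FCP):" and "Member: <WWN> (<state>)".
IGROUP_RE = re.compile(r"^\s*(\S+)\s+\((FCP|iSCSI)\):\s*$")
MEMBER_RE = re.compile(r"^\s*Member:\s+(\S+)\s+\((.*)\)\s*$")


def find_not_logged_in(text):
    """Map each igroup name to the list of its WWNs shown as not logged in."""
    result = defaultdict(list)
    current = None
    for line in text.splitlines():
        header = IGROUP_RE.match(line)
        if header:
            current = header.group(1)
            continue
        member = MEMBER_RE.match(line)
        if member and current is not None:
            state = member.group(2).strip().lower()
            if state.startswith("not logged in"):
                result[current].append(member.group(1))
    return dict(result)


for igroup, wwns in find_not_logged_in(SAMPLE).items():
    print(igroup, wwns)
```

One igroup with a missing login points toward a single host, its HBA, or its zoning. The same gap across many igroups points toward a switch, fabric, or design issue.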
I hope this helps you.
Lead Storage Engineer
Huron Legal | Huron Consulting Group
NCDA, NCIE - SAN Clustered, Data Protection
Kudos and accepted solutions are always appreciated.
2015-12-06 07:07 PM
Refer to KB https://kb.netapp.com/support/index?page=content&id=3010111 for more information.