ONTAP Discussions

New Cluster Nodes Not Showing in NetApp DSM for MPIO

TMADOCTHOMAS
15,900 Views

I've just added two new nodes to our production cluster. I added two new iSCSI LIFs on the new nodes and added those LIFs to the port group. Using SnapDrive, I added connections to the two new iSCSI LIFs on the first of many Windows hosts. Afterward, I checked the Data ONTAP DSM for MPIO software to be sure it was showing four paths per LUN. It is not; it still shows just two paths. Any ideas as to what I am missing?

1 ACCEPTED SOLUTION

bobshouseofcards
15,843 Views

Assuming you are running cDoT 8.3 or higher, you are likely encountering the new "Selective LUN Mapping" feature. This feature was added in cDoT 8.3 to address the multiplication of paths across large clusters, which can exceed server-based total and per-LUN path limits.

 

From the SAN Administration Guide...

 

Selective LUN Map

Selective LUN Map (SLM) reduces the number of paths from the host to the LUN. With SLM, when a new LUN map is created, the LUN is accessible only through paths on the node owning the LUN and its HA partner.

SLM enables management of a single igroup per host and also supports nondisruptive LUN move operations that do not require portset manipulation or LUN remapping.

Portsets can be used with SLM just as in previous versions of Data ONTAP to further restrict access of certain targets to certain initiators. When using SLM with portsets, LUNs will be accessible on the set of LIFs in the portset on the node that owns the LUN and on that node's HA partner.

Beginning with Data ONTAP 8.3 SLM is enabled by default on all new LUN maps. For LUNs created prior to Data ONTAP 8.3, you can manually apply SLM by using the lun mapping remove-reporting-nodes command to remove the LUN reporting nodes and restrict LUN access to the LUN owning node and its HA partner.

 

 

 

I ran into the same situation myself when adding nodes 5 and 6 to a cluster, moving some LUNs there, and then not seeing direct paths. Mine was FCP, but since this applies at the LUN-mapping level, it is just as applicable to iSCSI.

 

The commands of interest to you are:

 

lun mapping show -field reporting-nodes -vserver <SVM> -path <lun-path>

lun mapping add-reporting-nodes ...

lun mapping remove-reporting-nodes ...
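For a quick check on a live system, the show command with concrete values - here a hypothetical SVM "svm1" and LUN path "/vol/vol_sql01/sql01.lun"; substitute your own names - would look like:

lun mapping show -fields reporting-nodes -vserver svm1 -path /vol/vol_sql01/sql01.lun

If the output lists only the owning node and its HA partner, that matches the two paths you are seeing in the DSM.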

 

SLM provides for very specific path control and in conjunction with portsets can limit path proliferation especially as the number of potential ports increases and the number of allowed hosts in a SAN cluster increases with refreshed hardware and ONTAP 9.1.

 

 

Hope this helps.

 

Bob Greenwald

Senior Systems Engineer | cStor

NCIE SAN ONTAP, Data Protection

 

 

Kudos and accepted solutions are always appreciated.

 


27 REPLIES


TMADOCTHOMAS
13,630 Views

Very helpful! Thank you very much Bob.

TMADOCTHOMAS
13,631 Views

Bob, one more question: from what I can tell the -aggregate and -volume options are optional. I simply want each LUN in a given SVM to have access to two new nodes in the cluster, regardless of aggregate. Am I correct that I can leave these switches off? Also if I enter a common path name with a wildcard for the -lun option, will that work? For example:

 

lun mapping add-reporting-nodes -vserver <vserver> -path <Server_Name>_* -igroup <Igroup-Name>

bobshouseofcards
13,612 Views

 

Your command fragment is enough to identify the particular LUN/igroup combinations of interest. You'll need one of the four optional parameters to select which nodes to add to the list...

 

-local-nodes [ true ]   adds the nodes where the LUN currently exists to the reported path map.

 

-destination-aggregate <aggregate name>    adds the nodes where the listed aggregate lives.  Typically used before a volume would be moved to a new aggregate.  Could also be used with the name of an aggregate on your new nodes - nothing says the volume has to be moved there.

 

-destination-volume <volume name>   adds the nodes where the listed volume lives.  Typically used before a LUN is moved to a new volume (lun move start ...).  Could also be used with the name of a volume on your new nodes - nothing says the LUN has to be moved there.

 

-all   adds all the nodes in the cluster.  This option is available at "advanced" privilege level (set -priv adv).  Gets a bit scary in large clusters.
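As an illustration of the -destination-aggregate pattern, with hypothetical names (SVM "svm1", volume "vol_sql01", LUN "/vol/vol_sql01/sql01.lun", igroup "ig_sql01", and an aggregate "aggr_node3" on one of the new nodes - substitute your own):

lun mapping add-reporting-nodes -vserver svm1 -path /vol/vol_sql01/sql01.lun -igroup ig_sql01 -destination-aggregate aggr_node3

volume move start -vserver svm1 -volume vol_sql01 -destination-aggregate aggr_node3

The add-reporting-nodes step goes first so the new paths already exist when the data lands on the new node.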

 

 

One other thought: remember that for path management, the only paths you really need are the ones to the HA pair where the aggregate/volume/LUN lives. If one node fails, the aggregate flips over to the other node and you still have direct paths. If both nodes fail, the aggregate is offline. Having non-optimized paths through other nodes in the cluster is only needed if you suspect that all the iSCSI data paths to the owning HA pair could fail at the same time. But if your network/system design is such that you might lose all those paths at once, would it also take down the paths you just added for the new nodes? If so, the additional paths do not add protection but do add overhead at both ends (small, but it adds up at scale).

 

 

Hope this helps.

 

Bob Greenwald

Senior Systems Engineer | cStor

NCIE SAN ONTAP, Data Protection

 

 

Kudos and accepted solutions are always appreciated.

 

 

 

TMADOCTHOMAS
13,567 Views

Thank you very much Bob, and yes it definitely helps. The reason for this design is that we are planning to move all of our iSCSI volumes/LUNs to the two new nodes in the cluster (they are AFF nodes).

 

Based on your description of the four options, it sounds like my choices are "destination-aggregate" or "all". I don't really understand "local-nodes" - wouldn't the nodes that currently contain the LUNs already be "reporting nodes"? Mine are. Do you need to "add" nodes that are already applied when you are making this kind of change? My concern with the "all" option is the EXISTING reporting nodes - I don't want to adversely affect them somehow.

 

Here are a couple possible ways for me to do this based on my understanding. Does this look correct?

 

lun mapping add-reporting-nodes -vserver <vserver> -path <Server_Name>_* -igroup <Igroup-Name> -destination-aggregate <aggregate_on_node_3>

lun mapping add-reporting-nodes -vserver <vserver> -path <Server_Name>_* -igroup <Igroup-Name> -destination-aggregate <aggregate_on_node_4>

 

lun mapping add-reporting-nodes -vserver <vserver> -path <Server_Name>_* -igroup <Igroup-Name> -all

aborzenkov
13,557 Views

LUN moves are actually covered pretty well in the documentation (see the SAN Administration Guide, "Modifying SLM reporting nodes"). When you move a LUN to a new aggregate that is hosted on different nodes, you need to add those nodes to the set of reporting nodes. Using -destination-aggregate eliminates the need to know the nodes' names directly - you usually already know the destination aggregate/volume. As long as you have just 4 nodes in total, this is entirely equivalent to using "all". But it makes a difference if you have a large cluster.

 

After the move is complete you may remove the old nodes using lun mapping remove-reporting-nodes -remote-nodes true.

TMADOCTHOMAS
13,554 Views

Thanks aborzenkov. Do you or Bob know if there's a preferred / recommended method? I'm slightly concerned about using "all" since that includes the existing reporting nodes. Do either of you know if "all" only affects nodes that aren't currently applied?

TMADOCTHOMAS
13,087 Views

In particular, the description of "all" in the man pages is:

 

------------------------------------------------

Set the LUN mapping to report on all nodes in preparation for a revert to a previous version of Data ONTAP.

------------------------------------------------

 

That doesn't seem right. I prefer "all" because it's simple, if indeed it is safe to use.

 

If I use "destination-aggregate", does it truly matter which aggregate I choose as long as it's on the applicable node? In other words, could I choose an aggregate on node 3 for all LUNs but actually move some to node 3 and some to node 4?

 

TMADOCTHOMAS
13,091 Views

One last question re: syntax:

 

For the LUN path, do I need to do this (<servername>* are the volumes, <servername>*.lun are the LUNs):

 

-path /vol/<servername>*/<servername>*.lun

 

or will this work:

 

-path /vol/<servername>*

bobshouseofcards
12,315 Views

Using "all" for a 4-node cluster is not likely to be an issue. Using "all" on an 8+ node SAN cluster, where each SVM might have multiple target FC/iSCSI addresses on each node, might just blow a server's pathing limitation out of the water.

 

For that reason alone I suggest getting in the habit of not using "all". At some point you may have a few more nodes and it can begin to matter - better to develop good habits now. When doing a volume/LUN move, add the reporting nodes of the destination aggregate/volume before the volume move. That particular operational pattern is how SLM is conceived.

 

On the "local-nodes" option: you need it for the remove step as part of the move sequence. Consider an order of operations - add new, move vol, remove old. How do you identify the old? It can't be by aggregate or volume location, since the data isn't at the old location anymore. So the sequence, if you will remove reporting nodes, is "add new, remove local, move vol/lun". Obviously you'd want to verify that the new paths are working before removing the local nodes, or data access might be impacted.

 

Like you, I'm not sure how local nodes would not be reporting nodes, but I can imagine that, as a security mechanism or perhaps a testing mechanism, a LUN could be created and mapped but "turned off" by removing all reporting nodes. Then you might want to add back only the local nodes to turn it on again.

 

Similarly, when converting from 8.2 to 8.3 one might take an outage window to address path proliferation issues by removing all reporting nodes from a LUN and then adding the local nodes back in.

 

 

 

Hope this helps.

 

Bob Greenwald

Senior Systems Engineer | cStor

NCIE SAN ONTAP, Data Protection

 

 

Kudos and accepted solutions are always appreciated.

 

 

aborzenkov
12,290 Views

@bobshouseofcards wrote:

How do you identify the old?  Can't be by aggregate or volume location since the data isn't at the old location anymore.  So the sequence, if you will remove reporting nodes, is "add new, remove local, move vol/lun". 


Actually easier is - move LUN, add new local paths using "lun mapping add-reporting-nodes -local-nodes true", remove old (now remote) paths using "lun mapping remove-reporting-nodes -remote-nodes true". This sequence does not require you to identify source or destination at all and can be applied to any LUN anytime.
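Spelled out with hypothetical names (SVM "svm1", volume "vol_sql01", LUN "/vol/vol_sql01/sql01.lun", igroup "ig_sql01", destination aggregate "aggr_node4" - yours will differ), the whole sequence is:

volume move start -vserver svm1 -volume vol_sql01 -destination-aggregate aggr_node4

lun mapping add-reporting-nodes -vserver svm1 -path /vol/vol_sql01/sql01.lun -igroup ig_sql01 -local-nodes true

lun mapping remove-reporting-nodes -vserver svm1 -path /vol/vol_sql01/sql01.lun -igroup ig_sql01 -remote-nodes true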

TMADOCTHOMAS
12,275 Views

Hey aborzenkov,

 

Regarding this:

------------------------

Actually easier is - move LUN, add new local paths using "lun mapping add-reporting-nodes -local-nodes true", remove old (now remote) paths using "lun mapping remove-reporting-nodes -remote-nodes true". This sequence does not require you to identify source or destination at all and can be applied to any LUN anytime.

------------------------

 

From what you are saying, you can actually move the LUN to the new nodes without having added the reporting nodes yet. Is that true? What would be the consequences of doing that? I was assuming the connection to the LUN would fail without the reporting nodes being added.

 

Also: I don't see a "remote-nodes" option. I am on 8.3.2 so I don't know if that's a new 9.0 option or not, but it's not in my version.

aborzenkov
12,272 Views

@TMADOCTHOMAS wrote:

What would be the consequences of doing that?


Data would flow across cluster interconnect, so likely increased latency.


@TMADOCTHOMAS wrote:

Also: I don't see a "remote-nodes" option. I am on 8.3.2 so I don't know if that's a new 9.0 option.


It is listed in the 8.3.2 documentation. I do not have a live system to check. Try it in advanced mode.

TMADOCTHOMAS
12,268 Views

I see it now. I was looking at the "add-reporting-nodes" command options, not "remove-reporting-nodes".

 

 

TMADOCTHOMAS
10,810 Views

-------------------

Data would flow across cluster interconnect, so likely increased latency.

-------------------

 

That's why I want to add the reporting nodes in advance.

bobshouseofcards
10,790 Views

You'd be surprised how little latency using the Cluster Interconnect adds for a single workload during the transfer. The transfer process itself probably adds more, simply because there are multiple data consumers accessing the data.

 

Unless you have a significant workload on the Cluster Interconnect or very tight performance requirements, changing the reporting nodes before or after isn't going to make a huge difference. Remember - the node that "owns" the LUN doesn't change during the transfer; it changes at the end of the transfer. That makes sense, of course, since if you cancel the move the source still has to be fully up to date.

 

A vol move is for all intents a SnapMirror move - update cycles until really close, then "offline", final update, break, online, tear down the mirror, delete the original volume, rename the new volume - all wrapped into a convenient job. The key element is the "update cycle until really close". ONTAP volume moves allow a cutover time for the "offline, final update, break, online" sequence of 45 seconds as I recall (would have to check), during which standard configuration at the client server is expected to just wait on I/O and ride through the cutover. The update cycle works like multiple SnapMirror updates, getting the state of the move close enough to perfect to minimize the final update time during cutover. If the cutover exceeds the allowed time, the source is brought back to the normal online state and the vol move job goes back to the update-cycle phase.

 

So even if you add the paths ahead of time, you'd still be dealing with the increased load on the source during the move and the brief "offline" window when the cutover happens. The slight latency added by using the CI after the move completes, but before you update pathing, could be less than all of that.

 

 

Hope this helps.

 

Bob Greenwald

Senior Systems Engineer | cStor

NCIE SAN ONTAP, Data Protection

 

 

 

Kudos and accepted solutions are always appreciated.

 

TMADOCTHOMAS
10,791 Views

Bob,

 

Thanks again for your helpful comments. So to be sure I am following, is this an accurate summation?

 

I can move volumes from, say, node 1 to the new node 4, without changing reporting nodes.

After the move is complete, the LUNs stay connected via the LIF on node 1 and thus use an indirect path temporarily.

I then add nodes 3 and 4 as reporting nodes, and the Data ONTAP DSM for MPIO software changes the connection to the new preferred path on node 4.

 

Here is what threw me off:

 

1. The Data ONTAP DSM for MPIO software isn't showing nodes 3 and 4 as indirect paths; it's not showing them as paths at all. It only shows paths for nodes 1 and 2. So when the LUN lands on node 4, how will it switch connections to the path on node 4 when that path doesn't show up in the MPIO software? I assumed that adding reporting nodes was required for the new nodes to show up. (NOTE: I have successfully added the iSCSI connections to nodes 3 and 4 in SnapDrive's iSCSI Management pane.)

 

2. I don't think I have a clear understanding of what a "reporting node" is in this context. Again, I was under the assumption it was required to be in place before migrating a LUN, but that is apparently not the case.

bobshouseofcards
10,877 Views

Can help with clearing up the confusion on these questions.

 

A reporting node is a node through which a LUN is accessible over a LIF. When a SAN client is zoned (FCP) or has a defined session (iSCSI) to a storage device, the SAN client can query "what LUNs do you have?" The storage device consults its mapping tables as appropriate and reports back the LUN IDs that match the SAN client. In NetApp ONTAP, a "reporting node" is a node which will say (report) "I have this LUN for you." Nothing more.

 

Prior to 8.3, all nodes would report every LUN they knew about. The only limiting factor was portsets, which limited which LIFs could report which LUNs. This carried over from the old 7-mode style of reporting as an attempt to limit the number of paths a LUN might have, because SAN clients had per-client and per-LUN path limits that were rather low compared to today.

 

Now, between the number of ports and the number of nodes, portsets are often not the right granularity or not limiting enough. So the concept of which nodes will answer the LUN query from a SAN client for any given LUN is what "reporting nodes" is all about.

 

To address your points - three things are needed to show new paths:

1.  The SAN client needs a session to the new nodes.

2.  The nodes have to be marked as "reporting nodes" for a LUN to say that a SAN client can use the new sessions to see its LUNs

3.  The SAN client needs to initiate a query cycle to see what LUNs are on what paths (either via periodic refresh or manual disk discovery).

 

When the volume move occurs is not relevant to those three items, so long as the paths in use before the move remain available after it. With regard to pathing, an optimized path is one through the node that owns the LUN. Any other path, including through the HA partner of the owning node, is a non-optimized path because it has to use the cluster backplane to access the physical data.

 

So long as one path exists, you can access the data. Obviously you want to have more than one, but as long as one is good you can always configure more through iSCSI sessions, portsets, and reporting nodes as needed. The big deal about reporting nodes is that the feature is new. The default in 8.2 and below was that all nodes automatically reported, so just adding a new iSCSI session to a LIF on a new node was sufficient. Now there is an additional layer to consider.
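To verify all three items on the cluster side, with hypothetical names again (SVM "svm1", LUN "/vol/vol_sql01/sql01.lun" - substitute your own):

vserver iscsi session show -vserver svm1

lun mapping show -fields reporting-nodes -vserver svm1 -path /vol/vol_sql01/sql01.lun

Then trigger a disk rescan on the Windows host so the DSM picks up the new paths.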

 

If it's any consolation: well after an upgrade from 8.2 to 8.3 code levels on a SAN cluster I managed, we expanded the cluster, started modifying portsets for path control, and did a bunch of LUN movements. While I had read about SLM (Selective LUN Map - the official name of this reporting-node feature) as part of prep for the upgrade, I promptly forgot about it until months later, when I ran into the exact same situation you have. The default from 8.2 was that all nodes report - so as long as I had the same 4 nodes and the same portsets, nothing mattered during LUN moves. As soon as I added nodes and started redoing portsets, I encountered all the same things you are seeing, until I reviewed again what SLM actually is and does. I just had to adjust some workflows. And no doubt it will be on the test when I have to renew my SAN certification later this year.

 

 

 

Hope this helps.

 

Bob Greenwald

Senior Systems Engineer | cStor

NCIE SAN ONTAP, Data Protection

 

 

 

Kudos and accepted solutions are always appreciated.

 

 

TMADOCTHOMAS
10,869 Views

Bob,

 

This is immensely helpful. We started on 8.3 so I didn't even know about the transition from 8.2 to 8.3. I feel like I have a good handle on the purpose of reporting nodes now. We use portsets as well.

 

If I'm understanding correctly, using my earlier illustration: if I move a volume/LUN from node 1 to node 4 without adding node 4 as a reporting node, then Data ONTAP's DSM for MPIO will still connect to the LUN over the LIF on node 1, but it will now show the connection as a non-optimized path. In fact, both nodes 1 and 2 will show as non-optimized. Only when I add nodes 3 and 4 as reporting nodes will they show up in the MPIO software and show that node 4 is now the optimized path. Is that an accurate summary?

bobshouseofcards
10,864 Views

Yup.  That's exactly correct.

 

 

 

Bob Greenwald

Senior Systems Engineer | cStor

NCIE SAN ONTAP, Data Protection

 
