
volume direct access and indirect access

SVM1 has LIF1 on node1 and LIF2 on node2. If the client comes in on LIF1 and the volume is also on node1, access is direct. However, if the client comes in on LIF2 to reach that volume, the traffic has to be forwarded from node2 to node1 over the cluster interconnect. This is indirect access, and indirect access will be slower.

 

I can think of using LIF1's IP instead of the DNS name when I know the volume is on node1, but that lacks flexibility: I would have to remember which volume is on which node, and whenever a volume is moved, the IP would have to change.

 

My question is: what is a better way to avoid indirect access?

 

Thank you for sharing!

 

ONTAP 8.3

Re: volume direct access and indirect access

What protocol do you use?

Re: volume direct access and indirect access

We use NFS mostly, but some CIFS as well. Thanks for the prompt reply.

Re: volume direct access and indirect access

For NFS 4.x it is possible to use referrals or pNFS. Both require client support; referrals redirect the open request to the optimal node, while pNFS can dynamically change the optimal access path.

 

CIFS supports DFS referrals as well; similar to NFS referrals, the redirection happens once, on the initial open.
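As a rough sketch, these features are enabled per SVM from the clustershell. The SVM name `svm1` is an example, and the exact option names should be verified against your ONTAP release:

```shell
# Enable NFSv4.0 referrals (NFSv4.0 must already be enabled on the SVM)
vserver nfs modify -vserver svm1 -v4.0-referrals enabled

# Enable pNFS for NFSv4.1 clients
vserver nfs modify -vserver svm1 -v4.1 enabled -v4.1-pnfs enabled

# Enable CIFS node referrals ("Auto Location")
vserver cifs options modify -vserver svm1 -is-referral-enabled true
```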

 

You can find more information in the NFS and CIFS Management Guides.

 

But note that while indirect access may be slower, that does not automatically mean your clients will actually notice it for your specific workload. So first try to estimate the impact of indirect access before optimizing for it.

Re: volume direct access and indirect access

What about an NFSv3 situation? We don't have a plan to upgrade to v4 anytime soon.

 

> your clients actually notice it for your specific workload

Do you mean the slowness caused by indirect access is small, and not as critical as optimizing other components?

 

Thank you!

Re: volume direct access and indirect access

I am not aware of a solution for NFSv3, sorry (apart from purely administrative methods of fixing the relationships between node/SVM/LIF).

Re: volume direct access and indirect access

Then, for instance, a VMware ESXi datastore will have to use LIF1's IP on node1 for the volume?

Re: volume direct access and indirect access

On the indirect-access slowness concern - yes, that's exactly it. Optimize everything else first; the cluster backplane is already highly optimized in cDoT.

 

I had the same concerns and dove into this with a bunch of my NetApp resources. On one side I had folks saying not to worry about the backplane. On the other, NetApp put a bunch of redirection features into cDoT that avoid the backplane, and OnCommand Performance Manager includes the backplane as one factor in performance event analysis. Clearly there must be situations where the backplane can be a performance factor; otherwise, why bother so much with the redirection and data analysis? So I wanted to know more.

 

The upshot is this: for all protocols, the node that receives the request does all the logical processing needed to meet it. Logical in this case means authentication lookups, mapping of the actual request to the right blocks to be read within a volume, and so on, as needed for the request. For read requests, the node serves the request from any data blocks already in its memory cache. If blocks are missing, it determines which blocks are needed; if those blocks are not local to the node, a read request for them is sent along the cluster backplane to the node that owns the blocks. The owning node does the read as appropriate (it might also serve from its own cache) and returns the data to the node processing the request.

 

Writes are similar in nature with only the changed block information being passed along the backplane.  The inter-node communication is highly optimized for just this type of communication.  Of course, the cluster backplane is also flush with bandwidth - 20Gbps full duplex between any two nodes minimum, with 40Gbps aggregate recommended for bigger hardware.  

 

So the entire client data request is not shuffled off to a specific node - only the backend disk block data, along with cluster control information as needed. Where protocols are doing purely informational work, like establishing sessions and other housekeeping, very little has to pass along the backplane. The first node contacted just handles the request and any future requests for that session (assuming some other redirection method is not in play). Client communications are not proxied to another node via the backplane.

 

Without a good LIF design - as in multiple LIFs spread across the cluster for file-level access, or zoning that limits LIF visibility to multipathing for block protocols - it is possible to create a situation where one node is doing most of the protocol work. Chances are you will create a CPU utilization issue on that node before you hit any appreciable cluster backplane issue, though I have seen performance warnings where the backplane was the cause of the slowdown. Remember also that volume moves crossing node boundaries use the cluster backplane, and CIFS ODX-style copies might use it as well.

 

cDoT has performance counters to track all the elements of a data request, from the client facing network stack down to the disk performance.  I don't know which counters track the backplane off hand, but they exist.  If you don't use OPM you can certainly research the counters and collect such statistics manually.

 

FYI - I will happily take correction on the cluster backplane communication details where warranted.  Or if there is something that can be clarified, please do.  I've found it somewhat difficult to get really good technical information at this deep a level within cDoT.  Everything above is my understanding of what I've been able to learn.

 

With regard to NFSv3 - as indicated in a previous reply, manually directing traffic to the right SVM/volume is the only real option. Create one data LIF per datastore and then set your ESX hosts to access that datastore through a particular IP address. Then, when you move the datastore volume to a different node, move the LIF along with it. I recommend creating a specific subnet (physically separate or a VLAN) that is at least a /23 so you can have ~512 IPs/datastores. You might not scale that high in practice; I'm big on having plenty of breathing room.
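Assuming this manual NFSv3 approach, one volume/LIF pairing might look like the following sketch. The LIF name, addresses, and export path are examples, not taken from any real configuration:

```shell
# On the cluster: create a dedicated data LIF on the node that owns the
# datastore volume, in the dedicated datastore subnet (/23 in this example)
network interface create -vserver svm1 -lif ds01_lif -role data \
    -data-protocol nfs -home-node node1 -home-port a0a-100 \
    -address 10.192.20.11 -netmask 255.255.254.0

# On the ESXi host: mount the datastore through that fixed IP, not a DNS name
esxcli storage nfs add -H 10.192.20.11 -s /vol/ds01 -v ds01
```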

 

 

 

I hope this helps you.

 

Bob Greenwald

Lead Storage Engineer

Huron Legal | Huron Consulting Group

NCDA, NCIE - SAN Clustered, Data Protection

 

Kudos and accepted solutions are always appreciated.

 

 

Re: volume direct access and indirect access


Thanks, Bob, for such an in-depth analysis. I have two follow-ups.

 

>>NetApp put a bunch of redirection features into cDoT that would avoid the backplane.

"redirection" seems not going through backplane, differnet than I thought before. Can you please give me a few examples on cases that NetApp redirect traffics and meanwhile avoid the backplane.

 

>>I recommend that you create a specific subnet (physically separate or VLAN) that is at least a /23 so you can have

>>~512 IP's/datastores.  You might not scale that high in practice.

Assuming I have a 4-node cluster and 200 datastores, I should create a specific subnet, e.g. 10.192.20.x, and each datastore will use one of the IPs. What about on the cluster side? Should I create 4 LIFs, one per node, or 200 LIFs, 50 per node, to match each datastore? All 400 IPs would be in the same subnet/VLAN.

 

 

Thanks again, and looking forward to your reply.

Re: volume direct access and indirect access

"Redirection" from a client's perspective, involving the initial request, takes a couple of forms depending on protocol.  All "redirections" assume that an SVM has multiple data LIFs spread across the nodes that might contain a volume.

 

CIFS redirections, called "Auto Location" in cDoT, happen at the "base" of a share.  Consider a client accessing \\SVM01\Share\Folder\File.  The "base" share in this UNC is \\SVM01\Share.  Assuming that SVM01 has multiple IPs and they are defined in DNS, a client's initial access to the share could come in on any node.  If the volume is on another node, then cDoT can send a DFS style redirection request using the IP address of the LIF defined on the node where the volume exists (say that three times fast).  The client can then send future requests directly to the node where the volume lives.  

 

The limitation of Auto Location is that it works only using the base share where data is accessed. If you use junction points to link a bunch of volumes together in a tree-like structure, you can logically navigate to a different volume/node combination as you traverse the tree. Auto Location, if triggered, happens only once, using the base share, no matter how far down the tree the initial access might be. As a result it is possible to defeat this redirection through a poor volume junction point/share structure. Consider a "base share" that contains nothing but junction points to several hundred other volumes. No matter where those other volumes live in the cluster, Auto Location would redirect all client access to the one node that owns the volume where the base share is defined. This design actually aggregates all client access onto a single node, making one node do most of the CIFS work. I inherited this exact design. Even with load-sharing mirrors to ease that issue, the design guarantees that on average 75% of all client accesses go to the wrong node first in my primary file-sharing cluster. This condition is my motivation for digging into all of this backplane/redirection stuff.

 

NFS redirections are supported under the Parallel NFS (pNFS) capability when using NFS 4.1. Obviously, client support is needed. pNFS has a path discovery protocol that allows direct access to the node which controls the underlying data storage, while communicating with a central node for metadata and management. I can't speak more than conceptually about it, as I have yet to implement it in any real-world scenario.

 

Block protocols - both iSCSI and FCP - are similar to pNFS in that for both there are path discovery mechanisms - ALUA, multi-path software, etc. - to discover all the paths that might access a LUN.  The path discovery mechanism then chooses the best one available.  The "Optimal" path is one directly to the node where the LUN (volume) resides.  Multi-path mechanisms are really just another form of access path redirection.

 

All of these redirections occur between the client and storage with the intent to have the client send data protocol requests directly to the node that "owns" the volume at any given time.

 

 

With respect to your ESX datastore follow-up - the basic idea is one LIF per volume containing datastores. These are on top of any LIFs for general cluster management, node management, and SVM management. So in a four-node cluster, you have a cluster management LIF. You have at least one LIF assigned to each "node" SVM for management. You've created a data SVM to hold the user data; that SVM will need a management LIF (best practice). And now you're going to add 200 datastores. The assumption is one datastore per volume, so also 200 volumes here.

 

For each datastore volume, create a LIF. The LIF's "home" node should be the node which owns the aggregate where the volume is created. The home port can be any appropriate port. If using VLANs, I suggest a separate VLAN for these "datastore" LIFs as compared to management and more general-purpose LIFs. It makes housekeeping easier.

 

This design may mean you spread the datastore LIFs out 50 per node, but only if you spread the datastore volumes out 50 per node. If you put 110 volumes on one aggregate, then the node that owns that aggregate will also see 110 LIFs currently residing on it. Remember that LIFs are fluid - you can migrate them to any available node/port that supports the LIF's network segment, transparently to clients. So as you move datastore volumes around (assuming you do), move the LIF to the proper node at the same time as the volume. When making a volume move a permanent choice, I suggest both moving the LIF and updating its permanent home node/port. Temporary conditions, such as planned or unplanned node failover, do not require a permanent change to the LIF, as everything goes back to normal when the nodes give back.
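A permanent move might be sketched like this, reusing hypothetical names (volume `ds01`, LIF `ds01_lif`, destination aggregate `aggr1_node2`); check the exact syntax against your ONTAP release:

```shell
# Move the datastore volume to an aggregate owned by node2
volume move start -vserver svm1 -volume ds01 -destination-aggregate aggr1_node2

# Follow it with the LIF: migrate it now, then update its permanent home
network interface migrate -vserver svm1 -lif ds01_lif \
    -destination-node node2 -destination-port a0a-100
network interface modify -vserver svm1 -lif ds01_lif \
    -home-node node2 -home-port a0a-100
```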

 

A key concept is that LIFs are not network ports.  Ports are static on a node.  LIFs can live on top of any port that has the right broadcast domain connectivity.  Assume each node in the cluster has an interface group a0a.  VLAN 100 is available on each interface group at the switch.  So you create a VLAN port a0a-100 on each node.  The "port" for all the datastore LIFs might always be "a0a-100".  The "node" for each datastore LIF will match the node where the volume lives.
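Continuing that example, the VLAN port is created once per node, and every datastore LIF then names `a0a-100` as its home port (VLAN ID and interface group names are illustrative):

```shell
# Create the VLAN port on each node's interface group
network port vlan create -node node1 -vlan-name a0a-100
network port vlan create -node node2 -vlan-name a0a-100
```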

 

Technical Reports TR-4067 and TR-4068 have a ton of best practice and background information on NFS scenarios and NFS with ESX.

 

 

 

I hope this helps you.

 

Bob Greenwald

Lead Storage Engineer

Huron Legal | Huron Consulting Group

NCDA, NCIE - SAN Clustered, Data Protection

 

Kudos and accepted solutions are always appreciated.