Solved: Does root volume\aggregate need to be separate from data aggregates in cDOT?

cgeck0000 · ‎2015-05-26

This is my first rodeo with Clustered Data ONTAP after years of 7-Mode.

We have a new FAS8040 2-node cluster with cDOT 8.3.

I separated out the disk ownership with SATA on one node and SAS on the other node.

I see both nodes grabbed the 3 disks for ONTAP root volume and aggregate. I also see that System Manager says no Data Aggregates.

So my question is... Does cDOT force you to have a dedicated root aggregate for the root volume?

I know this has always been a best practice, and there has always been a debate on whether it should be. I never configured things that way with 7-Mode; I would always keep everything together and never had an issue.

Thanks!

bobshouseofcards · ‎2015-05-26

Since you mentioned that this is your first go round with cDot, here are some key concepts that are different from 7-mode.

Clustered DoT is like 7-mode at the edges - there is physical stuff like ports and disks and aggregates that is owned by a particular node (controller, head - pick your term). There are volumes and shares and LUNs that are exposed to the outside world. In the middle is the new creamy filling. A virtualization layer is always present that takes all the physical stuff in a cluster and presents it as a single pool of resources that can then be divvied up into the logical Storage Virtual Machines (previously - vServers).

SVMs for all intents and purposes are like the vFilers of 7-mode. In cDot 8.3 you can have IPspaces, or not. You overload virtual network interfaces and IP addresses onto the physical network ports. New in cDot compared to 7-mode, you can also overload virtual FC WWNs onto the physical FC ports, something a vFiler couldn't do. For all intents and purposes, think in terms of vFiler.

Remember in 7-mode that when you used "MultiStore" to add vFiler capability, there was always a default "vfiler0" which represented the actual node itself. Aggregates, disks, and ports were controlled through vFiler0 as the owner of physical resources.

So the big switch in cDot is that you're always in "MultiStore" mode and that "vFiler0" is reserved for control use only. You can't create user volumes or define user access in vFiler0. Instead you have to create one or more user "vFilers" where logical stuff like volumes and LUNs and shares and all that get created.

More implications of this design. Each node needs a root volume from which to start operations. Remember in 7-mode that the root volume held OS images, log files, basic configuration information, etc. The node root volume in cDot is pretty much the same, except it cannot hold any user data at all. The node root volume needs a place to live, hence the node root aggregates. Each node needs one, just like in a 7-mode HA pair. Yes, the only contents of the node root aggregates are the node root volumes. And they are aggregates, so at least 3 disks.

My suggestion for a heavily used system is actually to use 5 disks to avoid certain odd IOPS dependencies on lower-class disks. The node root volume will get messages and logs and all kinds of internal operational data dumped to it. I have experienced, especially when using high-capacity slower disks, that node performance can be constrained by the single data disk of a 3-disk root aggregate, so I have standardized on 5 for my root aggregates. Now, for my installation, 20 disks (4-node cluster) out of 1200 capacity disks isn't a big deal. A smaller cluster can certainly run just fine with 3 disks. Similarly, because I want all my high-speed disks available for user data, I purposely put some capacity disks on all nodes, even if they just serve the root aggregate needs. Again, my installation allows for it easily; your setup may not.
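
To put rough numbers on the disk counts above, here is a minimal Python sketch. It assumes the root aggregates use RAID-DP (two parity disks per RAID group), which is why a 3-disk aggregate ends up with a single data disk. The function names are made up for illustration and are not part of any NetApp tool.

```python
# Toy arithmetic only. Assumption: RAID-DP, i.e. 2 parity disks per RAID group.
PARITY_DISKS_RAID_DP = 2


def data_disks(aggregate_disks):
    """Disks left for data in a single-RAID-group RAID-DP aggregate."""
    if aggregate_disks < PARITY_DISKS_RAID_DP + 1:
        raise ValueError("RAID-DP needs at least 3 disks")
    return aggregate_disks - PARITY_DISKS_RAID_DP


def root_overhead(nodes, disks_per_root_aggr, total_disks):
    """Fraction of all disks consumed by the node root aggregates."""
    return nodes * disks_per_root_aggr / total_disks


print(data_disks(3))                       # 1
print(data_disks(5))                       # 3
print(f"{root_overhead(4, 5, 1200):.2%}")  # 1.67%
```

Going from 3 to 5 disks triples the number of data spindles behind the node root volume, at a cost of under 2% of the disks in the 4-node example.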

So yes - root aggregate is one per node and you don't get to use it for anything else. Not a best practice question - it's a design requirement for cDot.

About the load sharing mirrors. Here is where we jump from physical to logical. After you have your basic cluster functional, you need to create SVMs (again, think vFilers) as a place for user data to live. Just like a 7-mode vFiler, an SVM has a root volume. Now this root volume is typically small and contains only specifics to that SVM. It is a volume, and thus needs an aggregate to live in. So you'll create user aggregates of whatever size and capacity meets your needs, and then create a root volume as you create your SVM. For instance, let's say you create SVM "svm01". You might then call the root volume "svm01_root" and you specify what user aggregate will hold it.

For file sharing, cDot introduces the concept of a namespace. Instead of specifying a CIFS share or an NFS export with a path like "/vol/volume-name", you instead create a logical "root" mount point and then "mount" all your data volumes into the virtual file space. A typical setup would be to set the vserver root volume as the base "/" path. Then, you can create junction-paths for each of the underlying volumes, for instance create volume "svm01-data01" and mount it under "/". You then could create a share by referencing the path as "/svm01-data01". Unlike 7-mode, junction points can be used to cobble together a bunch of volumes in any namespace format you desire - you could create quite the tree of mount locations. It is meant to be like the "actual-path" option of 7-mode exports by creating a virtual tree if you will, but it doesn't exactly line up with that functionality in all use cases.
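
As a rough illustration of how a namespace path maps onto volumes, here is a minimal Python sketch that resolves a path by longest-prefix match against a table of junction paths. This is a toy model, not how ONTAP implements it; the volume "svm01-data02" and its nested junction are hypothetical additions to the example above.

```python
# Toy model of an SVM namespace: junction path -> volume name.
# "svm01_root" and "svm01-data01" follow the example in the post;
# "svm01-data02" and its nested junction are hypothetical.
junctions = {
    "/": "svm01_root",
    "/svm01-data01": "svm01-data01",
    "/svm01-data01/projects": "svm01-data02",
}


def resolve(path):
    """Return (volume, path inside that volume) via longest-prefix match."""
    best = "/"
    for junction in junctions:
        if junction == "/":
            continue
        matches = path == junction or path.startswith(junction + "/")
        if matches and len(junction) > len(best):
            best = junction
    remainder = path[len(best):] if best != "/" else path
    return junctions[best], "/" + remainder.lstrip("/")


print(resolve("/svm01-data01/docs/a.txt"))  # ('svm01-data01', '/docs/a.txt')
print(resolve("/svm01-data01/projects/x"))  # ('svm01-data02', '/x')
print(resolve("/readme"))                   # ('svm01_root', '/readme')
```

Every lookup starts at "/", which is the point that matters later: the volume behind "/" is consulted for any path in the namespace.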

Of course, if you are creating LUNs, throw the namespace concept out the window. LUNs are always referenced via a path that starts with "/vol/" in the traditional format, and the volumes that contain LUNs don't need a junction-path. Unless, of course, you also want to put a share on the same volume that contains a LUN... then to set up the share you need a namespace and junction-paths. Confusing? Yes, and it is something I wish NetApp would unify at some point, as there are at least four different ways to refer to a path-based location in cDot depending on context, and they are not interchangeable. That, and a number of commands which have parameters with the same meaning but different parameter names, are my two lingering issues with the general operation of cDot. Sorry - I digress.

So - why the big deal on namespaces and how does that apply to load sharing mirrors? Here's the thing. Let's assume you have created svm01 as above. And you give it one logical IP address on one logical network interface. All well and good. That logical address lives on only one physical port at a time, which could be on either node. Obviously you want to set up a failover mechanism so that the logical network interface can fail over between nodes and keep functioning if needed. You share some data from the SVM via CIFS or NFS. A client system will contact the IP address for the SVM and that contact will come through node 1, for instance, if a port on node 1 currently holds the logical interface. But, for a file share, all paths need to work through the root of the namespace to resolve the target, and typically the root of the namespace is the SVM's root volume. If the root volume resides on an aggregate owned by node 2, all accesses to any share in the SVM, whether residing on a volume/aggregate in node 1 or 2, must traverse the cluster backplane to access the namespace information on the SVM root on node 2 and then bounce to whatever node the target volume lives on.
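
The traversal just described can be sketched as a small Python function that counts backplane crossings for one access. It is a deliberately simplified model of the description in this post (request enters at the node holding the logical interface, consults the SVM root, then goes to the data volume), not a statement about ONTAP internals; the function name and node labels are made up.

```python
def interconnect_hops(lif_node, root_node, data_node):
    """Count cluster-backplane crossings for one namespace lookup plus data access.

    Toy model: the request arrives on lif_node, resolves the path on the
    node owning the SVM root volume, then reaches the node owning the data.
    """
    hops = 0
    current = lif_node
    for target in (root_node, data_node):
        if target != current:
            hops += 1
            current = target
    return hops


# Interface on node 1, SVM root on node 2, data back on node 1.
print(interconnect_hops("node1", "node2", "node1"))  # 2
# Interface on node 1, SVM root and data both on node 2.
print(interconnect_hops("node1", "node2", "node2"))  # 1
# Everything on the same node.
print(interconnect_hops("node2", "node2", "node2"))  # 0
```

In this model the worst case is data that is local to the ingress node but still pays two crossings because the namespace root lives elsewhere.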

So, let's say we add a 2nd network interface for SVM01, this time by default assigned to a port that lives on node 2. By DNS round robin we now get half the accesses going first through node 1 and half through node 2. Better, but not perfect. And there remains the fact that the SVM's root volume living on node 2 still becomes a performance choke point if the load gets heavy enough. What we really want is for the SVM's root volume to kinda "live" on both nodes, so at least that level of back and forth is removed. And that is where load sharing mirrors come in.

A load sharing mirror is a special kind of SnapMirror relationship where an SVM's root volume is mirrored to read-only copies. Because most accesses through the SVM's root volume are read-only, it works. You have the master SVM root, as above called "svm01_root". You can create replicas, for instance "svm01_m1" and "svm01_m2", each of which exists on an aggregate typically owned by a different node (m1 for the mirror on node 1, m2 for the mirror on node 2). Once you initialize the SnapMirror load sharing relationship, read-level accesses are automatically redirected to the mirror on the node where the request came in. You will need a schedule to keep the mirrors up to date, and there are some other small caveats.

Is this absolutely required? No, it isn't. The load/performance benefit achieved through use of load-sharing mirrors is very dependent on the total load on your SVMs. A heavily used SVM will certainly benefit. It can sometimes be a good thing; other times it can be a pain. The load sharing SnapMirror works just like a traditional SnapMirror where you have a primary that can be read/write and a secondary shared as read-only. The extras are that no SnapMirror license is needed to do load sharing, load sharing mirrors can only be created within a single cluster, and any read access to the primary is automatically directed to one of the mirrors. Yes - you should also create a mirror on the same node where the original exists, otherwise all access will get redirected to a non-local mirror, which defeats the purpose.
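
To see why a mirror on the master's own node matters, here is a minimal Python sketch of the redirection rule as described above: once mirrors exist, reads go to a mirror, preferably the one on the ingress node. It is a toy model under that assumption; which non-local mirror gets picked when no local one exists is arbitrary here, and the function name is made up.

```python
def serving_volume(ingress_node, master, mirrors):
    """Pick which copy of the SVM root serves a read arriving on ingress_node.

    mirrors maps mirror volume name -> owning node. Assumption from the
    post: once any mirror exists, reads never go to the master.
    """
    if not mirrors:
        return master
    for name, node in mirrors.items():
        if node == ingress_node:
            return name          # local mirror, no backplane crossing
    return next(iter(mirrors))   # no local mirror: some remote mirror serves it


both = {"svm01_m1": "node1", "svm01_m2": "node2"}
print(serving_volume("node1", "svm01_root", both))  # svm01_m1
print(serving_volume("node2", "svm01_root", both))  # svm01_m2

# Master lives on node 2, but only node 1 has a mirror:
only_m1 = {"svm01_m1": "node1"}
print(serving_volume("node2", "svm01_root", only_m1))  # svm01_m1
```

The last case shows the trap: a read arriving on node 2 is sent across to node 1 even though the master sits on node 2.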

You will also want to review the CIFS access redirection mechanisms: when an SVM has multiple network interfaces across multiple nodes, a redirect request can be sent back to a client so that subsequent accesses to data are directed to the node that owns the volume without needing to traverse the backplane. Definitely review that first before putting a volume/share structure in place, because you can defeat that redirection if you aren't careful with your share hierarchy.

Hope this helps with both some general background as you get up to speed on cDot and some specifics in response to your topic points.

Bob Greenwald

Lead Storage Engineer | Huron Legal

Huron Consulting Group

NCDA | NCIE-SAN Clustered Data OnTap

JGPSHNTAP · ‎2015-05-26

IMHO, yes, root gets its own aggregates.

And make sure you set up your load-sharing mirrors on all root volumes.

cgeck0000 · ‎2015-05-26

Load-sharing mirrors on all root volumes?

bobshouseofcards · ‎2015-05-26