Re: Aggregate separation for multiple uses

SEASTREAK · ‎2011-04-21

Background

We are preparing to install a NetApp cluster to support a new implementation of our application environment. We will access the storage in three different ways, all supporting the same broad application:

1) NFS access

2) VMware disk (VMDK) provisioning over NFS

3) Fibre Channel LUNs

In our existing environment, we already have the same arrays. Right now, we are using one filer-pair for NFS access, a separate pair for VMDK provisioning, and a non-NetApp SAN for the FC LUNs. An important point is that we had originally provisioned the (new) VMDKs on the same filer-pair as our (existing) NFS stores, but we were having some performance issues and didn't want to jeopardize the NFS storage, so we ended up purchasing another filer-pair just for VMDKs.

Question

When we configure our Aggregates on the new array, is there any reason to separate them based on the usage/access mode? We're planning to do this, primarily for problem isolation [i.e. it's political :-)]. We would leave some disks out of any Aggregate at first until we see where expansion may be needed.

Is there any technical basis that supports separation based on aggregates? Are there any other advantages to keeping them separated? Also, will we be able to easily add one or two disks to an aggregate when needed?

Are there advantages to using a single Aggregate for all volumes regardless of usage/access method? [I realize there's also the discussion whether vol0 for ONTAP should be provisioned in its own aggregate, that's a separate issue.]

Thanks,

Jeff

baijulal · ‎2011-04-21

I would suggest reading TR-3838, available on the NOW site.

Also, you can search for best practice documentation at http://www.netapp.com/us/library/ (use the keyword search).

Some considerations I would make are:

Number of spindles in the aggregate (for example, two different aggregates with RAID-DP and 3 disks each is not a good idea; a small number of disks in an aggregate can lead to disk I/O performance bottlenecks).

VMware volumes are typical candidates for dedupe savings, so having them on a separate aggregate is beneficial.

So depending on the number of disks, I might consider 2-3 aggregates in this case.

Also, you need to ensure that VMDKs are aligned (look for best practice docs around this subject).

Adding a disk to an aggregate is easy and quick, but keep in mind there is no option to remove a disk from an aggregate.

I would also prefer to have all disks added to the aggregates initially, before volumes are created, rather than expanding them later.

SEASTREAK · ‎2011-04-22

Baijulal,

Thanks for your reply. See my responses below.

I would suggest reading TR-3838, available on the NOW site.

Also, you can search for best practice documentation at http://www.netapp.com/us/library/ (use the keyword search).

Someone else pointed out that TR-3838 is an internal document only -- I don't find it when I search for it. Also, I could just be missing it, but I don't find a Best Practice document that describes creating separate Aggregates based on how the data will be accessed or used.

Number of spindles in the aggregate (for example, two different aggregates with RAID-DP and 3 disks each is not a good idea; a small number of disks in an aggregate can lead to disk I/O performance bottlenecks).

This is definitely a concern. I think this is the main reason we would want to use either a single or small number of large aggregates, regardless of the usage.

VMware volumes are typical candidates for dedupe savings, so having them on a separate aggregate is beneficial.

Yes, we dedupe the VMware volumes and not the others. Deduplication runs on a volume basis, correct? Is it more compute intensive or I/O intensive? If compute, that will affect us regardless of whether the volumes are on separate aggregates. If I/O, that could be a valid reason to use a separate aggregate for the VMDK usage so as not to affect the others.

So depending on the number of disks, I might consider 2-3 aggregates in this case.

I'm not sure what the 2-3 number is based on. If separate aggregates for different usage is not beneficial (or worse if it is harmful), we will follow best practices for number of aggregates based on the number of disks/raid groups.

Adding a disk to an aggregate is easy and quick, but keep in mind there is no option to remove a disk from an aggregate.

I would also prefer to have all disks added to the aggregates initially, before volumes are created, rather than expanding them later.

The removal issue is valid but I don't think we're too concerned about it. I'm more worried about performance issues of recalculating parity and new data being directed to the empty disk(s). Is that why you'd rather not expand an aggregate later?

shaunjurr · ‎2011-04-23

Hi,

There are probably other TRs on capacity planning and such, but in general, having separate aggregates for different usage has no real value in most cases. Your basic "exportable" unit is going to be the volume or volume+qtree pair anyway.

I/O is going to be a matter of spindles up until you saturate the available bus/CPU I/O on the controller "head" anyway, so you might as well go for larger aggregates. If you get the chance to look at spec.org benchmarks for NetApp, you will see that RAID group sizes probably don't exceed 18 disks (although I have used up to 28 where the access patterns and storage utilization requirements made it feasible) and aggregates had large numbers of disks (50+). Multiple RAID groups (parity groupings) in an aggregate are, of course, possible, up to the maximum aggregate size (which varies depending on whether you are using ONTAP 8.x, >= 7.3.3 (iirc), or < 7.3.3). Larger aggregates give you a bit more flexibility, and the only real downside is that a large-scale problem with the disk subsystem (which I have seen happen, but is highly unlikely) could have more impact on a large aggregate than on multiple smaller aggregates. Multi-pathing your disk shelves can help here, depending on the type of failure and what your possibilities (number of disk ports) and requirements are.
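To make the spindle argument concrete, here is a rough Python sketch of the arithmetic. It assumes RAID-DP costs two parity disks per RAID group; the 48-disk shelf count and the group size of 16 are made-up numbers for illustration, and the helper name data_disks is hypothetical. It also ignores spares and does not check for an undersized trailing RAID group.

```python
import math

PARITY_PER_RAID_DP_GROUP = 2  # RAID-DP uses two parity disks per RAID group


def data_disks(total_disks: int, raid_group_size: int) -> int:
    """Return the number of data disks left after RAID-DP parity.

    Disks are packed into RAID groups of at most `raid_group_size`,
    and each group gives up two disks to parity. Spares are ignored.
    """
    if total_disks < 3:
        raise ValueError("RAID-DP needs at least 3 disks")
    groups = math.ceil(total_disks / raid_group_size)
    return total_disks - groups * PARITY_PER_RAID_DP_GROUP


# Hypothetical layout: 48 disks with RAID groups of 16.
one_large = data_disks(48, 16)                        # one shared aggregate
per_small = data_disks(16, 16)                        # each of three aggregates
three_small_total = 3 * per_small

print("one aggregate:", one_large, "data spindles for every workload")
print("three aggregates:", per_small, "data spindles per workload")
print("3-disk RAID-DP aggregate:", data_disks(3, 16), "data spindle")
```

In this made-up layout the total number of data disks is the same either way; what changes is how many spindles any single workload can draw on.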

Deduplication up to this point is a volume operation. It can certainly be I/O intensive up to a point (starting it), but after that it just checks new data for existing duplicate blocks and also deduplicates new duplicate blocks after their fingerprint (and subsequent sanity checks) has been added for that volume.
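As a toy illustration of the fingerprint-then-verify idea, here is a small Python sketch. It is not how ONTAP implements deduplication; the SHA-256 fingerprint, the 4096-byte block size, and the function name dedupe_blocks are all assumptions made for the example.

```python
import hashlib


def dedupe_blocks(blocks):
    """Return (store, refs) for a list of bytes blocks.

    A fingerprint (SHA-256 here, an assumption) finds candidate
    duplicates; a byte-for-byte comparison confirms a match before
    the block is shared instead of stored again.
    """
    store = {}  # fingerprint -> list of distinct stored blocks
    refs = []   # per input block: (fingerprint, index) of its stored copy
    for block in blocks:
        fp = hashlib.sha256(block).digest()
        candidates = store.setdefault(fp, [])
        for idx, existing in enumerate(candidates):
            if existing == block:  # sanity check beyond the fingerprint
                refs.append((fp, idx))
                break
        else:
            # No confirmed duplicate: store the block as a new copy.
            candidates.append(block)
            refs.append((fp, len(candidates) - 1))
    return store, refs


# Four hypothetical 4096-byte blocks, three of them identical.
blocks = [b"A" * 4096, b"B" * 4096, b"A" * 4096, b"A" * 4096]
store, refs = dedupe_blocks(blocks)
stored = sum(len(copies) for copies in store.values())
print(f"{len(blocks)} blocks written, {stored} stored")
```

The initial pass over existing data is where most of the reading happens; afterwards only new blocks need to be fingerprinted and compared.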

As Baijulal hinted at, expanding aggregates on the fly can be a source of I/O bottlenecks because of how WAFL distributes data across disks. New disks will be "filled" until they are as full as the existing disks. If too few disks are added, they will become "hot" disks because all new data will be written there, and it might even remain there if the data change rate is low. This means you will suffer under some strange I/O limits for certain operations involving this "new data". Thankfully, NetApp has the "reallocate" command to alleviate such problems (not something you find from a lot of storage vendors).
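The "hot disk" effect can be seen in a deliberately simplified Python model. This is not WAFL's real allocation policy; it is a greedy toy allocator (the hypothetical place_writes function) that always writes to the least-full disk, with made-up numbers: 14 existing disks at 700 used blocks each and 2 empty disks added later.

```python
def place_writes(used, new_blocks):
    """Greedy toy allocator: each new block goes to the least-full disk.

    `used` is the number of blocks already on each disk. Returns how many
    of the new blocks landed on each disk.
    """
    used = list(used)  # copy so the caller's list is not modified
    written = [0] * len(used)
    for _ in range(new_blocks):
        target = min(range(len(used)), key=used.__getitem__)
        used[target] += 1
        written[target] += 1
    return written


# Hypothetical aggregate: 14 old disks at 700 blocks each, 2 new empty disks.
old_disks = [700] * 14
new_disks = [0] * 2
written = place_writes(old_disks + new_disks, 1000)

on_old = sum(written[:14])
on_new = sum(written[14:])
print(f"new writes on old disks: {on_old}, on new disks: {on_new}")
```

In this model every one of the 1000 new blocks lands on the two new disks, so any workload that reads that new data is limited to two spindles until the data is redistributed.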

Setting up new systems is always a bit of a black art because you rarely know exactly how things will evolve over time and making changes later on can be problematic, depending on your SLA requirements, host (consumer) systems, etc. You haven't really said much about the type of systems that you are using or what type or number of disks.

Disk I/O is generally going to become more of a problem over time as disks get larger but retain roughly the same I/O-per-second performance. At some point most of us will have to start using PAM modules to get enough disk utilization where I/O requirements are more demanding.

Good luck.

mcope · ‎2011-05-06

It sounds like some of the political issues you're having to deal with when designing your new storage layout probably center on Quality of Service concerns like:

1. Keeping group A from taking up too much space and starving group B

2. Preventing group B from using so much I/O that group A experiences latency

To solve issue #1, you can implement quotas to restrict how much space a user or group is allowed to use.

To solve issue #2, as others have mentioned, go with large aggregates to spread I/O over multiple disks, but then implement FlexShare to prioritize I/O by volume.