When we bought our FAS2240-2 we also bought a DS2246 shelf, all with 600GB 10K SAS drives - 48 600GB drives in total. It was set up with 3 drives for the OS aggregate, 1 spare, and 20 for the data aggregate, which gives a 20 disk RAID group for each HA node.
We just purchased another DS2246 with all 600GB 10K SAS drives. I want to add all the drives to one existing aggregate, but the RAID group size does not work out evenly. I know it is not recommended to go above a 20 disk RAID group, but would it be okay to go to a 24 disk RAID group? The reason I want to go to 24 is that we may add another DS2246 in the near future. Or would it make more sense to decrease our existing RAID group size to 16 disks?
You can't shrink an aggregate or a RAID group once created.
Going out to 24 disks is stretching the rules about sticking close to the default size of 16 disks.
Best practice is to have at least two spare drives of each type.
To be conservative, you could add another 20 disk RAID group to an existing aggr.
Or create a new aggr with 22 disks in one RAID group, and have 3 spares.
Or add a 22 disk RAID group to an existing aggr...
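The capacity tradeoff between these options can be sketched in a few lines of Python. This is a back-of-envelope comparison only: it assumes RAID-DP (2 parity disks per RAID group) and raw 600 GB drives, and ignores right-sizing and WAFL reserve, so real usable numbers will be lower.

```python
# Back-of-envelope usable capacity for the three options above.
# Assumes RAID-DP (2 parity disks per RAID group) and raw 600 GB
# drives; real usable space will be lower after right-sizing and
# WAFL reserve, so treat these as relative comparisons only.
DISK_GB = 600
PARITY_PER_RG = 2  # RAID-DP: one parity + one double-parity disk

def data_capacity_gb(rg_sizes):
    """Raw data capacity in GB for a list of RAID group sizes."""
    return sum((size - PARITY_PER_RG) * DISK_GB for size in rg_sizes)

shelf = 24  # disks in the new DS2246
options = {
    "add a 20 disk RAID group": [20],
    "new aggr, one 22 disk RAID group": [22],
    "add a 22 disk RAID group": [22],
}
for name, rgs in options.items():
    spares = shelf - sum(rgs)
    print(f"{name}: {data_capacity_gb(rgs) / 1000:.1f} TB data, "
          f"{spares} disk(s) from the shelf left as spares")
```

The two 22 disk layouts net the same data capacity; the difference is whether the new disks join the existing aggregate or start a fresh one.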
I hope this response has been helpful to you.
At your service,
Eugene E. Kashpureff, Sr.
Independent NetApp Consultant http://www.linkedin.com/in/eugenekashpureff
Senior NetApp Instructor, IT Learning Solutions http://sg.itls.asia/netapp
(P.S. I appreciate 'kudos' on any helpful posts.)
I know the existing aggregate with its 20 disk RAID group will not be affected by changing the RAID group size to 16, but the change would let me create a 16 disk RAID group plus a 7 disk RAID group, with a spare. Then, when I add another DS2246 later, I can add 9 disks to the 7 disk RAID group to make it 16, and create a 14 disk RAID group with 1 spare.
I thought about increasing it to 22 disks per RAID group, but would that cause any performance issues? I understand that if a drive fails there will be a performance hit during the rebuild, but drive failures are usually rare. The system is running non-production VMs, so a performance hit during a rebuild is not a concern. My main concern is a performance hit during normal operation.
I just left the RAID group size at 20, so I have 4 spares. Once we buy another shelf I will set it up with another 20 disk RAID group and an 8 disk RAID group.
Thanks for the insight. I wish we would have left the RAID group size at 16 when we set up the storage originally, but we were trying to maximize our space.
Just to address your last plan - I advise against just adding whatever disks you have left over to a partial raid group, especially one that is so much smaller than the other raid groups you have. However you choose your raid group size - the default 16, or something larger to maximize capacity - best practice is still to build full raid groups and leave the rest as spares.
Why? Performance. If you measure and track at a detailed level, you will see unbalanced total aggregate performance from time to time. As you approach "full" capacity, say 70-80% of usable, the measured performance hiccups become more pronounced in both reads and writes. The source will be the small raid group.
Recall that WAFL by design works with full RAID stripes whenever possible. A full RAID stripe on an 8 disk raid group is smaller than one on a 20 disk raid group. Thus, for every user read or write, the chance that multiple I/Os to the 8 disk raid group are needed to supply the data increases, especially as the aggregate fills. If you drive to the mathematical limits, an aggregate ultimately cannot perform faster than its slowest raid group in terms of throughput. Given that all the disks are the same, the IOPS per disk, and by extension per raid group, are essentially the same on average. But the throughput bandwidth isn't: one I/O of a given size to a 20 disk raid group can move more than double the data of one I/O to the 8 disk raid group in the same average elapsed time. Thus, at scale, you handicap your system's potential performance.
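The stripe-width arithmetic behind this argument can be made concrete. Assuming RAID-DP (2 parity disks per group) and a 4 KB WAFL block per data disk per stripe (both assumptions, stated here for the arithmetic only):

```python
# Stripe-width arithmetic behind the throughput argument above.
# Assumes RAID-DP (2 parity disks per group) and a 4 KB WAFL block
# per data disk per stripe; both are assumptions for illustration.
BLOCK_KB = 4

def full_stripe_kb(rg_size, parity=2):
    """Data carried by one full RAID stripe, in KB."""
    return (rg_size - parity) * BLOCK_KB

big = full_stripe_kb(20)   # 18 data disks -> 72 KB per full stripe
small = full_stripe_kb(8)  #  6 data disks -> 24 KB per full stripe
print(f"A 20 disk group moves {big // small}x the data per full "
      f"stripe of an 8 disk group ({big} KB vs {small} KB)")
```

So the 8 disk group needs three full stripes to move what the 20 disk groups move in one, which is the throughput imbalance described above.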
The default and best-practice raid group size of 16 is used because other effects of size start to hurt total performance as the rg's get bigger; otherwise, of course, it would make sense to just use the biggest rg you can. But because of the long-term effects, it is very important to deploy disks in full raid groups and not shortchange any single raid group just to use up the extra disks. Consider - you're buying some nice performance disk here, so obviously performance matters.
Agreed that you want to use everything you can, but why artificially limit the key element you paid for - performance - just to squeeze out a few extra TBs? Would you consider adding an extra half shelf of disks to round out to the rg size instead? An extra 12 disks would match the rg size of 20 nicely in your layout. The sales pitch goes this way: you license cDot by the size of your disks. You pay more per GB for "performance" class disk than for "capacity" class disk - list price is roughly 3 times more per unit of storage. Why pay 3 times more to license cDot for your high performance disk and then build out a storage structure that could, over time, limit performance to 1/3 of its potential? Adding an additional half shelf now would increase cost by about 1/5, if I've done the math right, but give you full performance.
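A rough sanity check on that "about 1/5" figure, under the simplifying assumption that cost scales with disk count alone (real pricing also includes shelf hardware and licensing, so treat this as ballpark only):

```python
# Rough sanity check on the "about 1/5" cost estimate above.
# Assumes cost scales with disk count only (a simplification that
# ignores shelf hardware and licensing), with 48 existing disks
# plus the 24 disk shelf just purchased.
existing = 48
new_shelf = 24
half_shelf = 12

added_fraction = half_shelf / (existing + new_shelf)
print(f"Half shelf adds {half_shelf} disks to {existing + new_shelf}: "
      f"~{added_fraction:.0%} more")
```

Counting disks alone, the increment is about 1/6, in the same ballpark as the quoted 1/5.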
Or you can hold off on those 8 drives, and expand by only a half shelf for the next expansion. Or perhaps it will be time to start a new aggregate with a different rg size with the next purchase. It can be difficult to balance the shelf purchase increments against the aggregate layout.
I hope these thoughts help you in your plans.
Lead Storage Engineer
Huron Legal | Huron Consulting Group
NCDA, NCIE - SAN Clustered, Data Protection
Kudos and accepted solutions are always appreciated.
Thanks for the insightful reply. It answered all the questions I had around RAID groups and performance issues. As a result, I will definitely stick with full RAID groups. If I need to use the leftover disks, I can create a smaller RAID group in a new aggregate.