Just to address your last plan - I advise against just adding whatever disks you have left over to a partial raid group, especially one that is so much smaller than the other raid groups you have going. No matter how you choose your raid group size - the general default of 16 or something larger to maximize capacity - as a best practice you still want full raid groups, with the remaining disks left as spares.
Why? Performance. If you measure and track at a detailed level, you will see unbalanced total aggregate performance from time to time. As you approach "full" capacity, say between 70-80% of usable, the measured performance hiccups will become more pronounced in both read and write. The source will be the small raid group.
Recall that WAFL by design works with full RAID stripes whenever possible. A full RAID stripe on the 8 disk raid group is smaller than one on the 20 disk raid groups. Thus, for every user read or write, the chance that multiple I/Os to the 8 disk raid group are needed to supply the data increases, especially as the aggregate fills. If you drive to the mathematical limits, an aggregate ultimately cannot perform faster than its slowest raid group in terms of throughput. Given that all the disks are the same, of course, the IOPS per disk and, by extension, per raid group are essentially the same on average. But the throughput bandwidth isn't. One I/O of a given size to a 20 disk raid group can move more than double the data of one I/O to the 8 disk raid group in the same average elapsed time. Thus at scale you handicap your system's potential performance.
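To put rough numbers on the stripe width difference, here is a minimal sketch. It assumes RAID-DP with 2 parity disks per raid group and one 4 KiB block per data disk in a stripe; neither figure is stated above, so adjust them to your actual configuration. The function names are made up for illustration.

```python
# Assumptions (not stated in the post): RAID-DP, so 2 parity disks per
# raid group, and one 4 KiB block per data disk in each stripe.
PARITY_DISKS = 2
BLOCK_KIB = 4


def data_disks(rg_size, parity_disks=PARITY_DISKS):
    """Number of disks in a raid group that hold data rather than parity."""
    if rg_size <= parity_disks:
        raise ValueError("raid group must be larger than its parity count")
    return rg_size - parity_disks


def full_stripe_kib(rg_size, parity_disks=PARITY_DISKS, block_kib=BLOCK_KIB):
    """Data carried by one full stripe, in KiB."""
    return data_disks(rg_size, parity_disks) * block_kib


def stripe_ratio(large_rg, small_rg, parity_disks=PARITY_DISKS):
    """How many times more data a full stripe on the large group carries."""
    return data_disks(large_rg, parity_disks) / data_disks(small_rg, parity_disks)


if __name__ == "__main__":
    for size in (20, 8):
        print(f"rg size {size}: {data_disks(size)} data disks, "
              f"{full_stripe_kib(size)} KiB per full stripe")
    print(f"ratio 20 vs 8: {stripe_ratio(20, 8):.1f}x")
```

Under those assumptions the 20 disk group carries three times the data per full stripe that the 8 disk group does, which is consistent with the "more than double" figure above.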
The default and best practice raid group size of 16 is used because other effects of size start to affect total performance as the rg's get bigger; otherwise, of course, it would make sense to just use the biggest rg you can. But because of the potential long-term effects it is very important to deploy disks in full raid groups and not shortchange any single raid group just to use the extra disks. Consider - you're buying some nice performance disk here, so obviously performance matters.
Agreed that you want to use everything you can, but why artificially limit the key element you paid for - performance - just to squeeze out a few extra TBs? Would you consider adding an extra half shelf of disk to round out to the rg size instead? An extra 12 disks would match the rg size of 20 nicely in your layout. The sales pitch part goes this way: you license cDot by the size of your disks. You pay more per GB of "performance" class disk than "capacity" class disk - at list price, roughly 3 times more per unit of storage. Why would you pay 3 times more to license cDot for your high performance disk and then build out a storage structure that could over time limit performance to 1/3 of potential? Adding a half shelf now would increase cost by about 1/5, if I've done the math right, but give you full performance.
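The rounding arithmetic behind that suggestion can be sketched as below. The rg size of 20 and the 8 leftover disks come from this thread; the total of 48 disks is only an illustrative figure, not your actual count, and `plan_layout` is a made-up helper name.

```python
def plan_layout(total_disks, rg_size):
    """Split disks into full raid groups.

    Returns (full_groups, leftover, disks_to_complete), where
    disks_to_complete is how many more disks would turn the leftover
    into one more full raid group (0 if nothing is left over).
    """
    if rg_size <= 0 or total_disks < 0:
        raise ValueError("rg_size must be positive and total_disks non-negative")
    full_groups, leftover = divmod(total_disks, rg_size)
    disks_to_complete = rg_size - leftover if leftover else 0
    return full_groups, leftover, disks_to_complete


if __name__ == "__main__":
    # 48 is a hypothetical total chosen so that 8 disks are left over.
    groups, leftover, needed = plan_layout(48, 20)
    print(f"{groups} full raid groups, {leftover} left over, "
          f"{needed} more disks to complete another group")
```

With 8 disks left over against an rg size of 20, the shortfall is 12 disks, which is the half shelf mentioned above.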
Or you can hold off on those 8 drives, and expand by only a half shelf for the next expansion. Or perhaps it will be time to start a new aggregate with a different rg size with the next purchase. It can be difficult to balance the shelf purchase increments against the aggregate layout.
I hope these thoughts help you in your plans.
Bob Greenwald
Lead Storage Engineer
Huron Legal | Huron Consulting Group
NCDA, NCIE - SAN Clustered, Data Protection
Kudos and accepted solutions are always appreciated.