Creating a dedicated root aggregate will not create performance problems for you. It will actually improve your daily performance a little bit. At the scale of your filers we are talking fractions of a percentage point, but the improvement is there because the root volume I/O load is no longer mixed in with the data-only aggregates. The effect grows as both the capacity of the disks and the number of disks increase, and it is especially pronounced if you start using cDot instead of 7-mode at any point. In fact, on cDot a dedicated root aggregate for each node is a requirement rather than an option. A three disk root aggregate is sufficient in those cases, but on high capacity disk systems I actually use 5 disks for my root, which makes a noticeable performance difference - but I digress a little.
The biggest reason for a dedicated root aggregate on 7-mode systems, in my mind, is operational. Consider the case where a data/operational issue causes corruption on an aggregate. The more stuff you throw at an aggregate - data loads, dedupe, snapshots, whatever - the greater the chance that you will hit a bug or issue that causes corruption. That's just simple math - more disks, more load, more features, more potential for a problem. And despite NetApp's (or anyone's) best efforts, such bugs do creep into the mix from time to time, especially as releases evolve. We've all seen them at some point.
So - you make the biggest single aggregate you can, then you put your "boot" volume on that aggregate. Something in your system/operation/data mix causes corruption in the aggregate. You go to restart your Filer - it doesn't boot because the aggregate won't come online. You go to fail over to the other head - can't, because the aggregate won't come online. Now you are facing a lot of manual work in maintenance mode, on support's time, trying to do something, anything, to recover, and with much more limited access to the system logs and histories that would tell you what went wrong.
Now - suppose you have 90 odd disks in data aggregates and 3 in a root aggregate. The root aggregate contains nothing more than the root volume. No sharing off the root, no space efficiency, no special features other than a few snapshots for protection of recent changes. If you were to have an issue, which aggregate do you think it would hit? Now, granted, if you only have the root and one other aggregate, and you had a problem that hit the big aggregate, you're still pretty much offline. But now you can still boot/fail over. You have full and easy access to logs to diagnose the issue. You can get to any operational mode, so the entire set of corrective actions that might be needed is available to you, even if the same amount of data is offline. You can push out firmware or other fixes if needed. And if you ever had more than one data aggregate per controller, the other aggregates and their associated volumes are still online, or at least could be, while you work on the failed one. Additionally, if you ever have an issue with the root aggregate - you have a few spares that you can use to build a completely new root aggregate and volume without worrying about any of the other data volumes at all. Granted, you could generate a new root with a single aggregate as well, but it's just cleaner to have the separation.
Some will argue that the space "wasted" by that three disk dedicated root is too high a cost - for example, consider a controller where all the disks might be 4TB or more. That's 12TB physical capacity when you only need 200-300GB or so and it's expensive so why waste it? I counter with the argument that if it was important enough to purchase high end enterprise class storage to begin with, then the nickel and dime approach to a couple of disks is questionable. On one of NetApp's smaller systems I can see the concern, and NetApp has addressed that in current DoT, but when you are in the 200 disk range the argument loses validity in my opinion.
As for eking out that last bit of space, I am for doing whatever works for you. For example, on my systems I plan for NL-SAS/MSATA high capacity disks at a raid group size of 20 (max for the type). The storage is designed to just house big data, so that maximizes usable space in the system. As I expand shelves I'll hold extra spares as needed so I can build out full size raid groups on the next expansion - because there always is one. For performance disks, I'll go higher than "standard" in raid group sizes as well.
There is a trade off for larger raid group sizes. First, the larger the raid group size, the longer the disk rebuild time when a disk fails. Also, at a certain point, larger raid group sizes do impact regular operational performance. NetApp achieves best performance by reading and writing full raid stripes across an entire raid group. At a certain RG size performance will hit an inflection point and start to decrease. The inflection point varies based on controller model, disk type, and load patterns. The only way to find the performance optimal RG size for your configuration is to actually test it out yourself, which of course is rather difficult for most people to do. Having fewer raid groups can affect performance as well, since each raid group gets busier as load increases.
For 96 disks per controller, the recommendation was 3 for Aggr0, 5x18 for Aggr1, with 3 spares. That config yields 80 data disks (5 raid groups * 16 data disks per raid group). That's a really good design for your system in my opinion. Another way to get 80 data disks is to go with 4 raid groups of size 22 each. That also yields 80 data disks (4*20 per rg), uses 88 disks for Aggr1, still has 3 disks for Aggr0, and leaves 5 spare disks. Same space, so what would be the differences?
Well - fewer raid groups means potentially lower performance at the upper bounds, because the same load is spread across one less raid group. Larger raid group size means increased disk rebuild time and more effect on performance within the raid group overall during the rebuild. With more disks per raid group, the statistical chance of multiple failures hitting a single raid group is higher, though we are talking in fractions here of course.
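To put a rough number on that last point, here is a small back-of-the-envelope sketch. It is only an illustration under loud assumptions: the raid groups are RAID-DP (two parity disks each, as the 16-data-of-18 figure implies), exactly three disks in the aggregate fail at once, and every disk is equally likely to be one of them. Rebuild windows, spares, and correlated failures are ignored, and the function name is made up for this example.

```python
from math import comb


def p_three_failures_in_one_group(raid_groups: int, group_size: int) -> float:
    """Chance that three simultaneous, uniformly random disk failures
    all land in the same raid group (more than two parity disks can absorb)."""
    total_disks = raid_groups * group_size
    # Ways to pick 3 failed disks inside any one group, over all ways to pick 3.
    same_group = raid_groups * comb(group_size, 3)
    return same_group / comb(total_disks, 3)


p_5x18 = p_three_failures_in_one_group(5, 18)
p_4x22 = p_three_failures_in_one_group(4, 22)

print(f"5 x 18: {p_5x18:.4f}")
print(f"4 x 22: {p_4x22:.4f}")
```

Under those assumptions the 4x22 layout comes out at roughly 5.6% against roughly 3.5% for 5x18. Keep in mind this is conditional on three disks failing together in the first place, which is itself a very rare event, so the absolute risk stays tiny either way.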
So why might you use a larger raid group size when it doesn't have any real advantages right now? How about your next disk purchase? If you have a rough idea of how many shelves/disks you might expand this system with the next time you buy storage, which raid group size makes more sense when you make that expansion? One of the two numbers (18/22) might work better going forward. Then again, from what you describe and given the age of the system, there is likely not going to be a storage expansion on this box, in which case rg size 18 makes a lot of sense for now.
If you are dead set on maximum space, a raid group size of 23 (4 raid groups, 84 data disks, 8 parity disks, 4 spares) will get you that, but it uses up all 96 disks and so requires a single aggregate with no dedicated root, which runs right up against both the single aggregate and the large raid group size concerns listed above. I'd stay with the 5x18 myself, with the dedicated root aggregate.
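The disk counts for the three layouts discussed above can be double checked with a short script. This is just a sketch: it assumes RAID-DP (two parity disks per raid group, which is what the figures above imply), and the Layout class and its field names are invented for illustration, not anything from Data ONTAP.

```python
from dataclasses import dataclass

PARITY_PER_GROUP = 2  # assumption: RAID-DP, two parity disks per raid group


@dataclass
class Layout:
    name: str
    raid_groups: int
    group_size: int
    root_disks: int          # disks set aside for a dedicated root aggregate
    total_disks: int = 96    # disks per controller, from the discussion above

    @property
    def aggregate_disks(self) -> int:
        return self.raid_groups * self.group_size

    @property
    def data_disks(self) -> int:
        return self.raid_groups * (self.group_size - PARITY_PER_GROUP)

    @property
    def parity_disks(self) -> int:
        return self.raid_groups * PARITY_PER_GROUP

    @property
    def spares(self) -> int:
        # Whatever is left after the root and data aggregates are built.
        return self.total_disks - self.root_disks - self.aggregate_disks


layouts = [
    Layout("5 x 18, dedicated root", raid_groups=5, group_size=18, root_disks=3),
    Layout("4 x 22, dedicated root", raid_groups=4, group_size=22, root_disks=3),
    Layout("4 x 23, no dedicated root", raid_groups=4, group_size=23, root_disks=0),
]

for layout in layouts:
    print(
        f"{layout.name}: data={layout.data_disks} "
        f"parity={layout.parity_disks} root={layout.root_disks} "
        f"spares={layout.spares}"
    )
```

The first two layouts both land on 80 data disks and differ only in parity and spare counts, while the third gains four data disks at the cost of the dedicated root aggregate.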
You are at the limits of what you will get out of this particular system without adding physical space. I hope that this post helps to explain some of the design considerations that go into a good NetApp system design.