I'm interested in hearing what people do for the RAID config for vol0/aggr0. I have always left it as a 3-disk RAID-DP aggregate with vol0 on it. Now that disk sizes have gotten so big, is it normal and recommended practice to use RAID4 for aggr0 so as not to lose another disk? With only 3 disks in the aggregate, you don't really benefit from the "lots of spindles in the aggregate" idea. I know RAID-DP protects against 2 disk failures in the aggregate, but it's a trade-off between the likelihood of that happening and losing potentially another 1 TB (or even 2 TB) disk.
We have some war stories from the field where not having a separate root aggregate made wafliron/wafl_check take longer to run... for larger installs, we default to a separate 3-drive root aggregate so that we don't have to wait for a long process to get the system back up. It doesn't shorten the time to fix the larger aggregate, but it does let us bring the system up sooner with support (in the last case where this happened, the GSC recommended a separate root aggr). This is a rare edge case, but one we have seen more than once. I did use 2-drive RAID4 root aggregates for a while, but background disk firmware updates require RAID-DP, so we had to force disk_fw_update to each drive, and that can affect NDU upgrades if the system tries to update firmware on reboot because it didn't run in the background while the system was up. On smaller systems (FAS2000, etc.), or any system with a low number of shelves, I don't hesitate to mix the root volume into one large aggregate.
If people want to design for a less than 1% scenario, feel free. I find very few customers who want to do this.
As far as disk_fw_update goes, assuming you've got a spare around, you can always temporarily convert to RAID-DP, do your background update, then convert back to RAID-4. Both of those conversions can be done on the fly.
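A sketch of that sequence on a 7-mode system, assuming the root aggregate is named aggr0 and a zeroed spare is available (command names are from the 7-mode CLI; verify against your ONTAP release):

```
# Convert the RAID-4 root aggregate to RAID-DP; this grabs a spare
# for the second parity disk, and the conversion happens online.
filer> aggr options aggr0 raidtype raid_dp

# Background disk firmware updates can now run with the system up.
# Once firmware is current, drop back to RAID-4; the dParity disk
# is returned to the spare pool.
filer> aggr options aggr0 raidtype raid4

# Zero the returned spare so it's ready for immediate use.
filer> disk zero spares
```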
Even though it's rare, we have been bitten enough times to keep the separate aggregate... but only on a system large enough that 3 drives don't make a huge difference. This has been a huge debate for some time, and I have gone back and forth depending on the customer requirements.
We have recommended that customers change from RAID4 to RAID-DP and then back to RAID4 (a good idea and workaround), but they often don't want to go through it every time: wait for the rebuild, then zero the spare drive after dropping back to RAID4 so there's no wait for zeroing if the drive is needed later. The only time we can ensure the process is done is when we do the PS onsite for the upgrade... but the PS costs more than a single disk, so for the cost of a drive it's cheaper to just dedicate the extra drive (or 3 drives), as long as there's enough room in the system/shelves.
At other customers, as long as they have 2 aggregates, we have had them automate an ndmpcopy of /etc from the root volume to a volume on the other aggregate. Then, if there are any issues, they can run "aggr options <aggr> root" on the other aggregate from maintenance mode and get the system back up. It would be even more rare to have to wafl_check 2 aggregates, so with enough disks to need at least 2 aggregates this should be as effective, or even more resilient... as long as the /etc copy is done regularly.
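A minimal sketch of that setup, assuming hypothetical volume names: vol0 (root, on aggr0) and vol0_backup (on aggr1). The copy itself would be scheduled from a host-side script or cron job:

```
# Regularly copy the root volume's /etc to a volume on the second aggregate
filer> ndmpcopy /vol/vol0/etc /vol/vol0_backup/etc

# Recovery, if aggr0 won't come up: from maintenance mode, mark the
# other aggregate as the root aggregate, then reboot.
*> aggr options aggr1 root
```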
Are there any updated best practices on this? I heard some of the wafl tools are getting rewritten for 64 bit aggregates to allow for bigger limits...that might make this a non-issue if it can bring up/recover things quicker.
The war stories I can count on one hand, but it's the full hand, and I can't mention names... In a few of those cases, running wafl_check would have been faster, or could have been done online with wafliron, if the root aggregate had been able to come up separately. Most were on SATA-only systems with firmware bugs (which is why we always push for the latest AT-FCX firmware)... it is likely a 1% or even smaller edge case, like Adam said.
The ndmpcopy method works well... keep another volume on another aggregate, but use -f to force overwriting files (or vol copy would work too). The trick is getting it automated, and verifying it is up to date, so you can safely mark the backup root volume as root. One customer made /etc a qtree and used qtree SnapMirror (QSM) to another volume, but the issue there is making the destination writable. The workaround: assign another aggregate as root, which automatically creates an "AUTOROOT" volume; then break the /etc QSM relationship, mark that volume as root, and reboot. This may be a more manageable method using QSM.
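A rough sketch of the QSM variant, with hypothetical names (the /etc qtree in vol0 as the source, vol1 on the second aggregate as the destination); check the exact syntax against your release:

```
# Mirror the /etc qtree to a volume on the second aggregate
filer> snapmirror initialize -S filer:/vol/vol0/etc filer:/vol/vol1/etc

# Recovery: mark the surviving aggregate as root (ONTAP creates an
# "AUTOROOT" volume automatically), then break the mirror so the
# qtree copy becomes writable, mark that volume as root, and reboot.
*> aggr options aggr1 root
filer> snapmirror break /vol/vol1/etc
```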
I also typically rename vol0 to root, but that is more OCD than anything else.
Several SEs (a group of us, and growing) have a wish-list item for two CF cards or two small SSDs in the controller, so root doesn't have to live on disk... but it's probably not as easy as it sounds.
Another interesting thing that we will be seeing more of soon is 64-bit aggregate systems on 8.x. The root aggregate needs to be 32-bit... I haven't tested putting root on a 64-bit aggregate, but I saw in one of the docs that it is not supported. If we have a system where we want all 64-bit aggrs, we'll need a separate 32-bit aggregate for root.
It actually gives a balanced view on separate vs. non-separate aggregates for the root volume:
- For small storage systems where cost concerns outweigh resiliency, a FlexVol based root volume on a regular aggregate might be more appropriate.
- FlexVol recovery commands work at the aggregate level, so all of the aggregate's disks are targeted by the operation. One way to mitigate this effect is to use a smaller aggregate with only a few disks to house the FlexVol volume containing the root volume.
This has quickly become a serious issue with the FAS6080s we have. We are seeing the dreaded java.lang error in FilerView, forcing reboots of the filer to clear the space. We are going to end up making another small aggr and moving vol0 to it, so it's not overwhelmed by I/O for other disks. Personally I'm with you all: what is wrong with a couple of CF cards or some nice SSDs and good old-fashioned RAID1? If anything, use that as the primary and use ndmpcopy (good call, Scott!) to keep a backup should it hit the fan. Come on NetApp, help us out here.