Is it normal to see disk util at near 100% constantly in the 3-disk node root aggregate on only one node of a two-node switchless cluster? It dips down occasionally but stays near 100% for the most part. I'm seeing this on two different two-node switchless clusters.
Consider - the root node aggregate is where a whole ton of node base operations will occur, from logging to configuration data handling, etc. The permanent copy of a lot of stuff is on the node root aggregate.
And with a three disk aggregate the system puts all that on 1 data disk (2 parity). If you build a configuration that is big enough, uses the right services setup, etc., you by design put load onto the node root aggregate, which is then slaved to the speed of a single spindle. You don't have disk details listed, but if the disks are SATA class, the load is somewhat magnified in that each access tends to be a bit slower. The fact that it is busy near 100% during a measurement interval or steadily across measurement intervals is not unexpected.
There is a "but" in all of this: it's a problem only if the node is not processing other requests fast enough because it is waiting for the node root aggregate. If total system performance is otherwise normal or within acceptable limits, then don't worry about it. If system performance isn't good enough, Perfstats and other load counters will reveal whether the workloads are waiting on something in the "processing" phase of the request, and you can then drill down to the node root aggregate if appropriate.
On heavily used systems, I have found a small but measurable difference in performance by increasing the node root aggregates to 5 total disks, giving you three data disks to better respond to needed node root aggregate I/O. The gain is not huge, but after I switched to 5-disk node root aggregates many things just "felt" better and performance appeared to show a small percentage difference. At a large scale, if you have several hundred or a thousand disks in an HA pair, having 10 for node root aggregates isn't a huge deal. It's not quite the same calculus if you have maybe 4 shelves across two nodes, of course.
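To put rough numbers on that trade-off, here is a minimal sketch of the arithmetic. It assumes dual parity (2 parity disks per root aggregate, as described above); the function names and the 24-disks-per-shelf figure are illustrative assumptions of mine, not something taken from your system.

```python
# Rough arithmetic only; function names and shelf size are hypothetical.
PARITY_DISKS = 2  # dual parity, as in the 3-disk root aggregate above


def data_disks(total_disks, parity_disks=PARITY_DISKS):
    """Number of disks left for data after parity is taken out."""
    if total_disks <= parity_disks:
        raise ValueError("aggregate needs more disks than parity disks")
    return total_disks - parity_disks


def root_aggr_share(root_disks_per_node, nodes, total_disks):
    """Fraction of all disks consumed by node root aggregates."""
    return root_disks_per_node * nodes / total_disks


print(data_disks(3))  # 1 data disk in a 3-disk root aggregate
print(data_disks(5))  # 3 data disks in a 5-disk root aggregate

# HA pair with 500 disks, 5-disk root aggregate per node: 10 of 500 disks.
print(root_aggr_share(5, 2, 500))

# 4 shelves, ASSUMING 24 disks per shelf: 10 of 96 disks.
print(root_aggr_share(5, 2, 4 * 24))
```

On the large system the root aggregates take about 2% of the disks; on the assumed 4-shelf system they take a bit over 10%, which is why the calculus changes.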
I hope this helps you.
Lead Storage Engineer
Huron Legal | Huron Consulting Group
NCDA, NCIE - SAN Clustered, Data Protection
Kudos and accepted solutions are always appreciated.
Hi, just to expand a little on what Bob and Paul are saying in their posts. This is not a problem if it does not affect other services.
However, we are seeing a lot of high watermark CPs (type H), meaning that whatever the cluster is doing is maxing out the NVRAM, even to the point where we have a back-to-back CP (type B), i.e. the disk just cannot keep up with the writes. The NVRAM serves the whole system, and if it fills, this affects the entire system's performance, not just the root aggregate. Currently this is not an issue since it is not serving any data.
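If you want to track how often these CP types show up over time, a simple tally is enough. This is only a sketch: it assumes you have already collected the CP type codes from your own samples into a list by hand (it does not parse any command output), and the sample list below is made up for illustration, not taken from this cluster.

```python
from collections import Counter


def cp_share(cp_types, watch=("H", "B")):
    """Return the fraction of samples that fall into each watched CP type."""
    counts = Counter(cp_types)
    total = sum(counts.values())
    if total == 0:
        return {code: 0.0 for code in watch}
    return {code: counts[code] / total for code in watch}


# Hypothetical sample: "H" = high watermark, "B" = back-to-back,
# anything else is just counted toward the total.
sample = ["T", "T", "H", "H", "B", "H", "T", "H"]
print(cp_share(sample))
```

A rising share of type B samples during a failover test would be the thing to watch for, since that is the case where the disk cannot keep up.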
Since there are no data protocol IOPS, I'm assuming you have this in an active/passive configuration. In a failover event, would these operations back off sufficiently to serve data without impact, i.e. are they truly background as suggested in the KB article, and can they be safely ignored?
Has a failover test been performed to measure the impact on service, if any? This will prove if it is an issue or not in your environment.