Here is where I stand. My company will not purchase any software to monitor the filer. I can get some stats from SNMP, and even more from the "stats" command.
However, I am not happy with the data I am able to gather for disk stats, specifically total ops per disk, total ops per raid group, and disk utilization per disk/rg/aggr/volume.
When using sysstat, it is my understanding that the disk utilization displayed is that of the busiest disk. This is not useful considering there are multiple raid groups and aggregates.
Using "stats show disk:*:disk_busy" gives me per-disk data that I can run whatever calculations I want on, but unfortunately it uses the UID as the instance, not the disk id.
I cannot find a way to use the information from "stats show disk" to determine which aggregate, raid group, or volume a disk belongs to. Moreover, I cannot determine what type of disk it is (SAS, SATA, etc.).
The big picture is that I want to gather stats to trend disk I/O performance.
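For what it's worth, here is a minimal Python sketch of the "do whatever calculations I want" step on the output of "stats show disk:*:disk_busy". It assumes each output line looks like object:instance:counter:value, that the instance (the disk UID) may itself contain colons, and that the value is a percentage such as "12%". The sample UIDs are made up, and the function names are only illustrative. The uid_to_group dictionary is hypothetical: nothing in this thread says how to obtain the UID-to-raid-group mapping, so it would have to be built by hand or from some other source.

```python
from collections import defaultdict

# Made-up sample in the assumed "object:instance:counter:value" layout.
SAMPLE = """\
disk:5000C500:0A1B2C3D:00000000:00000000:disk_busy:12%
disk:5000C500:0A1B2C3E:00000000:00000000:disk_busy:30%
disk:5000C500:0A1B2C3F:00000000:00000000:disk_busy:4%
"""


def parse_stats_line(line):
    """Split one line into (object, instance, counter, value).

    The instance may contain colons, so the object is taken from the
    left and the counter and value from the right.
    """
    line = line.strip()
    if line.count(":") < 3:
        return None
    obj, rest = line.split(":", 1)
    instance, counter, value = rest.rsplit(":", 2)
    return obj, instance, counter, value


def disk_busy_by_uid(text):
    """Return {disk UID: busy percentage} for all disk_busy lines."""
    busy = {}
    for line in text.splitlines():
        parsed = parse_stats_line(line)
        if parsed is None:
            continue
        obj, instance, counter, value = parsed
        if obj != "disk" or counter != "disk_busy":
            continue
        try:
            busy[instance] = float(value.rstrip("%"))
        except ValueError:
            # Skip values that are not plain numbers.
            continue
    return busy


def summarize_by_group(busy, uid_to_group):
    """Average and maximum busy percentage per group.

    uid_to_group is a hypothetical, hand-maintained mapping from disk
    UID to raid group or aggregate name; unknown UIDs go to "unmapped".
    """
    groups = defaultdict(list)
    for uid, pct in busy.items():
        groups[uid_to_group.get(uid, "unmapped")].append(pct)
    return {
        name: {"avg": sum(vals) / len(vals), "max": max(vals)}
        for name, vals in groups.items()
    }
```

Feeding it the captured command output and a mapping, for example summarize_by_group(disk_busy_by_uid(text), {"5000C500:0A1B2C3D:00000000:00000000": "rg0"}), gives one average and one maximum per group, which is closer to what is needed for trending than the single figure from sysstat.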
Well ... if you are an experienced programmer you may find it easier to use the Data ONTAP SDK with Perl/Java/PowerShell/C# (and I guess there are a couple more bindings). This at least avoids the need to rely on an uncommitted text representation.
Performance counters are also collected by the filer as hourly stats and per-second performance archives. They are just XML files after all, so processing them could be automated as well. The format is not (publicly) documented, but it can be guessed with enough motivation.
They should be in /etc/log/stats/archive on current versions. But I apologize, I was wrong. While some part of the file is indeed XML, the actual data is binary. The hourly stats are in /etc/log/cm_stats_hourly and are pure text.