We caught up on Slack yesterday, but I will share here too for the community. The root cause is hitting a maximum memory limit for an aggregation performed by the counter manager subsystem in ONTAP. Some counter objects are aggregated, meaning they are summarized from more detailed counters. For example, the workload_volume object has one instance for each volume on the cluster. But the cluster actually keeps track of the metrics per volume per ONTAP node (in the workload_volume:constituent object), because access to a volume might include work done by multiple ONTAP nodes. So if you have 500 vols and a 4-node cluster you'd have 2000 instances in workload_volume:constituent, and ONTAP would then aggregate them to 500 in the workload_volume object. To avoid consuming too much memory during aggregation, ONTAP rejects especially large requests, which is what is happening here.
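To make the instance arithmetic concrete, here is a small sketch of the 500 vols / 4 nodes example. It does not talk to ONTAP at all; the function name and the `vol@node` naming scheme are made up purely for illustration.

```python
from itertools import product

def constituent_instances(volumes, nodes):
    """Build one hypothetical constituent instance name per (volume, node) pair."""
    return [f"{vol}@{node}" for vol, node in product(volumes, nodes)]

volumes = [f"vol{i}" for i in range(500)]   # 500 volumes
nodes = [f"node{i}" for i in range(1, 5)]   # 4-node cluster

# What workload_volume:constituent would hold in this example
constituents = constituent_instances(volumes, nodes)

# What workload_volume would hold after aggregation: one instance per volume
aggregated = {name.split("@")[0] for name in constituents}

print(len(constituents))  # 2000
print(len(aggregated))    # 500
```

The aggregation has to work through every constituent instance for every counter requested, so the work grows with both the node count and the length of the counter list.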
There are a few options. One is to not request counters from aggregated objects and instead do the aggregation in client code. Another might be to reduce the list of counters collected for each instance, which decreases the memory needed during the aggregation. A third option might be to use the counter manager archiver feature, like OATS does. I'll have a look at adapting Harvest to do the aggregation natively for workload_volume, but no guarantees it will make it to the top of the priority list :-0
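As a rough idea of what the first option could look like in client code, here is a minimal sketch. It assumes you have already collected the workload_volume:constituent rows into a list of dicts; the row layout and the counter names `read_ops` and `write_ops` are hypothetical placeholders, not the real counter list, and this is not how Harvest does it today. It simply sums each counter per volume, which suits additive counters such as operation counts; an averaged counter such as latency would need weighting instead.

```python
from collections import defaultdict

def aggregate_by_volume(rows, counters):
    """Sum the given counters across nodes, producing one total per volume."""
    totals = defaultdict(lambda: dict.fromkeys(counters, 0))
    for row in rows:
        for counter in counters:
            totals[row["volume"]][counter] += row[counter]
    return dict(totals)

# Hypothetical constituent rows: one per (volume, node) pair
rows = [
    {"volume": "vol1", "node": "node1", "read_ops": 10, "write_ops": 5},
    {"volume": "vol1", "node": "node2", "read_ops": 7, "write_ops": 1},
    {"volume": "vol2", "node": "node1", "read_ops": 3, "write_ops": 0},
]

result = aggregate_by_volume(rows, ["read_ops", "write_ops"])
print(result["vol1"])  # {'read_ops': 17, 'write_ops': 6}
```

Passing a shorter `counters` list here is only a client-side convenience; the memory saving from the second option comes from asking ONTAP for fewer counters in the first place.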
Hope this helps.
Solution Architect - 3rd Platform - Systems Engineering NetApp EMEA (and author of Harvest)