I've set up Grafana + Graphite + Netapp-Harvest with SDK 5.7 and everything appears to be working fine.
However, I've discovered that data is sometimes missing for stretches of time. We first noticed this in Grafana, where some lines in a graph simply stopped. After further investigation we found that the data is not present in Graphite either.
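To pin down exactly when the gaps occur, something along these lines could scan a series for runs of missing values. This is only a rough sketch, not a tested tool: it assumes the Graphite render API is reachable over HTTP and returns its usual JSON shape (a list of series, each with `datapoints` as `[value, timestamp]` pairs, with `null` where no value was stored). The function names and the `since` default are my own, and the base URL and metric target passed in would be placeholders to replace with real values.

```python
import json
import urllib.parse
import urllib.request


def find_gaps(datapoints):
    """Return (start_ts, end_ts) for each run of consecutive null values.

    `datapoints` is a list of [value, timestamp] pairs, as returned by
    Graphite's render API with format=json.
    """
    gaps = []
    gap_start = None
    last_null_ts = None
    for value, ts in datapoints:
        if value is None:
            if gap_start is None:
                gap_start = ts  # first null of a new gap
            last_null_ts = ts
        elif gap_start is not None:
            # A real value ends the current gap.
            gaps.append((gap_start, last_null_ts))
            gap_start = None
    if gap_start is not None:
        # The series ends while still inside a gap.
        gaps.append((gap_start, last_null_ts))
    return gaps


def fetch_series(base_url, target, since="-6h"):
    """Fetch matching series from the render API as parsed JSON."""
    query = urllib.parse.urlencode(
        {"target": target, "from": since, "format": "json"}
    )
    with urllib.request.urlopen(f"{base_url}/render?{query}") as resp:
        return json.load(resp)


def report_gaps(base_url, target, since="-6h"):
    """Print every gap found in each series matching `target`."""
    for series in fetch_series(base_url, target, since):
        for start, end in find_gaps(series["datapoints"]):
            print(f"{series['target']}: no data from {start} to {end}")
```

Calling `report_gaps` with the Graphite URL and a latency metric path, then again with the matching throughput metric, would show whether the gaps line up in time with each other and with the Harvest poll interval.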
Checking NMON for system performance doesn't show any obvious CPU or memory issues during those periods, so it must be something else.
Is this a common issue? Is this something we can debug?
I've looked in all the logs of graphite and netapp-harvest but I do not see any warnings or errors that could explain or be related to the issues we're seeing.
We've tried several settings in carbon.conf (restarting after each change), e.g.:
MAX_UPDATES_PER_SECOND was raised from 500 to 1000, then 10000, and is now 50000, but it doesn't seem to make any difference.
The same goes for MAX_CREATES_PER_MINUTE: the initial setting was 600, and we have tried 1200, 5000 and 10000 (the current value), but nothing has changed.
I checked the logs (e.g. console.log) but do not see any error or warning messages that indicate an issue, limit or problem.
In the attached image (I had to partly remove the names for security reasons) you can see that Read Latency and Write Latency are missing data. The throughput graphs for the same volume do have data, but there is no latency data for those reads and writes. These are graphs from the 'Volume' dashboard, section 'Top Volume Backend WAFL Layer Drilldown'.