I have Graphite working pretty well except for a few things which I'm still trying to fix. This is something I've really been wanting to see from our systems for a while.
The main thing is that my graphs will only display data for a maximum of 24 hours, even if I zoom out further (e.g. the screenshot below is showing a 48-hour period). I copied all the configs as per your instructions from the Quick Start guide, but when I zoom out I don't see more data.
I'm not sure if the data is actually being collected and just not being displayed or if I don't have the data stored/collected.
Where could I start looking to get more of an idea?
Glad you like it so far and I think I know the problem.
Graphite's storage-schemas.conf file controls the frequency and retention of stored metrics. That file can have many entries, and each entry has a regular expression that is compared against the incoming metric name. The file is processed in order, and the first regex that matches causes the metric's file to be created with that entry's retentions. So having correct entries, in the correct order (especially not having a 'catch all' as the first one), is critical.
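To make the ordering point concrete, here is a minimal Python sketch of "first match wins". It is not carbon's actual code: the section name netapp_example, its pattern, and its retention string are made up for illustration, and the use of re.search is an assumption about how carbon applies the pattern. Only the default catch-all (1 minute samples kept for 1 day) comes from this thread.

```python
import re

# Entries in file order: (section name, pattern, retentions).
# "netapp_example" and its retentions are hypothetical, for illustration only.
CATCH_ALL_FIRST = [
    ("default_1min_for_1day", r".*", "60s:1d"),
    ("netapp_example", r"^netapp\.", "1m:35d,5m:100d"),
]

# Same entries, with the catch-all moved to the end of the file.
CATCH_ALL_LAST = list(reversed(CATCH_ALL_FIRST))


def first_match(metric, schemas):
    """Return the (section, retentions) of the first entry whose pattern matches."""
    for section, pattern, retentions in schemas:
        if re.search(pattern, metric):
            return section, retentions
    return None


metric = "netapp.perf.cluster1.node1.cpu_busy"  # hypothetical metric name
print(first_match(metric, CATCH_ALL_FIRST))
print(first_match(metric, CATCH_ALL_LAST))
```

With the catch-all first, the netapp metric never reaches its own entry and gets the 1 day retention; with the catch-all last, the more specific entry is used.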
As you can see, this is 1min samples for 1 day retention. Maybe you forgot to edit the file and add the strings from the Harvest install guide 1.2.2 section 7.1? Or, maybe you pasted them at the end of the file and not in front of the default catch-all entry?
If this is the case, fix up the file and then future metrics will be created with the correct settings which look like this:
You were right, I had the [default_1min_for_1day] section at the top of the storage-schemas.conf file still. I removed it and added the entry from that other site (that had a year retention - just going to use that for now) to the bottom of the file after the [netapp.*] sections
I removed the old whisper/netapp folder and then restarted the carbon-cache agent. So that should hopefully do it.
I'll check on it tomorrow (it's 10pm here now so I'd better call it a day) to see if things are graphing as I expect. I'll probably need to add some more disk to my server too, as the whisper/netapp folder seems to be growing decently; this was only a test install anyway so I can do that easily.
I have a few other small things I'm struggling with. I'm assuming posting to the discussions here is the best way to ask and (hopefully) get them resolved.
Oh ... ah yes - Graphite is fantastic, this has been something I've been looking for since we put in CDOTA and haven't found anything that could do it like I wanted. Now it's just a matter of interpreting the results to sort out some of our storage "issues".
You can always check if the files are the retention you want by looking in the create log (/opt/graphite/storage/log/carbon-cache/carbon-cache-a/creates.log), or you can use the whisper utility whisper-info.py, which should be installed with Graphite.
In that output you see each 'archive', including how many seconds per point and the number of points to save. You also see immediately how much space each one is consuming.
As for your space utilization, the big space grab occurs during initial discovery of everything, since the files are populated out to their full file size at creation. If you have it on NetApp storage, though, the zeros it fills in at create time are detected and you only consume storage on the array as it's actually filled with real metric data. Note you do need dedupe enabled on the vol (but not necessarily a scheduled job) for zero detection to work.
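As a rough cross-check of those numbers, the retention and on-disk size of each archive can be worked out by hand. The Python sketch below assumes the usual whisper layout as I understand it (16 byte file header, 12 bytes of header per archive, 12 bytes per stored point); the function names are just for this example and are not part of whisper.

```python
# Assumed whisper layout constants (my understanding of the file format).
METADATA_BYTES = 16       # file-level header
ARCHIVE_INFO_BYTES = 12   # per-archive header
POINT_BYTES = 12          # one timestamp plus one value


def archive_summary(seconds_per_point, points):
    """Return (retention in seconds, data bytes) for a single archive."""
    return seconds_per_point * points, points * POINT_BYTES


def whisper_file_size(archives):
    """Estimate the full file size for a list of (seconds_per_point, points)."""
    header = METADATA_BYTES + ARCHIVE_INFO_BYTES * len(archives)
    return header + sum(points * POINT_BYTES for _, points in archives)


# The default catch-all from this thread: 1 minute samples kept for 1 day.
default_archives = [(60, 1440)]
retention, data_bytes = archive_summary(*default_archives[0])
print(retention / 86400, "day(s) of retention")
print(whisper_file_size(default_archives), "bytes on disk")  # 17308
```

Because the file is created at its full size, multiplying the estimate by the number of metric files gives an idea of how much space a given retention setting will claim up front.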
Good luck and indeed if you have more questions post 'em in the communities!
Cheers, Chris Madden
Storage Architect, NetApp EMEA (and author of Harvest)
I had the same issue. Looking back at the install doc, it doesn't say to remove the default retention of 1d. Although I changed the config file based on section 7 of the install document, everything was getting caught by the default retention. This post helped me figure it out. Thanks for the help, NetApp Harvest is great!
After making the changes you suggested, I think I may have accidentally messed up something with the configuration. Harvest is no longer collecting any data since I made the changes. See the screenshot below.
If data isn't being displayed, there is a problem either collecting and sending it (Harvest), receiving and storing it (Graphite), or displaying it (Grafana). My guess is it's one of the first two.
I would check the logfiles from Harvest and Graphite for more. The logfiles for Harvest are in /opt/netapp-harvest/log/<poller>.log, and the ones for Graphite carbon vary a bit depending on the OS and installation method used, but I have documented the locations in the Graphite and Grafana Quick Start guide.
I'd also make sure Harvest is running, because that could be the most basic reason you have no data (/opt/netapp-harvest/netapp-manager -status, and then with -start option to start them if not running).
If this isn't enough please open a new communities thread with the errors you find in the logfiles.
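One way to narrow down which stage is failing is to send a single test metric straight to carbon and see whether a file for it shows up. The Python sketch below assumes carbon's plaintext listener is enabled on its default port 2003; the host name and the metric path test.harvest_debug.ping are placeholders invented for this example.

```python
import socket
import time


def plaintext_line(path, value, timestamp=None):
    """Format one metric in carbon's plaintext form: 'path value timestamp\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n"


def send_test_metric(host, port=2003, path="test.harvest_debug.ping", value=1):
    """Send a single metric line to carbon and return the line that was sent."""
    line = plaintext_line(path, value)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
    return line


if __name__ == "__main__":
    # "graphite.example.com" is a placeholder; use your own Graphite host.
    print(send_test_metric("graphite.example.com"), end="")
```

If the test metric is created and graphs fine, Graphite is receiving and storing data and the Harvest side is the more likely suspect; if it never appears, look at the carbon logs first.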
Hi Chris, I made the change to the database and it stopped displaying my graphs in Grafana. I have checked all pollers for errors and even removed the .wsp files. I restarted the server and confirmed I had no errors, and still nothing. Here is what I did exactly. I ran the set of commands you illustrated below.
After that it stopped reporting to Grafana, but it is definitely pulling data. I'm not sure how to open a different thread as you mentioned to someone in this thread, but we both definitely got the same outcome.