Active IQ Unified Manager Discussions
Hey!
First of all let me thank you for introducing harvest - I think I'm going to love it, although I've never heard of any of it (Harvest, Grafana ...) before this week. I've run into an issue though.
In the "7-node Disk and Cache Layers" dashboard I cannot see all aggregates of certain nodes. It works with some, not with others. Let me explain:
Our NearStore has three aggregates, aggr0, aggr1 and agg2. All three show up just fine and I can select them.
NodeA and NodeB of our MetroCluster each have two aggregates (aggr0 plus aggr_sas and aggr_sata respectively). I can select aggr0 in the dashboard, but not the other aggregates; the dropdown box shows only aggr0.
However - in the diagrams in the dashboard the data is actually present for all aggregates (I have two graphs). Also I've checked in Graphite, the metrics are there.
All three filers run the same Ontap btw (8.2.4P1).
Any idea what might be the cause or how I could go about diagnosing this further? The only differences I can see are the underscore in the name of the aggregate and that the nodes are joined in a MetroCluster.
Regards,
Christian
*EDIT*
You know this ... as soon as you hit the button you have an idea. I realized I should check the dashboard JSON files and did so. It turns out the problem is that the aggregate name does not contain a numeric character. If I understand correctly, the list is filtered with a regex, and [0-9] keeps only names that have a digit in them. So any aggregate name without a digit in it is not shown.
As a "quick fix" I changed the regex to something that matches our shop (aggr_*) and I can now select it just fine.
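To make the effect concrete, here is a minimal Python sketch of the filtering described above. It is not the dashboard's own code: the helper `selectable` and the metric name `disk_busy` at this level are assumptions for illustration, and the match is treated as "pattern found anywhere in the name".

```python
import re

def selectable(names, pattern):
    """Return the names in which the pattern matches anywhere."""
    rx = re.compile(pattern)
    return [name for name in names if rx.search(name)]

# Aggregate names from the post, plus one made-up metric name per filer.
nearstore = ["aggr0", "aggr1", "agg2", "disk_busy"]
node_a = ["aggr0", "aggr_sas", "disk_busy"]

print(selectable(nearstore, r"[0-9]"))  # all three aggregates contain a digit
print(selectable(node_a, r"[0-9]"))     # aggr_sas has no digit and drops out
print(selectable(node_a, r"aggr_*"))    # the quick fix keeps both aggregates
```

With the digit pattern only aggr0 survives on NodeA, while the `aggr_*` pattern keeps both aggregates and still leaves the metric name out.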
Hi @CHMOELLER
You hit it exactly! If you look in graphite you will see this:
This is how I implemented roll-ups, so that the max of each child level is tracked. So if you look at aggr0\disk_busy it is the busiest disk in the aggr, while aggr0\plex0\disk_busy is the busiest disk in plex0, and the same goes for rgs. It can be interesting to see differences between raidgroups (especially when adding rgs, to help inform you if a reallocation would help to get the new disks busier) or between plexes (in case of MC).
Anyway, back to Grafana templates. To populate the list they use the Graphite query API, which returns all items at a level; subdirs and metrics are both in that response. Since I don't want the metric names in the dropdown I used the regex feature and an assumption (which can be wrong!) that everyone will have a digit in their aggr names. It's a hack, but the best one I could think of given the behavior of Graphite and Grafana. If someone can suggest a better way I'm all ears!
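As a rough illustration of the roll-up idea (not Harvest's actual code), the sketch below takes per-disk busy values and records the maximum at the aggr, plex and rg levels. The function name `rollup_max`, the tuple layout and the sample numbers are all made up for the example.

```python
from collections import defaultdict

def rollup_max(disk_busy):
    """Map (aggr, plex, rg, disk) -> busy % onto the max per parent level."""
    out = defaultdict(float)  # busy values are assumed to be non-negative
    for (aggr, plex, rg, _disk), busy in disk_busy.items():
        # Every disk contributes to its aggr, its plex and its raid group.
        for path in ((aggr,), (aggr, plex), (aggr, plex, rg)):
            key = "\\".join(path) + "\\disk_busy"
            out[key] = max(out[key], busy)
    return dict(out)

# Hypothetical sample: three disks spread over two plexes.
samples = {
    ("aggr0", "plex0", "rg0", "d1"): 35.0,
    ("aggr0", "plex0", "rg1", "d2"): 80.0,
    ("aggr0", "plex1", "rg0", "d3"): 55.0,
}
rolled = rollup_max(samples)
print(rolled["aggr0\\disk_busy"])         # busiest disk in the whole aggr
print(rolled["aggr0\\plex1\\disk_busy"])  # busiest disk in plex1
```

Comparing the plex0 and plex1 values, or the rg0 and rg1 values, is the kind of difference the roll-ups are meant to expose.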
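A small Python sketch of why the hack is needed: one level of the tree comes back as a mixed list of sub-directories and metrics, and the template regex only gets to look at the names. The list-of-dicts shape with `text` and `leaf` keys, the sample entries and the helper `dropdown_values` are assumptions for illustration, not a capture from a real Graphite server.

```python
import re

# Hypothetical response for one level: aggregates (sub-directories)
# and a roll-up metric arrive together in the same list.
level_items = [
    {"text": "aggr0", "leaf": 0},
    {"text": "aggr_sas", "leaf": 0},
    {"text": "disk_busy", "leaf": 1},
]

def dropdown_values(items, name_regex):
    """Build the dropdown by applying a regex to the item names only."""
    rx = re.compile(name_regex)
    return [item["text"] for item in items if rx.search(item["text"])]

with_digit = dropdown_values(level_items, r"[0-9]")
print(with_digit)  # the metric name is gone, but so is aggr_sas
```

The digit assumption removes the metric name as intended, and it removes any aggregate without a digit along with it.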
For your situation you have adapted the template correctly. If you update to newer dashboards (I am busy on some updates to work better with Grafana 3.0) you will need to apply your edits again. Or you could rename your aggrs 🙂
Good luck!
P.S. Today Grafana 2.5 dashboards here are the best thing going and fix one error you might still see on some dashboards.
Cheers,
Chris Madden
Storage Architect, NetApp EMEA (and author of Harvest)
Blog: It all begins with data
If this post resolved your issue, please help others by selecting ACCEPT AS SOLUTION or adding a KUDO or both!
Chris,
thanks for going into the details on this, very informative!
I can see why there's a "hack" required ... or at least I cannot see a nicer way of doing this either. Except maybe for adding prefixes to the metrics for cases like this so you have a term to filter for.
Which would of course in turn invalidate "using known terms only" ... and break compatibility with data previously collected. Bummer...
Anyways. Looking forward to your 3.0 adaptation. Gave it a quick try earlier, some of the dashboard templates really didn't seem to like it much.
And thanks again for Harvest ... this approach beats (at least in my very humble opinion) every other (NetApp) tool I have so far used to visualize what's going on!
Regards
Chris
>>Except maybe for adding prefixes to the metrics for cases like this so you have a term to filter for.
Regex allows you to filter IN but not to filter OUT. So no solution there either!
I wish Grafana had an exclude regex option. It's been asked for here (https://github.com/grafana/grafana/issues/4000), but nobody has worked on it yet and I don't have the JavaScript skills. But maybe someone reading this does and can submit a pull request? One can always hope 🙂
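To show what such an exclude option would buy, here is a plain Python sketch. It is not Grafana code and not a proposed patch: the function `include_exclude`, its parameters and the sample names are invented for the example.

```python
import re

def include_exclude(names, include=r".*", exclude=None):
    """Keep names matching `include`, then drop any matching `exclude`."""
    rx_in = re.compile(include)
    rx_out = re.compile(exclude) if exclude else None
    return [
        name for name in names
        if rx_in.search(name) and not (rx_out and rx_out.search(name))
    ]

# Hypothetical items at one level: three aggregates and one metric.
names = ["aggr0", "aggr_sas", "aggr_sata", "disk_busy"]
kept = include_exclude(names, exclude=r"^disk_busy$")
print(kept)  # every aggregate stays, whatever it is called
```

Excluding the known metric names would leave every aggregate in the dropdown without any assumption about how the aggregates are named.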
Glad you like Harvest and good luck with it!
Cheers,
Chris Madden