
Harvest and histogram counters

markweber
7,491 Views

does anyone know if Harvest supports histogram counters (like volume:{instance}:cifs_protocol_read_latency or cifs:node:{instance}:cifs_latency_hist)?

 

thanks

mark

1 ACCEPTED SOLUTION

madden
7,462 Views

Hi @markweber,

 

Harvest understands CM metadata and can post metrics for any CM counter type, including 2D and 3D arrays, so the CIFS histogram counter you mentioned should work fine.  If you add such counters to a custom template that already uses a plugin for summarization, you do need to verify that the plugin's post-processing logic is compatible with the newly added counters.  For cdot-cifs, the plugin seems only to replace the slashes in a counter named Open/Close, so there is no issue there.  For other objects such as volume, you might have to adjust the plugin if you want roll-up vol_summary metrics created.

 

So for 8.3 you could update the template to add cifs_latency_hist (the last entry in counter_list below):

 

	'cifs:node' =>
			{ 
				counter_list     => [ qw(instance_name instance_uuid
									cifs_op_count
									cifs_ops cifs_read_ops cifs_write_ops
									cifs_latency cifs_read_latency cifs_write_latency
									connections established_sessions open_files signed_sessions
									cifs_latency_hist
									) ],
				graphite_leaf    => 'node.{instance_name}.cifs',
				plugin           => 'cdot-cifs',
				enabled          => '1'
			},

Then in your metrics tree you will see new ones like:

node.{nodename}.cifs.cifs_latency_hist.<20us

node.{nodename}.cifs.cifs_latency_hist.<40us

...
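
As a rough illustration of how one histogram counter fans out into one metric per bucket, here is a minimal Perl sketch. It is not Harvest code: the helper name histogram_metric_paths, the node name node01, and the two-label list are assumptions made for the example, and the real bucket labels come from the counter metadata.

```perl
use strict;
use warnings;

# Hypothetical helper (not Harvest code): build one Graphite metric path per
# histogram bucket from a leaf template, a counter name, and bucket labels.
sub histogram_metric_paths {
    my ($leaf_template, $instance, $counter, $labels) = @_;

    # Fill in the {instance_name} placeholder used in graphite_leaf.
    (my $leaf = $leaf_template) =~ s/\{instance_name\}/$instance/g;

    # One path per bucket label: <leaf>.<counter>.<label>
    return map { "$leaf.$counter.$_" } @$labels;
}

my @paths = histogram_metric_paths(
    'node.{instance_name}.cifs',   # graphite_leaf from the template above
    'node01',                      # assumed node name
    'cifs_latency_hist',
    [ '<20us', '<40us' ],          # only the first two buckets, for brevity
);
print "$_\n" for @paths;
```

Each bucket ends up as its own leaf under the counter name, which is why the labels (including any special characters) become part of the metric name.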

 

Cheers,
Chris Madden

Storage Architect, NetApp EMEA (and author of Harvest)

Blog: It all begins with data

 

If this post resolved your issue, please help others by selecting ACCEPT AS SOLUTION or adding a KUDO


7 REPLIES

markweber
7,454 Views

Awesome - thanks Chris!

markweber
7,443 Views

as a followup question -

is it possible to use the override hash to change/map the label values?

something like:

'cifs:node' =>
                                {
                                        cifs_latency_hist => { labels => {20us,40us,60us,80us,100us,200us,400us,600us,800us,1ms,2ms,4ms,6ms,8ms,10ms,12ms,14ms,16ms,18ms,20ms,40ms,60ms,80ms,100ms,200ms,400ms,600ms,800ms,1s,2s,4s,6s,8s,10s,20s,30s,60s,90s,120s,inf}}
                                },

 

Grafana seems to have issues with <,>,= in the metric names and using a plugin regex seems impractical (at least for my weak regex skills)

 

thanks

m

 

markweber
7,402 Views

after a little bit of trial and error, this seems to work:

 

'cifs:node' =>
                                {
                                        cifs_latency_hist => { 'label-info' => '20us,40us,60us,80us,100us,200us,400us,600us,800us,1ms,2ms,4ms,6ms,8ms,10ms,12ms,14ms,16ms,18ms,20ms,40ms,60ms,80ms,100ms,200ms,400ms,600ms,800ms,1s,2s,4s,6s,8s,10s,20s,30s,60s,90s,120s,inf' },
                                },

 

madden
7,395 Views

Hi @markweber

 

Actually that label syntax is not something I've designed in so I would be surprised if it works.  

 

I was working with another customer today and verified this:

 

	'volume:node' =>
			{ 
				counter_list     => [ qw(instance_name
									total_protocol_read_latency
									total_protocol_write_latency
									)],
				graphite_leaf    => 'node.{instance_name}.vol_summary',				
				enabled          => '1'
			},

and it displays in Graphite like this:

special_symbols.png

 

So I think your original syntax was the correct one.

 

Cheers,
Chris Madden

Storage Architect, NetApp EMEA (and author of Harvest)

Blog: It all begins with data

 


 

 

markweber
7,333 Views

Hi Chris -

 

Both work great - I kind of switched contexts on you for the second question.

I added histogram counters as below:

 

        'nfsv3:node' =>
                        { 
                                counter_list     => [ qw(instance_name instance_uuid
                                                                        nfsv3_latency_hist read_latency_hist write_latency_hist
                                                                        ) ],
                                graphite_leaf    => 'node.{instance_name}.nfsv3',
                                enabled          => '1'
                        },
        'cifs:node' =>
                        { 
                                counter_list     => [ qw(instance_name instance_uuid
                                                                        cifs_latency_hist cifs_read_latency_hist cifs_write_latency_hist
                                                                        ) ],
                                graphite_leaf    => 'node.{instance_name}.cifs',
                                plugin           => 'cdot-cifs',
                                enabled          => '1'
                        },

and that worked perfectly, but Grafana had trouble with the special characters, so i added the following to the override definitions

 

 

%override = (
                                'cifs:node' =>
                                {
                                        cifs_latency_hist        => { 'label-info' => '20us,40us,60us,80us,100us,200us,400us,600us,800us,1ms,2ms,4ms,6ms,8ms,10ms,12ms,14ms,16ms,18ms,20ms,40ms,60ms,80ms,100ms,200ms,400ms,600ms,800ms,1s,2s,4s,6s,8s,10s,20s,30s,60s,90s,120s,inf' },
                                        cifs_read_latency_hist   => { 'label-info' => '20us,40us,60us,80us,100us,200us,400us,600us,800us,1ms,2ms,4ms,6ms,8ms,10ms,12ms,14ms,16ms,18ms,20ms,40ms,60ms,80ms,100ms,200ms,400ms,600ms,800ms,1s,2s,4s,6s,8s,10s,20s,30s,60s,90s,120s,inf' },
                                        cifs_write_latency_hist  => { 'label-info' => '20us,40us,60us,80us,100us,200us,400us,600us,800us,1ms,2ms,4ms,6ms,8ms,10ms,12ms,14ms,16ms,18ms,20ms,40ms,60ms,80ms,100ms,200ms,400ms,600ms,800ms,1s,2s,4s,6s,8s,10s,20s,30s,60s,90s,120s,inf' },
                                },

                                'nfsv3:node' =>
                                {
                                        nfsv3_latency_hist     => { 'label-info' => '20us,40us,60us,80us,100us,200us,400us,600us,800us,1ms,2ms,4ms,6ms,8ms,10ms,12ms,14ms,16ms,18ms,20ms,40ms,60ms,80ms,100ms,200ms,400ms,600ms,800ms,1s,2s,4s,6s,8s,10s,20s,30s,60s,90s,120s,inf' },
                                        read_latency_hist      => { 'label-info' => '20us,40us,60us,80us,100us,200us,400us,600us,800us,1ms,2ms,4ms,6ms,8ms,10ms,12ms,14ms,16ms,18ms,20ms,40ms,60ms,80ms,100ms,200ms,400ms,600ms,800ms,1s,2s,4s,6s,8s,10s,20s,30s,60s,90s,120s,inf' },
                                        write_latency_hist     => { 'label-info' => '20us,40us,60us,80us,100us,200us,400us,600us,800us,1ms,2ms,4ms,6ms,8ms,10ms,12ms,14ms,16ms,18ms,20ms,40ms,60ms,80ms,100ms,200ms,400ms,600ms,800ms,1s,2s,4s,6s,8s,10s,20s,30s,60s,90s,120s,inf' },
                                },
);

which renamed the metrics from '<20us' to '20us', etc
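
For anyone reading along, here is a minimal Perl sketch of the renaming idea. It assumes the 'label-info' string is applied positionally to the counter's original bucket labels; that is an assumption about the behaviour, not something confirmed from the Harvest source. The relabel helper and the three-bucket label list are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical illustration (not Harvest internals): apply a comma-separated
# 'label-info' override positionally to a counter's original bucket labels.
sub relabel {
    my ($original, $label_info) = @_;
    my @new = split /,/, $label_info;

    # A positional mapping only makes sense if both lists have the same length.
    die "label count mismatch: " . scalar(@$original) . " vs " . scalar(@new) . "\n"
        unless @new == @$original;

    my %map;
    @map{@$original} = @new;    # hash slice: original label => new label
    return \%map;
}

# Assumed original labels for the first three buckets only.
my $map = relabel([ '<20us', '<40us', '<60us' ], '20us,40us,60us');
print "$_ -> $map->{$_}\n" for sort keys %$map;
```

If the mapping really is positional, a missing or extra entry in the override string would shift every later bucket, so it is worth comparing the label count against the original counter before relying on the renamed metrics.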

 

i was able to get a dashboard built that i'm fairly happy with - 

dashboard json is here if anyone is interested: https://gist.github.com/mrkwbr/bd69eb8dfba429dd2a67#file-netapp-detail-node-latency-histogram-json

 

thanks again for your help

mark

 

histogram.jpg

 

madden
7,323 Views

Hi @markweber

 

Great job extending Harvest for your needs, this is exactly what I hoped people would do!

 

Indeed, I didn't understand your label mapping table at first, but now I see that you added that snippet to the overrides table and not the main poller hash.  Maybe I should edit Harvest to detect labels that include <, >, or = and replace them with lt, gt, and eq.  Thoughts?  Because Graphite allowed them (I think = as well, but I'm not positive), it hasn't come up before.  But if Grafana has trouble with them, maybe it is something to modify across the board.
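
A minimal Perl sketch of what such a replacement could look like is below. It is only an illustration of the idea, not code from Harvest, and the function name sanitize_label is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sanitizer (not part of Harvest): spell out comparison
# characters so the label avoids <, >, and = in the metric name.
sub sanitize_label {
    my ($label) = @_;
    my %word = ('<' => 'lt', '>' => 'gt', '=' => 'eq');

    # Replace each special character with its two-letter word.
    $label =~ s/([<>=])/$word{$1}/g;
    return $label;
}

# Prints lt20us, gt120s, and lteq1ms, one per line.
print sanitize_label($_), "\n" for '<20us', '>120s', '<=1ms';
```

Unlike a per-counter 'label-info' override, a substitution like this would not need the full bucket list to be typed out, but it does change metric names for every counter that has such labels.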

 

Cheers,
Chris
