<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: Harvest/Graphite - "spotty" data in Active IQ Unified Manager Discussions</title>
    <link>https://community.netapp.com/t5/Active-IQ-Unified-Manager-Discussions/Harvest-Graphite-quot-spotty-quot-data/m-p/130788#M23674</link>
    <description>Discussion thread: intermittent ("spotty") datapoints in Harvest/Graphite volume latency graphs, traced to Harvest's latency_io_reqd suppression threshold.</description>
    <pubDate>Thu, 04 May 2017 07:59:06 GMT</pubDate>
    <dc:creator>madden</dc:creator>
    <dc:date>2017-05-04T07:59:06Z</dc:date>
    <item>
      <title>Harvest/Graphite - "spotty" data</title>
      <link>https://community.netapp.com/t5/Active-IQ-Unified-Manager-Discussions/Harvest-Graphite-quot-spotty-quot-data/m-p/118734#M21182</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I browsed through other posts that were related to our issue, but couldn't find anything like what we're running into. &amp;nbsp;Anyway, we're just now dabbling with Graphite/Grafana/Harvest and we're pretty happy with the granularity of metrics that we think the tool will provide (similar to what we had with DFM and our 7DOT systems). &amp;nbsp;That said, we're drilling down into some detail metrics and starting to notice that a lot of the data has some strange gaps. &amp;nbsp;A particular volume, as an example, will have read and average latency statistics recorded as a continuous chart, but then the write statistics will only have a data point every half-hour or so (recorded as a "blip" on the chart). &amp;nbsp;We've seen issues like this with other performance monitoring tools when the node/filer gets too busy (which is very unlikely in the case of the cDOT system we're trying to monitor) or when the server hosting the tool gets too busy (again, doesn't seem likely here).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Just wondering if anyone has run into something similar and/or if the community has any pointers as to where we should look. &amp;nbsp;I've attached a sample graph of what we're seeing, in case that helps. &amp;nbsp;The spotty datapoints have been circled (since they can be hard to pick out). &amp;nbsp;Since these two volumes are in the same SVM and the metric category is the same, I wouldn't suspect that they somehow have different sampling rates - and depending on the volume, metrics that are spotty on some are consistent/continuous on others.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Chris&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;IMG src="https://community.netapp.com/t5/image/serverpage/image-id/5226i11ECE68EB5517D70/image-size/original?v=v2&amp;amp;px=-1" border="0" alt="Capture.JPG" title="Capture.JPG" /&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 26 Apr 2016 16:05:19 GMT</pubDate>
      <guid>https://community.netapp.com/t5/Active-IQ-Unified-Manager-Discussions/Harvest-Graphite-quot-spotty-quot-data/m-p/118734#M21182</guid>
      <dc:creator>colsen</dc:creator>
      <dc:date>2016-04-26T16:05:19Z</dc:date>
    </item>
    <item>
      <title>Re: Harvest/Graphite - "spotty" data</title>
      <link>https://community.netapp.com/t5/Active-IQ-Unified-Manager-Discussions/Harvest-Graphite-quot-spotty-quot-data/m-p/118762#M21192</link>
      <description>&lt;P&gt;Hello, and thank you for bringing this issue up.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We have a similar situation here - I've seen this on individual SVM volume stats in Graphite/Grafana, especially with write latency.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The view in Graphite:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;IMG src="https://community.netapp.com/t5/image/serverpage/image-id/5228i5C565B2F85E5BB86/image-size/original?v=v2&amp;amp;px=-1" alt="graphite_svc_volume_spotty_1h.jpg" title="graphite_svc_volume_spotty_1h.jpg" border="0" /&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;and here in Grafana:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;IMG src="https://community.netapp.com/t5/image/serverpage/image-id/5229iB3135A2209DF1DDF/image-size/original?v=v2&amp;amp;px=-1" alt="grafana_svc_volume_spotty_1h.jpg" title="grafana_svc_volume_spotty_1h.jpg" border="0" /&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Are these gaps due to 'holes' in the data rows, or to Harvest's processing?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The log of this poller looks pretty clean:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;[2016-04-26 18:21:44] [NORMAL ] Poller status: status, secs=14400, api_time=645, plugin_time=7, metrics=203492, skips=0, fails=0
[2016-04-26 22:21:44] [NORMAL ] Poller status: status, secs=14400, api_time=646, plugin_time=7, metrics=195650, skips=0, fails=0
[2016-04-27 02:21:44] [NORMAL ] Poller status: status, secs=14400, api_time=651, plugin_time=7, metrics=194791, skips=0, fails=0
[2016-04-27 06:21:44] [NORMAL ] Poller status: status, secs=14400, api_time=646, plugin_time=7, metrics=196353, skips=0, fails=0
[2016-04-27 10:21:44] [NORMAL ] Poller status: status, secs=14400, api_time=648, plugin_time=8, metrics=208015, skips=0, fails=0
[2016-04-27 14:21:44] [NORMAL ] Poller status: status, secs=14400, api_time=649, plugin_time=7, metrics=209007, skips=0, fails=0&lt;/PRE&gt;&lt;P&gt;Our monitoring interval is 1 minute.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you for sharing your information :-)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Yours,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Martin Barth&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 27 Apr 2016 12:51:32 GMT</pubDate>
      <guid>https://community.netapp.com/t5/Active-IQ-Unified-Manager-Discussions/Harvest-Graphite-quot-spotty-quot-data/m-p/118762#M21192</guid>
      <dc:creator>MARTINBARTH</dc:creator>
      <dc:date>2016-04-27T12:51:32Z</dc:date>
    </item>
    <item>
      <title>Re: Harvest/Graphite - "spotty" data</title>
      <link>https://community.netapp.com/t5/Active-IQ-Unified-Manager-Discussions/Harvest-Graphite-quot-spotty-quot-data/m-p/118787#M21199</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I suspect you have very low IOP volumes and the "latency_io_reqd" squelch is kicking in. &amp;nbsp;From the Harvest admin guide:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;latency_io_reqd&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Latency metrics can be inaccurate if the IOP count is low. This&lt;BR /&gt;parameter sets a minimum number of IOPs required before a&lt;BR /&gt;latency metric will be submitted and can help reduce confusion&lt;BR /&gt;from high latency but low IOP situations. Supported for volumes&lt;BR /&gt;and QoS instances only.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If you set this parameter to 0 then latency values&amp;nbsp;will never be suppressed - so no spotty data - but you may also see more latency outliers due to the low accuracy when the IOP count is low.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Set it to 0, restart the pollers, and share whether you think the result is better or worse than the default.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Cheers,&lt;BR /&gt;Chris Madden&lt;/P&gt;&lt;P&gt;Storage Architect, NetApp EMEA (and author of Harvest)&lt;/P&gt;&lt;P&gt;Blog:&amp;nbsp;&lt;A href="http://blog.pkiwi.com/" target="_blank"&gt;It all begins with data&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;If this post resolved your issue, please help others by selecting&amp;nbsp;&lt;STRONG&gt;ACCEPT AS SOLUTION&lt;/STRONG&gt;&amp;nbsp;or adding a&amp;nbsp;&lt;STRONG&gt;KUDO&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;</description>
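      <!--
        A minimal sketch of the change suggested above, assuming a Harvest 1.x
        netapp-harvest.conf with one section per poller; the section name and
        the hostname/site values are placeholders, and latency_io_reqd can
        also be set once under [default] to apply to all pollers:

          [cluster01]
          hostname        = cluster01.example.com
          site            = dc1
          # 0 = never suppress latency metrics, even at very low IOPs
          latency_io_reqd = 0

        Restart the affected pollers afterwards for the change to take effect.
      -->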
      <pubDate>Thu, 28 Apr 2016 10:10:21 GMT</pubDate>
      <guid>https://community.netapp.com/t5/Active-IQ-Unified-Manager-Discussions/Harvest-Graphite-quot-spotty-quot-data/m-p/118787#M21199</guid>
      <dc:creator>madden</dc:creator>
      <dc:date>2016-04-28T10:10:21Z</dc:date>
    </item>
    <item>
      <title>Re: Harvest/Graphite - "spotty" data</title>
      <link>https://community.netapp.com/t5/Active-IQ-Unified-Manager-Discussions/Harvest-Graphite-quot-spotty-quot-data/m-p/118794#M21201</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for the response. &amp;nbsp;I did some spot checking and built some graphs with read/write ops and then read/write latency. &amp;nbsp;Sure enough, if the operation count in question drops below ~5 or so, the corresponding latency number gets spotty. &amp;nbsp;Since my original graph is of our Exchange cluster, it's doing mostly reads with a handful of writes - therefore the read latency is consistent and the write latency metric falls off from time to time. &amp;nbsp;I then did some checking against our Splunk cluster (which is doing 99% writes) and it has consistent write latency numbers but hardly ever gets read_ops high enough to trigger a latency metric.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We'll play around with that latency_io_reqd setting and see what we get back.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks again!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Chris&lt;/P&gt;</description>
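      <!--
        The spot check described above can be reproduced with a pair of
        Graphite targets that plot ops and latency for the same volume side
        by side. The paths below assume Harvest's default
        netapp.perf.<site>.<cluster>... metric layout; the site, cluster,
        SVM, and volume names are placeholders:

          netapp.perf.dc1.cluster01.svm.svm_exch.vol.vol_db01.write_ops
          netapp.perf.dc1.cluster01.svm.svm_exch.vol.vol_db01.write_latency

        Wherever write_ops stays below latency_io_reqd (10/s by default),
        write_latency should show a matching gap.
      -->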
      <pubDate>Thu, 28 Apr 2016 12:28:37 GMT</pubDate>
      <guid>https://community.netapp.com/t5/Active-IQ-Unified-Manager-Discussions/Harvest-Graphite-quot-spotty-quot-data/m-p/118794#M21201</guid>
      <dc:creator>colsen</dc:creator>
      <dc:date>2016-04-28T12:28:37Z</dc:date>
    </item>
    <item>
      <title>Re: Harvest/Graphite - "spotty" data</title>
      <link>https://community.netapp.com/t5/Active-IQ-Unified-Manager-Discussions/Harvest-Graphite-quot-spotty-quot-data/m-p/118848#M21218</link>
      <description>&lt;P&gt;Hello Chris!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you for your 'enlightening' answer - your description and pointer to the Harvest documentation did the trick &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You're the best - thank you for your excellent work on netapp-harvest - it's a pleasure watching these graphs &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Yours,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Martin Barth&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 29 Apr 2016 16:50:11 GMT</pubDate>
      <guid>https://community.netapp.com/t5/Active-IQ-Unified-Manager-Discussions/Harvest-Graphite-quot-spotty-quot-data/m-p/118848#M21218</guid>
      <dc:creator>MARTINBARTH</dc:creator>
      <dc:date>2016-04-29T16:50:11Z</dc:date>
    </item>
    <item>
      <title>Re: Harvest/Graphite - "spotty" data</title>
      <link>https://community.netapp.com/t5/Active-IQ-Unified-Manager-Discussions/Harvest-Graphite-quot-spotty-quot-data/m-p/130786#M23673</link>
      <description>&lt;P&gt;&lt;a href="https://community.netapp.com/t5/user/viewprofilepage/user-id/7599"&gt;@madden&lt;/a&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hi Chris&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm using Harvest to monitor vscan latency on cDOT systems. What I have noticed is that the "latency_io_reqd" parameter does suppress the vscan latency. The Harvest guide says the default value is 10. I'm not entirely sure how the Harvest code works this out, but in our environment, before I set this to 0, the vscan latency showed as 0 in Grafana before 8am, even though the scan base counter was more than 10 (a few hundred from time to time). After setting it to 0, we can see scan latency being captured completely at all times. So maybe the default value used to ignore low IOPS isn't 10?&lt;/P&gt;</description>
      <pubDate>Thu, 04 May 2017 05:09:58 GMT</pubDate>
      <guid>https://community.netapp.com/t5/Active-IQ-Unified-Manager-Discussions/Harvest-Graphite-quot-spotty-quot-data/m-p/130786#M23673</guid>
      <dc:creator>lisa5</dc:creator>
      <dc:date>2017-05-04T05:09:58Z</dc:date>
    </item>
    <item>
      <title>Re: Harvest/Graphite - "spotty" data</title>
      <link>https://community.netapp.com/t5/Active-IQ-Unified-Manager-Discussions/Harvest-Graphite-quot-spotty-quot-data/m-p/130788#M23674</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.netapp.com/t5/user/viewprofilepage/user-id/1667"&gt;@lisa5&lt;/a&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I checked the code, and latency_io_reqd applies to any counter that has a property of 'average' and 'latency' in its name. &amp;nbsp;If both are true, it then checks the base counter, which needs at least latency_io_reqd IOPs for the latency to be reported. &amp;nbsp;Keep in mind that the raw base counter is sometimes a delta value, like the # of scans since the last poll, so you must divide it by the elapsed time between polls to get the per-second rate. &amp;nbsp;Only if this rate is at least latency_io_reqd (10 by default) is the latency submitted. &amp;nbsp;My experience is that very low IOP counters tend to have wacky latency figures that create distracting graphs, which is why I added this feature. &amp;nbsp;You can certainly set latency_io_reqd = 0 to disable it entirely. &amp;nbsp;You could also run two pollers per cluster, one for only vscan and the other for the rest, with differing latency_io_reqd values. &amp;nbsp;It could also be that I need to add a hard-coded exception for vscan latency if it is accurate even at very low op levels. &amp;nbsp;When I add vscan support to Harvest I'll check this.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hope this helps!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Cheers,&lt;BR /&gt;Chris Madden&lt;/P&gt;&lt;P&gt;Solution Architect - 3rd Platform - Systems Engineering NetApp EMEA (and author of Harvest)&lt;/P&gt;&lt;P&gt;Blog:&amp;nbsp;&lt;A href="http://www.beginswithdata.com/" target="_blank"&gt;It all begins with data&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;If this post resolved your issue, please help others by selecting&amp;nbsp;&lt;STRONG&gt;ACCEPT AS SOLUTION&lt;/STRONG&gt;&amp;nbsp;or adding a&amp;nbsp;&lt;STRONG&gt;KUDO&lt;/STRONG&gt; or both!&lt;/EM&gt;&lt;/P&gt;</description>
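      <!--
        A minimal Python sketch of the rule described above (illustrative
        only; Harvest itself is written in Perl and the names here are
        invented for the example):

          def should_submit_latency(name, prop, base_delta, elapsed_secs,
                                    latency_io_reqd=10):
              """Decide whether a latency datapoint is submitted."""
              # Suppression only applies to 'average'-type counters with
              # 'latency' in the name; everything else always passes.
              if prop != "average" or "latency" not in name:
                  return True
              # The raw base counter can be a delta (e.g. scans since the
              # last poll), so convert it to a per-second rate first.
              rate = base_delta / elapsed_secs
              # latency_io_reqd = 0 disables suppression entirely.
              return rate >= latency_io_reqd

          # 300 scans over a 60s poll is 5/s, under the default of 10,
          # so the latency datapoint is suppressed:
          should_submit_latency("scan_latency", "average", 300, 60)  # False

        This is why a base counter of "a few hundred" per poll can still be
        suppressed: the threshold applies to the per-second rate, not to
        the raw delta.
      -->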
      <pubDate>Thu, 04 May 2017 07:59:06 GMT</pubDate>
      <guid>https://community.netapp.com/t5/Active-IQ-Unified-Manager-Discussions/Harvest-Graphite-quot-spotty-quot-data/m-p/130788#M23674</guid>
      <dc:creator>madden</dc:creator>
      <dc:date>2017-05-04T07:59:06Z</dc:date>
    </item>
  </channel>
</rss>

