
Active IQ Unified Manager Discussions

High node latencies in NetApp Harvest 1.4.1 since ONTAP 9.4

jtoaspern

Hello,


After upgrading our clusters from ONTAP 9.3P4 to 9.4P3 last week, the node latency data from netapp-harvest in Grafana seems to be unrealistically high (NetApp Dashboard: Cluster -> Highlights -> Latency).

We are using netapp-harvest 1.4.1 without the hotfix, because the download link appears to have expired:

https://community.netapp.com/t5/OnCommand-Storage-Management-Software-Discussions/NetApp-Harvest-1-4-1-Hotfix-to-fix-2-bugs/m-p/144160#M26247

We copied cdot-9.3.0.conf to cdot-9.4.0.conf and restarted netapp-harvest's pollers.

 

On the ONTAP CLI, the latencies of all nodes stay within the 100-1500 us range (statistics node show -interval 5 -iterations 50 -max 4), which is much lower than the latencies reported by netapp-harvest.

 

In the attached picture, the latency increase since the upgrade to 9.4 is clearly visible.

 

How are these latencies calculated?
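
For context on the question above: ONTAP latency counters are generally cumulative, and an average latency for an interval is derived by dividing the change in the latency counter by the change in its base (operations) counter between two polls. The sketch below illustrates that arithmetic only. The field names `latency_total_us` and `ops_total` and the sample values are hypothetical, and this is not taken from the netapp-harvest source.

```python
def average_latency_us(prev, curr):
    """Average latency per operation between two cumulative samples.

    Returns None when no operations completed in the interval or a counter
    went backwards (for example after a reset), since the ratio is undefined.
    """
    delta_latency = curr["latency_total_us"] - prev["latency_total_us"]
    delta_ops = curr["ops_total"] - prev["ops_total"]
    if delta_ops <= 0 or delta_latency < 0:
        return None
    return delta_latency / delta_ops


# Hypothetical samples from two consecutive polls of one node.
prev = {"latency_total_us": 1_000_000, "ops_total": 2_000}
curr = {"latency_total_us": 1_600_000, "ops_total": 3_000}

print(average_latency_us(prev, curr))  # 600.0 (microseconds per op)
```

If the two counters are not sampled consistently, or the wrong base counter is used, the resulting average can be far off from what the CLI shows, even though the underlying counters are fine.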

 

There are no entries in the cluster's /opt/netapp-harvest/log/*.log files other than NORMAL poller status messages.


This came up as an off-topic discussion in a few other threads that are already marked as solved, which is why I am opening a new one.

 

Edit: Latencies reported by OCUM's graphs are in line with the values seen on the CLI.


Kind Regards

Joel

1 ACCEPTED SOLUTION

jtoaspern

Looks like NetApp Harvest 1.4.2 fixed the issue. The node latencies shown in Grafana are "realistic" again.

3 REPLIES

suren

Hi Joel,

 

Can you please try to access the link now? It has been updated.

 

https://community.netapp.com/t5/OnCommand-Storage-Management-Software-Discussions/NetApp-Harvest-1-4-1-Hotfix-to-fix-2-bugs/m-p/144160#M26247

 

Thanks & Regards,

Surendra

jtoaspern

Hi Surendra,

 

Thanks for re-uploading the hotfix. I applied it earlier today, but it does not seem to have affected the node latency statistics in any way.

 

There is a distinct pattern in the latency spikes: every 5 minutes the latency goes up to 10-30 ms, while staying at a relatively calm 0-2 ms between those spikes. 0-2 ms would be in line with the data seen with "statistics node show -iterations 50". I kept a close eye on the statistics shown on the CLI and saw no such pattern there, just a continuous 0.3-1.5 ms on all nodes.

The latency graphs under "Dashboard: Node", which show read/write/other for a single node, are pretty close to the CLI and OCUM statistics, only occasionally spiking above 4 ms.
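
To check whether spikes like these really recur on a fixed schedule, one option is to export the latency series and measure the gaps between spike onsets. The following is a minimal sketch on synthetic data. The function name `spike_intervals`, the 10 ms threshold, and the one-sample-per-minute series are assumptions for illustration, not part of netapp-harvest or Grafana.

```python
from statistics import median


def spike_intervals(samples, threshold_ms):
    """Return the gaps (in seconds) between the starts of consecutive spikes.

    samples is a list of (timestamp_seconds, latency_ms) tuples in time order.
    A spike starts when latency reaches the threshold after being below it.
    """
    onsets = []
    in_spike = False
    for timestamp, latency in samples:
        above = latency >= threshold_ms
        if above and not in_spike:
            onsets.append(timestamp)
        in_spike = above
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]


# Synthetic series: one sample per minute, with a 20 ms spike every fifth sample.
samples = [(minute * 60, 20.0 if minute % 5 == 0 else 1.0) for minute in range(20)]

gaps = spike_intervals(samples, threshold_ms=10.0)
print(gaps, median(gaps))
```

A median gap close to 300 seconds with little variation would point at something periodic, such as a polling or scheduling interval, rather than at real workload.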

I attached a screenshot relevant to the matter.

 

The spikes may be an indicator of some other factor not yet considered here.

 

Please note that this does not have any actual negative impact on our production. The behaviour since our 9.3 -> 9.4 update just seems odd.

 

Regards

Joel
