Harvest Poller stops running after extended period of time with connection issues

acjackson

Hi,

 

I have an issue with two Harvest pollers that stop working after occasional connection/network issues lasting longer than about 4 hours.

 

There are log entries like

 

[WARNING] [nfsv3] update of data cache failed with reason: Server returned HTTP Error: 408 Request

and 

[WARNING] [path] update of data cache failed with reason: in Zapi::invoke, cannot connect to socket

 

Shouldn't the pollers run forever, or is there a timeout? I only get log entries for about 4 hours after the first error; after that there are no more entries until I restart the poller.

Then the pollers work until the next extended period of time with connection issues.

1 ACCEPTED SOLUTION

madden

Hi @acjackson

 

Harvest is designed to keep trying forever. However, some of the other modules it uses (SSL and the NetApp SDK, to name two) may treat certain situations as fatal. If I knew where it was failing I could potentially wrap it to prevent this, but I'm inclined to look for a solution outside of Harvest.

 

One solution could be to use supervisord to [re]start each of your Harvest pollers, with the config variable autorestart=1.
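To make the supervisord idea a little more concrete, here is a minimal Python sketch that prints one program section per poller. It is only an illustration: the worker path, the `-poller` argument, the `harvest-` program-name prefix and the poller names are assumptions and should be checked against your own installation. `autorestart=true` is used as the boolean spelling of the setting mentioned above.

```python
def supervisor_stanza(poller, worker="/opt/netapp-harvest/netapp-worker"):
    """Return a supervisord [program:...] section for one poller."""
    # Assumption: the worker stays in the foreground when started this way,
    # which supervisord requires of the programs it manages.
    # Assumption: the worker path and the "-poller NAME" argument match
    # your installation; verify both before using the output.
    return (
        f"[program:harvest-{poller}]\n"
        f"command={worker} -poller {poller}\n"
        "autostart=true\n"
        "autorestart=true\n"
    )


if __name__ == "__main__":
    # Hypothetical poller names, one section each.
    for name in ["cluster1", "cluster2"]:
        print(supervisor_stanza(name))
```

The printed sections would go into the supervisord configuration, after which supervisord rather than Harvest is responsible for restarting a poller that exits.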

 

Another solution, simpler if you are OK with missing a few minutes of data, is to add a cron entry that runs "/opt/netapp-harvest/netapp-manager -start" every 10 minutes.  This script just parses the netapp-harvest.conf file, runs ps, and then starts any pollers that are not already running.
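As a rough illustration of the check described above (this is not the actual netapp-manager code, which is not shown in this thread), the logic can be sketched in Python. The reserved section names `global` and `default`, the `netapp-worker ... -poller NAME` command-line pattern, and all function names are assumptions made for the example.

```python
import re

# Assumption: every section other than these names one poller.
RESERVED_SECTIONS = {"global", "default"}


def configured_pollers(conf_text):
    """Return poller names taken from the [section] headers of the config."""
    names = re.findall(r"^\s*\[([^\]]+)\]\s*$", conf_text, flags=re.MULTILINE)
    return [n.strip() for n in names if n.strip().lower() not in RESERVED_SECTIONS]


def running_pollers(ps_output):
    """Return poller names found on worker command lines in ps output."""
    # Assumption: each worker appears as "... netapp-worker ... -poller NAME".
    return set(re.findall(r"netapp-worker\b.*?-poller\s+(\S+)", ps_output))


def pollers_to_start(conf_text, ps_output):
    """Return configured pollers that have no running worker process."""
    running = running_pollers(ps_output)
    return [name for name in configured_pollers(conf_text) if name not in running]
```

Given a config with sections for cluster1 and cluster2 and a ps listing that shows only a cluster1 worker, `pollers_to_start` returns `['cluster2']`. The cron approach itself needs only a crontab line such as `*/10 * * * * /opt/netapp-harvest/netapp-manager -start`.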

 

I know these are workarounds but I think they are the best options for you.  Hope it helps!

 

Cheers,
Chris Madden

Solution Architect - 3rd Platform - Systems Engineering NetApp EMEA (and author of Harvest)

Blog: It all begins with data


acjackson

Thanks, I have set up a cron job and it seems to work.

 

I have a log file with debugging enabled that I could PM to you; maybe it will help you find the issue.

 

 
