2017-06-20 05:49 AM
I have an issue with two Harvest pollers that stop working after occasional connection/network issues lasting longer than ~4 hours.
There are log entries like:
[WARNING] [nfsv3] update of data cache failed with reason: Server returned HTTP Error: 408 Request
[WARNING] [path] update of data cache failed with reason: in Zapi::invoke, cannot connect to socket
Shouldn't the pollers run forever? Or is there any timeout? I only get log entries for about 4 hours after the first error, then there are no more entries, until I restart the poller.
Then the pollers work until the next extended period of time with connection issues.
2017-06-21 07:17 AM
The design is for Harvest to try forever. But, there are some other modules it uses (SSL, NetApp SDK to name a few) that may consider some situations fatal. If I knew the place it's failing I could potentially wrap this to prevent it but I'm inclined to look for a solution outside of Harvest.
Another, simpler solution, if you are OK with missing a few minutes of data, is to add a cron entry that runs "/opt/netapp-harvest/netapp-manager -start" every 10 minutes. This script just parses the netapp-harvest.conf file, runs ps, and then starts any pollers that are not already running.
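As a rough sketch, the crontab entry could look like the line below. It assumes the install path mentioned above and that the entry goes into the crontab of the user that normally runs Harvest. The output redirect is my own optional addition, there only to stop cron from mailing the script's output every 10 minutes.

```
# Every 10 minutes, start any pollers that are not already running.
# Assumption: Harvest is installed under /opt/netapp-harvest.
# The redirect to /dev/null is optional and only suppresses cron mail.
*/10 * * * * /opt/netapp-harvest/netapp-manager -start >/dev/null 2>&1
```

You can add it with "crontab -e" and check that it is in place with "crontab -l".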
I know these are workarounds but I think they are the best options for you. Hope it helps!
Solution Architect - 3rd Platform - Systems Engineering NetApp EMEA (and author of Harvest)
2017-06-23 12:24 AM
Thanks, I have set up a cron job and it seems to work.
I have a log file captured with debugging enabled that I could PM to you; maybe it will help you find the issue.