Active IQ Unified Manager Discussions
Hi,
I have an issue with two Harvest pollers that stop working after occasional connection/network issues lasting longer than about 4 hours.
There are log entries like
[WARNING] [nfsv3] update of data cache failed with reason: Server returned HTTP Error: 408 Request
and
[WARNING] [path] update of data cache failed with reason: in Zapi::invoke, cannot connect to socket
Shouldn't the pollers run forever? Or is there a timeout? I only get log entries for about 4 hours after the first error, then there are no more entries until I restart the poller.
After a restart, the pollers work until the next extended period of connection issues.
Hi @acjackson
The design is for Harvest to try forever. But some of the other modules it uses (SSL and the NetApp SDK, to name a few) may consider some situations fatal. If I knew where it was failing, I could potentially wrap that call to prevent it, but I'm inclined to look for a solution outside of Harvest.
One solution could be to use supervisord to [re]start each of your Harvest pollers, with the config variable autorestart=1.
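For illustration, a supervisord program section for a single poller might look roughly like the sketch below. It is not taken from the Harvest documentation: the poller name `cluster01` is hypothetical, and the command line assumes the worker script can be run in the foreground as `netapp-worker -poller <name>` (supervisord can only manage processes that do not daemonize themselves), so check the worker's own help output before relying on it.

```ini
; Hypothetical example: one [program:...] section per Harvest poller.
; Assumption: netapp-worker stays in the foreground when started this way.
[program:harvest-cluster01]
command=/opt/netapp-harvest/netapp-worker -poller cluster01
directory=/opt/netapp-harvest
autostart=true
; Restart the poller whenever it exits, regardless of exit code.
autorestart=true
; The process must stay up this many seconds to count as started.
startsecs=10
```

With a section like this per poller, the pollers would be started and stopped through supervisord rather than through netapp-manager.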
Another solution, simpler if you are OK with missing a few minutes of data, is to add a cron entry that runs "/opt/netapp-harvest/netapp-manager -start" every 10 minutes. This script just parses the netapp-harvest.conf file, runs ps, and then starts any pollers that are not already running.
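As a sketch, the cron entry could look like the line below, added with `crontab -e` for the user that normally runs Harvest. The output redirection is an optional addition, not part of the suggestion above; it only keeps cron from mailing the script's output every 10 minutes.

```
# Every 10 minutes, start any Harvest pollers that are not running.
*/10 * * * * /opt/netapp-harvest/netapp-manager -start >/dev/null 2>&1
```

If the entry goes into /etc/crontab or a file under /etc/cron.d instead, a user name field is needed between the schedule and the command.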
I know these are workarounds but I think they are the best options for you. Hope it helps!
Cheers,
Chris Madden
Solution Architect - 3rd Platform - Systems Engineering NetApp EMEA (and author of Harvest)
Blog: It all begins with data
If this post resolved your issue, please help others by selecting ACCEPT AS SOLUTION or adding a KUDO or both!
Thanks, I have set up a cron job and it seems to work.
I have a log file with debugging enabled that I could PM to you; maybe it will help you find the issue.