Active IQ Unified Manager Discussions

NetApp-Harvest poller not working?

NICKBARTON
8,277 Views

The pollers don't appear to be pulling any data, and I can't see anything in the logs that helps. I have run the worker in console mode and still see no errors, but I'm not getting data, and the logs don't appear to have been updated since I configured it yesterday (shouldn't there be something every 4 hours?). It's a fresh install, so I don't have any data yet. I'm currently trying to diagnose an issue at the site, so any help is greatly appreciated. Some log information is below:

 

These are all 7-Mode systems, which I am trying to monitor to track down a potential bully workload in the environment.

 

Status shows them NOT RUNNING: 

 

STATUS POLLER SITE
############### #################### ##################
[NOT RUNNING] ntap01 SITE1
[NOT RUNNING] ntap02 SITE1
[NOT RUNNING] ntap03 SITE1
[NOT RUNNING] ntap04 SITE1 

 

nc test for all four hosts passed on port 443

TLS is enabled

 

Here is all that exists in the poller logs: 

 

[2016-01-05 21:35:36] [NORMAL ] WORKER STARTED [Version: 1.2.2] [Conf: netapp-harvest.conf] [Poller: ntap02]
[2016-01-05 21:35:36] [NORMAL ] [main] Poller will monitor a [FILER] at [ntap02.domain.coml:443]
[2016-01-05 21:35:36] [NORMAL ] [main] Poller will use [password] authentication with username [netapp-harvest] and password [**********]
[2016-01-05 21:41:22] [NORMAL ] WORKER STARTED [Version: 1.2.2] [Conf: netapp-harvest.conf] [Poller: ntap02]
[2016-01-05 21:41:22] [NORMAL ] [main] Poller will monitor a [FILER] at [ntap02.domain.coml:443]
[2016-01-05 21:41:22] [NORMAL ] [main] Poller will use [password] authentication with username [netapp-harvest] and password [**********]

 

 

That is the end of the log file. It is the same for all four configured hosts. What am I missing? The logs don't really seem to be giving me anything to go on.

 

Thank You, 

Nick Barton 

1 ACCEPTED SOLUTION

madden
8,208 Views

@NICKBARTON

 

It sounds like you don't have the Net::SSLeay module on your system which is a dependency in the SDK. You must have skipped or had an error when installing the Perl prerequisite modules listed on page 8 of the Harvest installation guide.  Basically do the sudo yum install or sudo apt-get install step again from the guide and verify all are installed without error.

 

Cheers,
Chris Madden

Storage Architect, NetApp EMEA (and author of Harvest)

Blog: It all begins with data

 

P.S.  Please select “Options” and then “Accept as Solution” if this response answered your question so that others will find it easily!


9 REPLIES

dlmaldonado
8,270 Views

Did you check that the password you specified for the netapp harvest monitor account you created on your Filer matches what you've specified in the /opt/netapp-harvest/netapp-harvest.conf file?

Also, are you seeing anything in the Filer logs to indicate failed authentication?

NICKBARTON
8,268 Views

Correct. I have checked all that, and I even attempted to log in with the account via SSH to confirm the password was correct. Even though the account doesn't have rights to do so, the filer accepted the password and logged the denied request. I don't see any failures, or any record of data being pulled at all, on the NetApp side.

madden
8,236 Views

Hi @NICKBARTON

 

It is odd that the poller stays active but doesn't log anything else.  Normally you will see something like this at startup:

 

[2016-01-06 22:27:51] [NORMAL ] WORKER STARTED [Version: 1.2.2NEXT] [Conf: netapp-harvest.conf] [Poller: sdt-7dot1b]
[2016-01-06 22:27:51] [NORMAL ] [main] Poller will monitor a [FILER] at [sdt-7dot1b:443]
[2016-01-06 22:27:51] [NORMAL ] [main] Poller will use [password] authentication with username [root] and password [**********]
[2016-01-06 22:27:52] [NORMAL ] [main] Collection of system info from [sdt-7dot1b] running [NetApp Release 8.2.3P3 7-Mode] successful.
[2016-01-06 22:27:52] [NORMAL ] [main] Using best-fit collection template: [7dot-8.2.0.conf]
[2016-01-06 22:27:52] [NORMAL ] [main] Using graphite_root [netapp.perf7.nl.sdt-7dot1b]
[2016-01-06 22:27:52] [NORMAL ] [main] Using graphite_meta_metrics_root [netapp.poller.perf7.nl.sdt-7dot1b]
[2016-01-06 22:27:52] [NORMAL ] [main] Startup complete. Polling for new data every [60] seconds.

 

So it looks like something is preventing Harvest from contacting the controller to determine the release.  If it were an authorization issue it would be reported with an appropriate error, and if there were nothing listening we would see a connect error.  My guess is that something security related, like a firewall or SELinux, is disrupting the data flow.  Normally in Harvest there is a timeout of 60s for any API call, but for some reason it must not be timing out correctly in your situation (not that we'd see much of an improvement if it did time out).
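
As a rough way to see where a poller stalls, the startup messages in the sample log above can be treated as milestones. The sketch below is not part of Harvest: the function name and the milestone list are assumptions, taken only from the log lines quoted in this thread (version 1.2.2), and other versions may word them differently.

```python
# Hypothetical helper, not part of Harvest. The milestone strings are
# copied from the startup log lines quoted in this thread.
STARTUP_MILESTONES = [
    "WORKER STARTED",
    "Poller will monitor",
    "Poller will use",
    "Collection of system info",
    "Using best-fit collection template",
    "Startup complete",
]


def last_milestone_reached(log_lines):
    """Return the furthest startup milestone seen, or None if none was seen.

    Only lines from the most recent "WORKER STARTED" onward are considered,
    so an earlier successful run in the same logfile does not mask a stall.
    """
    # Find where the most recent worker run begins.
    start = 0
    for i, line in enumerate(log_lines):
        if "WORKER STARTED" in line:
            start = i
    current_run = log_lines[start:]

    reached = None
    for milestone in STARTUP_MILESTONES:
        if any(milestone in line for line in current_run):
            reached = milestone
    return reached
```

Fed the three ntap02 lines from the first post, this returns "Poller will use", meaning the poller never got as far as collecting system info from the controller.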

 

You can run the poller with the -v flag to see more logging output.  So something like "/opt/netapp-harvest/netapp-worker -poller ntap02 -v" to see some more details.  Also, the logfile may have more details than the STDOUT, so check there too.  I would check system logs and SELinux logs.  Maybe also try running as root (in case you were not) to see if that makes any difference.

 

If you get it working please share what you had to do, or if you have some logs we can continue troubleshooting.

 

Cheers,
Chris Madden

Storage Architect, NetApp EMEA (and author of Harvest)

Blog: It all begins with data

 


James_Castro
8,231 Views

Can you check to make sure you have enough disk space?  I know that you said it was a fresh install but I ran into similar situations when my disk was full.

NICKBARTON
8,224 Views

@madden

 

Yes, I thought it was odd too that the log files were so incomplete, based on what I saw in the troubleshooting section of the doc and what I found on the internet. I will dig into verbose mode and the SELinux logs to see if I can get any more information. I tested access to the nodes via the netcat test and plain telnet, so I know the ports themselves are open. Let me see if I can find any more useful information based on your suggestions.

 

@James_Castro

 

I do have enough space on the partition currently to start logging. I may need to grow at some point but there is plenty of free space to get this started. 

 

Thank You, 

Nick Barton 

NICKBARTON
8,214 Views

@madden

 

Using the verbose command I do see an error now at the end of the log file:

 

[2016-01-07 16:51:11] [NORMAL ] WORKER STARTED [Version: 1.2.2] [Conf: netapp-harvest.conf] [Poller: ntap03]
[2016-01-07 16:51:11] [WARNING] Started in foreground mode; messages to STDERR are redirected to the logfile and are not visible on the console.
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [19] is Section [global]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [20] in Section [global] has Key/Value pair [grafana_api_key]=):=]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [21] in Section [global] has Key/Value pair [grafana_url]=[https://localhost:443]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [22] in Section [global] has Key/Value pair [grafana_dl_tag]=[]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [28] is Section [default]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [30] in Section [default] has Key/Value pair [graphite_enabled]=[1]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [31] in Section [default] has Key/Value pair [graphite_server]=[localhost]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [32] in Section [default] has Key/Value pair [graphite_port]=[2003]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [33] in Section [default] has Key/Value pair [graphite_proto]=[tcp]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [34] in Section [default] has Key/Value pair [normalized_xfer]=[mb_per_sec]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [35] in Section [default] has Key/Value pair [normalized_time]=[millisec]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [36] in Section [default] has Key/Value pair [graphite_root]=[default]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [37] in Section [default] has Key/Value pair [graphite_meta_metrics_root]=[default]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [40] in Section [default] has Key/Value pair [host_type]=[FILER]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [41] in Section [default] has Key/Value pair [host_port]=[443]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [42] in Section [default] has Key/Value pair [host_enabled]=[1]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [43] in Section [default] has Key/Value pair [template]=[default]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [44] in Section [default] has Key/Value pair [data_update_freq]=[60]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [45] in Section [default] has Key/Value pair [ntap_autosupport]=[0]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [46] in Section [default] has Key/Value pair [latency_io_reqd]=[10]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [47] in Section [default] has Key/Value pair [auth_type]=[password]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [48] in Section [default] has Key/Value pair [username]=[netapp-harvest]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [49] in Section [default] has Key/Value pair [password]=[**********]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [50] in Section [default] has Key/Value pair [ssl_cert]=[INSERT_PEM_FILE_NAME_HERE]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [51] in Section [default] has Key/Value pair [ssl_key]=[INSERT_KEY_FILE_NAME_HERE]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [63] is Section [ntap01]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [64] in Section [ntap01] has Key/Value pair [hostname]=[ntap01]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [65] in Section [ntap01] has Key/Value pair [site]=[site1]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [67] is Section [ntap02]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [68] in Section [ntap02] has Key/Value pair [hostname]=[ntap02]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [69] in Section [ntap02] has Key/Value pair [site]=[site1]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [71] is Section [ntap03]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [72] in Section [ntap03] has Key/Value pair [hostname]=[ntap03]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [73] in Section [ntap03] has Key/Value pair [site]=[site1]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [75] is Section [ntap04]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [76] in Section [ntap04] has Key/Value pair [hostname]=[ntap04]
[2016-01-07 16:51:11] [DEBUG ] [conf] Line [77] in Section [ntap04] has Key/Value pair [site]=[site1]
[2016-01-07 16:51:11] [NORMAL ] [main] Poller will monitor a [FILER] at [ntap03:443]
[2016-01-07 16:51:11] [NORMAL ] [main] Poller will use [password] authentication with username [netapp-harvest] and password [**********]
[2016-01-07 16:51:11] [DEBUG ] [connect] Resolved hostname [ntap03] to IP address [XX.XX.XX.XX]
[2016-01-07 16:51:11] [DEBUG ] [connect] Reverse hostname lookup successful. Using HTTP/1.1 for communication.
Undefined subroutine &Net::SSLeay::load_error_strings called at /opt/netapp-harvest/lib/NaServer.pm line 388.
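
The last line of the log above stands out because it lacks the usual "[timestamp] [LEVEL]" prefix, which suggests it was written by the Perl interpreter rather than by Harvest's own logger. A small sketch for pulling such lines out of a long verbose log follows. The function name and the choice of which levels count as "quiet" are assumptions, not anything Harvest provides.

```python
import re

# Lines written by Harvest in this thread look like:
#   [2016-01-07 16:51:11] [DEBUG ] [conf] ...
# Assumption: anything that does not match this shape came from somewhere
# else (for example the Perl interpreter) and deserves a closer look.
HARVEST_LINE = re.compile(
    r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\] \[(?P<level>[A-Z]+) *\]"
)


def suspicious_lines(log_lines):
    """Return lines that are unformatted or logged at a level other than NORMAL/DEBUG."""
    quiet_levels = {"NORMAL", "DEBUG"}
    found = []
    for line in log_lines:
        text = line.rstrip("\n")
        if not text.strip():
            continue  # ignore blank lines
        match = HARVEST_LINE.match(text)
        if match is None or match.group("level") not in quiet_levels:
            found.append(text)
    return found
```

On the verbose log above this would return only the foreground-mode WARNING and the "Undefined subroutine" line, skipping the DEBUG and NORMAL output.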

madden
8,209 Views

@NICKBARTON

 

It sounds like you don't have the Net::SSLeay module on your system which is a dependency in the SDK. You must have skipped or had an error when installing the Perl prerequisite modules listed on page 8 of the Harvest installation guide.  Basically do the sudo yum install or sudo apt-get install step again from the guide and verify all are installed without error.
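
One way to confirm whether a Perl module can be loaded, before and after reinstalling the prerequisites, is to ask Perl to load it and check the exit status. The sketch below wraps the standard `perl -M<module> -e 1` invocation. The function name is hypothetical, and it assumes the `perl` found on the PATH is the same interpreter Harvest runs under.

```python
import subprocess


def perl_module_loads(module, perl="perl"):
    """Return True/False if Perl can/cannot load `module`, or None if Perl is missing.

    Runs `perl -M<module> -e 1`, which exits non-zero when the module
    cannot be loaded.
    """
    try:
        result = subprocess.run(
            [perl, "-M" + module, "-e", "1"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except FileNotFoundError:
        return None  # no perl executable at that path
    return result.returncode == 0
```

Calling `perl_module_loads("Net::SSLeay")` on the Harvest host should return True once the module is installed correctly. The same check can be repeated for each module in the installation guide's prerequisite list.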

 

Cheers,
Chris Madden

Storage Architect, NetApp EMEA (and author of Harvest)

Blog: It all begins with data

 


NICKBARTON
8,202 Views

@madden

 

Okay, so I had to go through it a couple of times, but it looks like that got my pollers working. Now it's throwing an error connecting to graphite, and it looks like something happened with my carbon-cache. I just went through setup again and I'm still getting the same error. Any ideas?

 

Starting carbon-cache (instance a)
Traceback (most recent call last):
  File "/opt/graphite/bin/carbon-cache.py", line 32, in <module>
    run_twistd_plugin(__file__)
  File "/opt/graphite/lib/carbon/util.py", line 92, in run_twistd_plugin
    runApp(config)
  File "/usr/lib64/python2.6/site-packages/twisted/scripts/twistd.py", line 23, in runApp
    _SomeApplicationRunner(config).run()
  File "/usr/lib64/python2.6/site-packages/twisted/application/app.py", line 386, in run
    self.application = self.createOrGetApplication()
  File "/usr/lib64/python2.6/site-packages/twisted/application/app.py", line 446, in createOrGetApplication
    ser = plg.makeService(self.config.subOptions)
  File "/opt/graphite/lib/twisted/plugins/carbon_cache_plugin.py", line 21, in makeService
    return service.createCacheService(options)
  File "/opt/graphite/lib/carbon/service.py", line 131, in createCacheService
    from carbon.writer import WriterService
  File "/opt/graphite/lib/carbon/writer.py", line 38, in <module>
    SCHEMAS = loadStorageSchemas()
  File "/opt/graphite/lib/carbon/storage.py", line 125, in loadStorageSchemas
    retentions = options['retentions'].split(',')
KeyError: 'retentions'
[FAILED]

 

Thank You,

Nick Barton 

NICKBARTON
8,192 Views

@madden

Never mind, this was due to a typo in my storage-schemas.conf file. I fixed the typo, restarted carbon-cache and netapp-harvest, and all looks good now. Pollers are running and carbon is up and accepting data. Thanks for the help; I will mark your last response as the solution.

 

Thank You, 

Nick Barton
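
For anyone who hits the same KeyError: 'retentions', the traceback shows carbon reading options['retentions'] for each section while loading the storage schemas, so a section with a misspelled or missing retentions key stops carbon-cache from starting. A minimal sketch of a pre-flight check follows. The function name is hypothetical, and it assumes the file parses as ordinary INI text with Python's configparser.

```python
import configparser


def sections_missing_retentions(conf_text):
    """Return the names of sections that have no 'retentions' key."""
    parser = configparser.ConfigParser(
        interpolation=None,  # leave any '%' in values untouched
        default_section="__unused_defaults__",  # treat every section as a schema
    )
    parser.read_string(conf_text)
    return [
        section
        for section in parser.sections()
        if "retentions" not in parser[section]
    ]
```

Read the contents of storage-schemas.conf into a string and pass it in; any section names returned are the ones to inspect for typos.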
