Anyone having any issues with Harvest 1.6 consuming 100% CPU?
I have 2 clusters and 1 OCUM.
If I take out 1 cluster and run it with just 1 cluster, it's still at 100% CPU usage.
Tasks: 117 total, 3 running, 114 sleeping, 0 stopped, 0 zombie
%Cpu(s): 43.0 us, 8.1 sy, 0.0 ni, 48.7 id, 0.0 wa, 0.0 hi, 0.2 si, 0.0 st
KiB Mem : 1843120 total, 959732 free, 276056 used, 607332 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 1354556 avail Mem

  PID USER      PR NI   VIRT   RES  SHR S  %CPU %MEM    TIME+ COMMAND
 7914 netapp-+  20  0 216068 20896 3712 R 100.0  1.1 14:20.07 netapp-worker
 8683 root      20  0 482196 41196 6100 S   2.0  2.2  0:04.84 svc2influxdb.py
 5786 thollow+  20  0  20376  5848 1172 S   0.3  0.3  0:05.52 nmon
    1 root      20  0 128124  6700 4168 S   0.0  0.4  0:25.32 systemd
    2 root      20  0      0     0    0 S   0.0  0.0  0:00.01 kthreadd
I upgraded the AWS instance and it still consumed all available CPU.
If I add in the second cluster, I get 2 netapp-worker processes, both consuming 100% CPU.
Any ideas?
CentOS 7, fully patched.
So I have upgraded the instance to m5a.large and still have the issue.
Solved! See The Solution
1 ACCEPTED SOLUTION
Greg_Wilson has accepted the solution
After weeks of stuffing around, I finally found out what the issue was.
In my harvest.conf, under the host, I had this:
[a1c34-cdot1]
hostname = 19.19.209.109
site = a1c34-lab
username = netapp-harvest
password = test1234
data_update_freq = 60
host_type = filer
Once I removed the line
host_type = filer
everything worked: CPU dropped to barely anything and it's now collecting perf data.
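For reference, a sketch of the working poller section after the fix (same values as above, just without the host_type line, so Harvest falls back to its default host type):

[a1c34-cdot1]
hostname = 19.19.209.109
site = a1c34-lab
username = netapp-harvest
password = test1234
data_update_freq = 60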
5 REPLIES
Hi Greg,
This is strange. Do you observe 100% CPU usage all of the time? Normally Harvest pollers should be sleeping most of the time, so you should see CPU usage only for a few seconds each minute. (And even so, 100% CPU isn't something that you should normally see.)
There are a few things you could do:
- Check the Harvest logs to see if there are any warnings or errors (see the sketch below).
- Check the Grafana dashboard of the Harvest poller, especially the graphs showing API time. If anything is higher than 10 seconds, that might be part of the issue.
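A quick way to do the first check from the command line (a sketch; the log directory is an assumption based on the default /opt/netapp-harvest install path visible in your template log lines, with one log file per poller):

# scan every poller log for warnings and errors
# (assumes the default Harvest 1.x log location)
grep -E 'WARNING|ERROR' /opt/netapp-harvest/log/*.log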
Hi,
Yes, it is consuming 100% of the CPU all the time.
I'm wondering if it's AWS and how they limit the CPUs for smaller instances:
https://forums.aws.amazon.com/thread.jspa?threadID=71347
This is the first time I have tried to deploy it in AWS. I have been running it using both
NAbox and VMware (2 CPUs and 8 GB RAM) on-prem and it's never been an issue.
I even dropped the polling of cDOT down to every 5 minutes (sketched below).
The logs for Harvest look good.
I'll keep poking around.
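For reference, the 5-minute interval is just the data_update_freq setting in the poller's section of harvest.conf; a sketch of the relevant lines:

[a1c34-cdot1]
# poll every 300 seconds (5 minutes) instead of 60
data_update_freq = 300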
This is a log for the cDOT cluster:
[2019-11-20 12:20:32] [NORMAL ] [main] Startup complete. Polling for new data every [600] seconds.
[2019-11-20 12:23:19] [NORMAL ] WORKER STARTED [Version: 1.6] [Conf: netapp-harvest.conf] [Poller: lpaunetapp2001]
[2019-11-20 12:23:19] [NORMAL ] [main] Poller will monitor a [filer] at [10.x.x.x:443]
[2019-11-20 12:23:19] [NORMAL ] [main] Poller will use [password] authentication with username [netapp-harvest] and password [**********]
[2019-11-20 12:23:20] [NORMAL ] [main] Collection of system info from [10.x.x.x] running [NetApp Release 9.5P8] successful.
[2019-11-20 12:23:20] [NORMAL ] [main] Found best-fit monitoring template (same generation and major release, minor same or less): [cdot-9.5.0.conf]
[2019-11-20 12:23:20] [NORMAL ] [main] Added and/or merged monitoring template [/opt/netapp-harvest/template/default/cdot-9.5.0.conf]
[2019-11-20 12:23:20] [NORMAL ] [main] Metrics will be submitted with graphite_root [netapp.perf.lpau.lpaunetapp2001]
[2019-11-20 12:23:20] [NORMAL ] [main] Using graphite_meta_metrics_root [netapp.poller.perf.lpau.lpaunetapp2001]
[2019-11-20 12:23:20] [NORMAL ] Creating output plugins
[2019-11-20 12:23:20] [NORMAL ] Created output plugins
[2019-11-20 12:23:20] [WARNING] [wafl_hya_per_aggr] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [nfsv4:node] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [cifs:vserver] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [hostadapter] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [object_store_client_op] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [lun] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [wafl_comp_aggr_vol_bin] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [nfsv3] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [processor] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [offbox_vscan_server] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [nfsv4_1:node] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [offbox_vscan] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [cifs:node] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [volume:node] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [nfsv4] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [nfsv3:node] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [lif] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [wafl_hya_sizer] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [workload] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [iscsi_lif] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [fcvi] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [token_manager] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [disk:constituent] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [workload_volume] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [fcp_lif] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [system:node] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [copy_manager] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [resource_headroom_aggr] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [nic_common] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [volume] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [path] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [resource_headroom_cpu] Object type does not exist in Data ONTAP release; skipping
[2019-11-20 12:23:20] [WARNING] [wafl] Object type does not exist in Data ONTAP release; skipping
:
This is a log for OCUM:
[2019-11-20 12:15:57] [NORMAL ] WORKER STARTED [Version: 1.6] [Conf: netapp-harvest.conf] [Poller: 10.19.21.235]
[2019-11-20 12:15:57] [NORMAL ] [main] Poller will monitor a [OCUM] at [10.x.x.x:443]
[2019-11-20 12:15:57] [NORMAL ] [main] Poller will use [password] authentication with username [netapp-harvest] and password [**********]
[2019-11-20 12:15:57] [NORMAL ] [sysinfo] Discovered [lpaunetapp2001] on OCUM server and will submit metrics under group [lpau].
[2019-11-20 12:15:57] [NORMAL ] [sysinfo] Discovered [lpaunetapp0001] on OCUM server and will submit metrics under group [lpau].
[2019-11-20 12:15:57] [NORMAL ] [main] Collection of system info from [10.x.x.x] running [9.4] successful.
[2019-11-20 12:15:57] [NORMAL ] [main] Found best-fit monitoring template (same generation and major release, minor same or less): [ocum-9.4.0.conf]
[2019-11-20 12:15:57] [NORMAL ] [main] Added and/or merged monitoring template [/opt/netapp-harvest/template/default/ocum-9.4.0.conf]
[2019-11-20 12:15:57] [NORMAL ] [main] Metrics for cluster [lpaunetapp0001] will be submitted with graphite_root [netapp.capacity.lpau.lpaunetapp0001]
[2019-11-20 12:15:57] [NORMAL ] [main] Metrics for cluster [lpaunetapp2001] will be submitted with graphite_root [netapp.capacity.lpau.lpaunetapp2001]
[2019-11-20 12:15:57] [NORMAL ] [main] Using graphite_meta_metrics_root [netapp.poller.capacity.lpau.10_19_21_235]
[2019-11-20 12:15:57] [NORMAL ] Creating output plugins
[2019-11-20 12:15:57] [NORMAL ] Created output plugins
[2019-11-20 12:15:57] [NORMAL ] [main] Startup complete. Polling for new data every [900] seconds.
[2019-11-20 12:20:31] [NORMAL ] WORKER STARTED [Version: 1.6] [Conf: netapp-harvest.conf] [Poller: 10.19.21.235]
[2019-11-20 12:20:31] [NORMAL ] [main] Poller will monitor a [OCUM] at [10.x.x.x:443]
[2019-11-20 12:20:31] [NORMAL ] [main] Poller will use [password] authentication with username [netapp-harvest] and password [**********]
[2019-11-20 12:20:32] [NORMAL ] [sysinfo] Discovered [lpaunetapp2001] on OCUM server and will submit metrics under group [lpau].
[2019-11-20 12:20:32] [NORMAL ] [sysinfo] Discovered [lpaunetapp0001] on OCUM server and will submit metrics under group [lpau].
[2019-11-20 12:20:32] [NORMAL ] [main] Collection of system info from [10.x.x.x] running [9.4] successful.
[2019-11-20 12:20:32] [NORMAL ] [main] Found best-fit monitoring template (same generation and major release, minor same or less): [ocum-9.4.0.conf]
[2019-11-20 12:20:32] [NORMAL ] [main] Added and/or merged monitoring template [/opt/netapp-harvest/template/default/ocum-9.4.0.conf]
[2019-11-20 12:20:32] [NORMAL ] [main] Metrics for cluster [lpaunetapp0001] will be submitted with graphite_root [netapp.capacity.lpau.lpaunetapp0001]
[2019-11-20 12:20:32] [NORMAL ] [main] Metrics for cluster [lpaunetapp2001] will be submitted with graphite_root [netapp.capacity.lpau.lpaunetapp2001]
[2019-11-20 12:20:32] [NORMAL ] [main] Using graphite_meta_metrics_root [netapp.poller.capacity.lpau.10_x_x_x]
[2019-11-20 12:20:32] [NORMAL ] Creating output plugins
[2019-11-20 12:20:32] [NORMAL ] Created output plugins
[2019-11-20 12:20:32] [NORMAL ] [main] Startup complete. Polling for new data every [900] seconds.
[2019-11-20 12:23:19] [NORMAL ] WORKER STARTED [Version: 1.6] [Conf: netapp-harvest.conf] [Poller: 10.x.x.x]
[2019-11-20 12:23:19] [NORMAL ] [main] Poller will monitor a [OCUM] at [10.x.x.x:443]
[2019-11-20 12:23:19] [NORMAL ] [main] Poller will use [password] authentication with username [netapp-harvest] and password [**********]
[2019-11-20 12:23:19] [NORMAL ] [sysinfo] Discovered [lpaunetapp2001] on OCUM server and will submit metrics under group [lpau].
[2019-11-20 12:23:19] [NORMAL ] [sysinfo] Discovered [lpaunetapp0001] on OCUM server and will submit metrics under group [lpau].
[2019-11-20 12:23:19] [NORMAL ] [main] Collection of system info from [10.x.x.x] running [9.4] successful.
[2019-11-20 12:23:19] [NORMAL ] [main] Found best-fit monitoring template (same generation and major release, minor same or less): [ocum-9.4.0.conf]
[2019-11-20 12:23:19] [NORMAL ] [main] Added and/or merged monitoring template [/opt/netapp-harvest/template/default/ocum-9.4.0.conf]
[2019-11-20 12:23:19] [NORMAL ] [main] Metrics for cluster [lpaunetapp0001] will be submitted with graphite_root [netapp.capacity.lpau.lpaunetapp0001]
[2019-11-20 12:23:19] [NORMAL ] [main] Metrics for cluster [lpaunetapp2001] will be submitted with graphite_root [netapp.capacity.lpau.lpaunetapp2001]
[2019-11-20 12:23:19] [NORMAL ] [main] Using graphite_meta_metrics_root [netapp.poller.capacity.lpau.10_x_x_x]
[2019-11-20 12:23:19] [NORMAL ] Creating output plugins
[2019-11-20 12:23:19] [NORMAL ] Created output plugins
[2019-11-20 12:23:19] [NORMAL ] [main] Startup complete. Polling for new data every [900] seconds.
That might explain it; 2 CPUs should be completely fine for two pollers. I run 10 pollers on 2 CPUs and I'm still fine.
But the warnings in the first log also look odd. What is your cDOT version? Even if you have a very old release, I don't think so many object types should be unavailable.
I shouldn't be asking for your release since it's right up there in the log 🙂
No, this isn't right for ONTAP 9.5. If you want us to take a closer look, run your poller in verbose mode and share the logs with me (either here or just send them to me).
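To run a single poller by hand in verbose mode, something like this should work (a sketch; the exact paths and the -poller/-v flags are assumptions based on a default Harvest 1.x install, and lpaunetapp2001 is the poller name from your logs):

# stop the managed instance of the poller, then run it in the foreground
# with verbose logging (assumed default install path)
/opt/netapp-harvest/netapp-manager -stop -poller lpaunetapp2001
/opt/netapp-harvest/netapp-worker -poller lpaunetapp2001 -v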