
NetApp Harvest error: [nic_common] plugin failed to compile


Collection on one of my 8.3.2P2 clusters stopped, with the errors below logged. All other clusters seem to be fine. Has anyone seen this?

 

 

[2016-06-21 11:36:00] [NORMAL ] Poller status: status, secs=14400, api_time=8170, plugin_time=274, metrics=1978019, skips=587, fails=0
[2016-06-21 13:00:42] [WARNING] [nic_common] plugin failed to compile: Illegal division by zero at /opt/netapp-harvest/plugin/cdot-nic-common line 86.

 

[2016-06-21 13:00:42] [ERROR ] [nic_common] Restarting netapp-worker as an attempt to clear issue
[2016-06-21 13:00:42] [NORMAL ] WORKER STARTED [Version: 1.2.2] [Conf: netapp-harvest.conf] [Poller: ntap-cla01]
[2016-06-21 13:00:42] [NORMAL ] [main] Poller will monitor a [FILER] at [192.168.94.1:443]
[2016-06-21 13:00:42] [NORMAL ] [main] Poller will use [password] authentication with username [netapp-harvest] and password [**********]
[2016-06-21 13:00:43] [NORMAL ] [main] Collection of system info from [192.168.94.1] running [NetApp Release 8.3.2P2] successful.
[2016-06-21 13:00:43] [NORMAL ] [main] Using best-fit collection template: [cdot-8.3.0.conf]
[2016-06-21 13:00:43] [NORMAL ] [main] Using graphite_root [netapp.perf.springfield.ntap-cla01]
[2016-06-21 13:00:43] [NORMAL ] [main] Using graphite_meta_metrics_root [netapp.poller.perf.springfield.ntap-cla01]
[2016-06-21 13:00:43] [NORMAL ] [smb2:node] Collection of object not enabled; skipping
[2016-06-21 13:00:43] [NORMAL ] [smb2:vserver] Collection of object not enabled; skipping
[2016-06-21 13:00:43] [NORMAL ] [main] Startup complete. Polling for new data every [60] seconds.
[2016-06-21 13:02:39] [WARNING] [nic_common] plugin failed to compile: Illegal division by zero at /opt/netapp-harvest/plugin/cdot-nic-common line 86.
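
To see how often each plugin is failing, the log can be summarized with a short script. The following is only a sketch: it assumes the `[timestamp] [LEVEL] [tag] message` layout visible in the excerpt above, and the helper names `parse_log_line` and `count_problems` are made up for illustration, not part of Harvest.

```perl
use strict;
use warnings;

# Parse one log line of the form seen in the excerpt above:
#   [2016-06-21 13:00:42] [WARNING] [nic_common] message...
# Returns (timestamp, level, tag, message), or an empty list if the line
# does not have a bracketed tag right after the level.
sub parse_log_line {
    my ($line) = @_;
    if ($line =~ /^\[([^\]]+)\]\s+\[\s*(\w+)\s*\]\s+\[([^\]]+)\]\s+(.*)$/) {
        return ($1, $2, $3, $4);
    }
    return;
}

# Count WARNING and ERROR lines per tag (hypothetical helper).
sub count_problems {
    my (@lines) = @_;
    my %count;
    for my $line (@lines) {
        my ($ts, $level, $tag, $msg) = parse_log_line($line);
        next unless defined $level;
        next unless $level eq 'WARNING' || $level eq 'ERROR';
        $count{$tag}{$level}++;
    }
    return %count;
}

# When a log file path is given on the command line, print a summary.
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my %count = count_problems(<$fh>);
    close $fh;
    for my $tag (sort keys %count) {
        for my $level (sort keys %{ $count{$tag} }) {
            print "$tag $level $count{$tag}{$level}\n";
        }
    }
}
```

Run it with the poller log file as its only argument. Lines without a tag, such as the "Poller status" line, are ignored.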

Re: NetApp Harvest error: Illegal division by zero at /opt/netapp-harvest/plugin/cdo

No, and I was about to push P2 to a DEV cluster. Was this running fine against P1?

 

 

Re: NetApp Harvest error: Illegal division by zero at /opt/netapp-harvest/plugin/cdo

It's working on our other 8.3.2P2 clusters, and it had been working fine on this one for at least 2 weeks after we upgraded. Not sure why it stopped collecting.

Re: NetApp Harvest error: Illegal division by zero at /opt/netapp-harvest/plugin/cdo


FYI, in order to pull back any metrics I had to comment out these lines in /opt/netapp-harvest/plugin/cdot-nic-common:

 

my $rx_pct = sprintf ("%.2f", $h{$start}{$port}{rx_bytes_per_sec} / $link_speed * 100 );
my $tx_pct = sprintf ("%.2f", $h{$start}{$port}{tx_bytes_per_sec} / $link_speed * 100 );
my $pct = sprintf ("%.2f", $tx_pct);
$pct = sprintf ("%.2f", $rx_pct) if ($rx_pct > $tx_pct);
push @emit_items, "$start.$port.rx_pct_util $rx_pct $timestamp";
push @emit_items, "$start.$port.tx_pct_util $tx_pct $timestamp";
push @emit_items, "$start.$port.link_pct_util $pct $timestamp";

 

I realize this is not a solution, but I need to collect something rather than nothing. As I said, I only experienced this on one cluster, a 14-node NFS cluster; the others are fine. Collection had been working after the 8.3.2P2 upgrade, and then after a certain date it failed with: [WARNING] [nic_common] plugin failed to compile: Illegal division by zero at /opt/netapp-harvest/plugin/cdot-nic-common.
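
A less drastic workaround than commenting out the utilization lines would be to guard the division, so that only ports with an unusable link speed are skipped. This is a minimal standalone sketch, not the plugin's actual code: `pct_util` is a hypothetical helper and the sample values are made up.

```perl
use strict;
use warnings;

# Return utilization as a percentage string with two decimals, or undef
# when the link speed is missing or not positive, so the caller can skip
# the port instead of dividing by zero.
sub pct_util {
    my ($bytes_per_sec, $link_speed) = @_;
    return unless defined $link_speed && $link_speed > 0;
    return sprintf("%.2f", $bytes_per_sec / $link_speed * 100);
}

# Hypothetical sample values, both in the same unit (e.g. MB/s).
my ($rx, $tx, $link_speed) = (6.25, 2.5, 12.5);

my $rx_pct = pct_util($rx, $link_speed);
my $tx_pct = pct_util($tx, $link_speed);

if (defined $rx_pct && defined $tx_pct) {
    # Link utilization is the larger of the two directions, as in the plugin.
    my $pct = $rx_pct > $tx_pct ? $rx_pct : $tx_pct;
    print "rx=$rx_pct tx=$tx_pct link=$pct\n";
}
```

In the plugin itself the same idea would amount to skipping the three `push @emit_items` lines for any port whose `$link_speed` is zero, while still emitting metrics for every other port.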

Re: NetApp Harvest error: Illegal division by zero at /opt/netapp-harvest/plugin/cdo

Hi @dlmaldonado

 

There appears to be an issue with the link_speed counter value on some interface(s) on your cluster.  My guess is either something changed in 8.3.2P2, or after the upgrade/reboot some unused interface didn't get a value set as it should (which could also be a new behavior in 8.3.2P2).

 

 

Can you restart the poller in verbose mode, wait 5 minutes, and then restart it again in normal mode?

 

/opt/netapp-harvest/netapp-manager -restart -poller <clustername> -v

<wait 5 minutes>
/opt/netapp-harvest/netapp-manager -restart -poller <clustername>

Then provide the log file at /opt/netapp-harvest/log/<poller>_netapp-harvest.log

 

From that log I can see what the incoming link_speed values are and hopefully explain why it's not working as it should.

 

 

I will also send you a private message in case you prefer to share the logs privately.

 

Cheers,
Chris Madden

Storage Architect, NetApp EMEA (and author of Harvest)

Blog: It all begins with data

 

If this post resolved your issue, please help others by selecting ACCEPT AS SOLUTION or adding a KUDO or both!

 

Re: NetApp Harvest error: Illegal division by zero at /opt/netapp-harvest/plugin/cdo


Hi,

We are using Harvest to collect performance data from cDOT 8.2.3 and 8.3 clusters.

On Aug 23 we upgraded from 8.2.3 to 8.2.4P4, and after the upgrade the same error occurred.

Looking at the netapp-dashboard-cluster dashboard in Grafana, Ethernet port utilization is shown as greater than 3000 percent.

 

dlmaldonado wrote:

"/opt/netapp-harvest/plugin/cdot-nic-common"

--------------------------------------------------------------------------------------------------------

my $rx_pct = sprintf ("%.2f", $h{$start}{$port}{rx_bytes_per_sec} / $link_speed * 100 );
my $tx_pct = sprintf ("%.2f", $h{$start}{$port}{tx_bytes_per_sec} / $link_speed * 100 );
my $pct = sprintf ("%.2f", $tx_pct);
$pct = sprintf ("%.2f", $rx_pct) if ($rx_pct > $tx_pct);
push @emit_items, "$start.$port.rx_pct_util $rx_pct $timestamp";
push @emit_items, "$start.$port.tx_pct_util $tx_pct $timestamp";
push @emit_items, "$start.$port.link_pct_util $pct $timestamp";

--------------------------------------------------------------------------------------------------------

 

How can this code be fixed?

Re: NetApp Harvest error: Illegal division by zero at /opt/netapp-harvest/plugin/cdo

Hi @hashiya1112

 

Actually, we resolved this offline. One of the ports had link up, but at 10Mbit, and the plugin logic was not able to convert that speed correctly. I have added a fix and it will ship in the next Harvest release on the toolchest. In the meantime, perhaps you can find the port(s) that are online at 10Mbit and fix them to run at 100Mbit or faster?

 

Cheers,
Chris Madden


Re: NetApp Harvest error: Illegal division by zero at /opt/netapp-harvest/plugin/cdo

Hi Chris,

Thank you for the reply.

There is an offline port, but none of the online ports is running at 10Mbit.

Should I change the port speed with the network port modify command?

Re: NetApp Harvest error: Illegal division by zero at /opt/netapp-harvest/plugin/cdo

I tried editing nic_common as follows:

 

		if ($connection{normalized_xfer} eq 'mb_per_sec')
		{
			$link_speed = 1.25 if ($h{$start}{$port}{link_speed} == 10000000 );  #10Mbit
			$link_speed = 12.5 if ($h{$start}{$port}{link_speed} == 100000000 );  #100Mbit
			$link_speed = 125  if ($h{$start}{$port}{link_speed} == 1000000000 );  #1Gbit
			$link_speed = 1250 if ($h{$start}{$port}{link_speed} == 10000000000 ); #10Gbit
		}
		elsif ($connection{normalized_xfer} eq 'kb_per_sec')
		{
			$link_speed = 1250 if ($h{$start}{$port}{link_speed} == 10000000 );  #10Mbit
			$link_speed = 12500 if ($h{$start}{$port}{link_speed} == 100000000 );  #100Mbit
			$link_speed = 125000  if ($h{$start}{$port}{link_speed} == 1000000000 );  #1Gbit
			$link_speed = 1250000 if ($h{$start}{$port}{link_speed} == 10000000000 ); #10Gbit
		}
		elsif ($connection{normalized_xfer} eq 'b_per_sec')
		{
			$link_speed = 1250000 if ($h{$start}{$port}{link_speed} == 10000000 );  #10Mbit
			$link_speed = 12500000 if ($h{$start}{$port}{link_speed} == 100000000 );  #100Mbit
			$link_speed = 125000000  if ($h{$start}{$port}{link_speed} == 1000000000 );  #1Gbit
			$link_speed = 1250000000 if ($h{$start}{$port}{link_speed} == 10000000000 ); #10Gbit
		}
		elsif ($connection{normalized_xfer} eq 'gb_per_sec')
		{
			$link_speed = .00125 if ($h{$start}{$port}{link_speed} == 10000000 );  #10Mbit
			$link_speed = .0125 if ($h{$start}{$port}{link_speed} == 100000000 );  #100Mbit
			$link_speed = .125  if ($h{$start}{$port}{link_speed} == 1000000000 );  #1Gbit
			$link_speed = 1.25  if ($h{$start}{$port}{link_speed} == 10000000000 ); #10Gbit
		}

The error no longer appears, but the calculated Ethernet port utilization is now strange: e0M (the node management port) shows 3820 percent utilization, even though e0M is a 100Mbit port.

 

Hmm....

 

Re: NetApp Harvest error: Illegal division by zero at /opt/netapp-harvest/plugin/cdo

Hi @hashiya1112

 

Maybe give this a try:

 

		my $link_speed = 1;
		if ($connection{normalized_xfer} eq 'mb_per_sec')
		{
			$link_speed = $h{$start}{$port}{link_speed} / 8000000;
		}
		elsif ($connection{normalized_xfer} eq 'kb_per_sec')
		{
			$link_speed = $h{$start}{$port}{link_speed} / 8000;
		}
		elsif ($connection{normalized_xfer} eq 'b_per_sec')
		{
			$link_speed = $h{$start}{$port}{link_speed} / 8;
		}
		elsif ($connection{normalized_xfer} eq 'gb_per_sec')
		{
			$link_speed = $h{$start}{$port}{link_speed} / 8000000000;
		}
		next if ($link_speed == 1); # Skip posting utilization if we couldn't normalize
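
One edge case worth noting in the snippet above: if the link_speed counter arrives as 0, `$link_speed` becomes 0 rather than 1, so the `next` does not fire and a later division would still fail. The sketch below is one way to make the "could not normalize" case explicit by returning undef instead of using 1 as a sentinel. It is a standalone illustration under stated assumptions, not the fix that shipped in Harvest; `%BITS_PER_UNIT` and `normalize_link_speed` are hypothetical names.

```perl
use strict;
use warnings;

# Divisors that convert a link speed in bits/s into the unit named by
# normalized_xfer (same divisors as in the snippet above).
my %BITS_PER_UNIT = (
    b_per_sec  => 8,
    kb_per_sec => 8_000,
    mb_per_sec => 8_000_000,
    gb_per_sec => 8_000_000_000,
);

# Return the link speed expressed in $unit, or undef when the counter is
# missing, zero, or negative, or when the unit is not recognized.
sub normalize_link_speed {
    my ($bits_per_sec, $unit) = @_;
    return unless defined $bits_per_sec && $bits_per_sec > 0;
    my $divisor = $BITS_PER_UNIT{ $unit // '' };
    return unless defined $divisor;
    return $bits_per_sec / $divisor;
}

# Sample link speeds in bits/s: 10Mbit, 100Mbit, and a port reporting 0.
for my $bps (10_000_000, 100_000_000, 0) {
    my $speed = normalize_link_speed($bps, 'mb_per_sec');
    if (defined $speed) {
        print "$bps bit/s = $speed MB/s\n";
    }
    else {
        print "$bps bit/s: skipping utilization\n";
    }
}
```

In the plugin, the caller would then use `next unless defined $link_speed;` in place of the comparison against 1.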

 

If you still see a weird utilization, check higher in this post for instructions on how to collect the logs needed to understand what is happening. Send me these logs in a private message.

 

Cheers,
Chris Madden
