Active IQ Unified Manager Discussions

NetApp-Harvest: no data for some ISCSI metrics?

DARREN_REED

One of the messages that shows up in the NetApp Harvest log is this:

 

[2016-01-18 12:59:45] [WARNING] [iscsi_lif] instance data poll had matching name and uuid [ff58c09d-fccd-11e3-af4d-123478563412] so not adding it in this partial correct state to instance cache
[2016-01-18 12:59:45] [WARNING] [iscsi_lif] instance data poll had matching name and uuid [ffa08aaa-f771-11e3-bac7-123478563412] so not adding it in this partial correct state to instance cache

 

... there are many of these.

 

Is this related to "ISCSI FRONTEND DRILLDOWN" (under "NetApp Dashboard: Node") showing no graphs for the "Throughput" row?

 

Also there are no data graphs at all for "ISCSI FRONTEND DRILLDOWN" under "NetApp Dashboard: SVM", only "N/A" graphs.


madden

Hi @DARREN_REED,

 

There is a Data ONTAP bug, #764178, whereby the UUID-to-instance-name translation may fail if the cluster is large and/or busy. I worked around this in Harvest for the LUN and fcp_lif objects, and it looks like I need to do the same for iscsi_lif. I will add the workaround to the next toolchest release, for which I don't have an ETD at the moment.

 

Cheers,
Chris Madden

Storage Architect, NetApp EMEA (and author of Harvest)

Blog: It all begins with data

 

P.S.  Please select “Options” and then “Accept as Solution” if this response answered your question so that others will find it easily!

 

DARREN_REED

Is the fix more or less a matter of taking this section of code from netapp-worker:

 

elsif ($obj eq 'fcp_lif') ## Workaround bug 764178 whereby lif names are not returned in some cases via perf APIs
{
    $in = NaElement->new('net-interface-get-iter');
    $in->child_add_string("max-records",$batch_size);
    my $query = new NaElement('query');
    my $netInterfaceInfo = new NaElement('net-interface-info');
    my $dataProtocols = new NaElement('data-protocols');
    $in->child_add($query);
    $query->child_add($netInterfaceInfo);
    $netInterfaceInfo->child_add($dataProtocols);
    $dataProtocols->child_add_string('data-protocol', 'fcp');
    my $desiredAttributes = new NaElement('desired-attributes');
    $in->child_add($desiredAttributes);
    my $queryNetInterfaceInfo = new NaElement('net-interface-info');
    $desiredAttributes->child_add($queryNetInterfaceInfo);
    $queryNetInterfaceInfo->child_add_string('interface-name','');
    $queryNetInterfaceInfo->child_add_string('lif-uuid','');
}

 

and replacing "fcp" with "iscsi"?

madden

Hi,

 

Code updates are required in the section you quote, in another place a little later in netapp-worker, and also in the plugin. So it takes a few changes...

 

Cheers,

Chris

DARREN_REED

Patches welcome for testing 🙂

madden

Patch provided. If someone else needs it for Harvest 1.2.2, let me know via private message.

 

Cheers,
Chris Madden


 

DARREN_REED

After the patch I have not seen any iscsi_lif uuid messages.

 

The only message I see now is "[WARNING] [workload_detail] data-list poller next refresh at [2016-01-28 12:42:00] not scheduled because it occurred in the past"

 

Since the patch I now see data showing up on the "iSCSI Read IOPs" and "iSCSI Write IOPs" panels under the "ISCSI LIF DRILLDOWN" dashboard.

 

The "iSCSI Read Data" and "iSCSI Write Data" are still showing "No datapoints" for the individual graphs and "iSCSI Sent Throughput" and "iSCSI Receive Throughput" both display "N/A MB/s".

 

Similarly, the "Throughput" "iSCSI" panels on the "ISCSI FRONTEND DRILLDOWN" for the SVM dashboard are empty.

 

One of the changes relates to removing the SVM from instance names that were previously output as "SVM_LIF". On the "TOP iSCSI Read Latency per LIF" graph, I see "lifa" as well as "svm_lifa", but there is only data for "lifa".

 

Will removing all of the files and directories under /opt/graphite/storage/whisper/netapp/perf/*/*/svm/*/iscsi_lif/*_* clean the graphs up? (None of the files in those directories are being updated now.)

Or is there some residual reference elsewhere?

 

madden

Hi @DARREN_REED

 

The warning about "not scheduled because it occurred in the past" means that the total time to poll all object types was greater than the polling frequency, so a poll had to be skipped. If you see this once in a while I would just ignore it, but if it is consistent you might want to adjust the polling interval or the counters collected. The admin guide has some discussion of this warning message and the actions you might want to take. It could be that, with collection of iscsi_lif now working, your total collection time has increased to more than the polling frequency.
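
To illustrate why a poll gets skipped, here is a minimal Python sketch of a fixed-interval scheduler. It is not Harvest's actual code (Harvest is written in Perl), and the function name `next_poll_time` and its parameters are hypothetical. The assumption is that polls are due at fixed multiples of the interval and that any slot already in the past when the previous poll finishes is dropped rather than run late.

```python
import math


def next_poll_time(last_scheduled, interval, now):
    """Return (next_time, skipped) for a fixed-interval poller.

    last_scheduled: time the previous poll was due (seconds)
    interval:       polling frequency (seconds)
    now:            time at which the previous poll actually finished

    Hypothetical helper for illustration only; not taken from Harvest.
    """
    candidate = last_scheduled + interval
    if candidate >= now:
        # The previous poll finished in time, so nothing is skipped.
        return candidate, 0
    # The previous poll overran: every slot that is already in the past
    # is skipped, and the next poll lands on the first slot at or after now.
    skipped = math.ceil((now - candidate) / interval)
    return candidate + skipped * interval, skipped
```

With a 60-second interval, a poll cycle that takes 75 seconds misses the slot at 60 seconds and waits for the one at 120 seconds. If that happens on every cycle, half of the data points are lost, which is why a consistent warning calls for a longer interval or fewer counters.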

 

Throughput stats for iSCSI did not exist in Data ONTAP 8.2, so if your cluster is running that release, any throughput graph or panel is expected to display N/A or no data points. The required counters, read_data and write_data, were added in Data ONTAP 8.3.

 

On the question about:

>>"svm_lifa" but there is only data for "lifa"

 

 

I'm not quite sure what you mean. Maybe you renamed a LIF? Could you show a screenshot, or an ls of the directory on the Graphite server, so I can better understand? If metrics files are stale and no longer being updated (for example because a volume was deleted or a LIF was renamed), you can indeed just remove the directories and files; there is no record elsewhere. Should that metric be sent again, Graphite will find that there is no metrics file to update and will create a new one according to the frequency and retention settings in storage-schemas.conf. You can also merge data points from old to new metrics files (for example after a rename) using whisper utilities like whisper-merge.py to preserve your existing datapoints.
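
Before removing anything, it can help to list which metrics files have actually stopped updating. The following is a minimal Python sketch under stated assumptions: whisper files end in `.wsp`, a file counts as stale when its modification time is older than a chosen threshold, and the helper name `find_stale_wsp` is hypothetical. It only reports candidates and deletes nothing.

```python
import os
import time


def find_stale_wsp(root, max_age_seconds, now=None):
    """Return a sorted list of .wsp files under root whose last
    modification is older than max_age_seconds.

    Hypothetical helper for illustration; it only lists files.
    """
    now = time.time() if now is None else now
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".wsp"):
                continue  # ignore anything that is not a whisper file
            path = os.path.join(dirpath, name)
            if now - os.path.getmtime(path) > max_age_seconds:
                stale.append(path)
    return sorted(stale)


if __name__ == "__main__":
    # The root path is taken from the thread; adjust it to your install.
    for path in find_stale_wsp("/opt/graphite/storage/whisper/netapp",
                               7 * 24 * 3600):
        print(path)
```

Reviewing the list first makes it easier to decide whether to delete the old files outright or to merge them into their renamed counterparts.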

 

Hope this helps!

 

 

Cheers,
Chris Madden


 
