Discrepancy in performance data reported by the Powershell Toolkit and DFM. Need urgent help to understand why?

Hi,

We are trying to use the PowerShell Toolkit to retrieve the avg_latency and total_ops counter values per volume in our clustered environment, but we are seeing massive discrepancies between the data reported by the two systems.

Below are the details for the latency counter:

DFM :

Command : dfm perf data retrieve -o 433 -C volume:avg_latency -d 3600 -S simple -m max

Output:

Timestamp       433:XXXX:/XXXX:avg_latency

-------------------------------------------------------------------------------

2013-10-24 09:44:21     293.443

2013-10-24 09:49:21     80.594

2013-10-24 09:54:21     60.473

2013-10-24 09:59:21     35.428

2013-10-24 10:04:21     65.163

2013-10-24 10:09:20     37.584

2013-10-24 10:14:21     334.419

2013-10-24 10:19:21     1170.279

2013-10-24 10:24:21     377.154

2013-10-24 10:29:21     352.430

2013-10-24 10:34:21     172.076

Powershell:

Command: Get-NcPerfData -Name volume -Counter "avg_latency" -Instance "XXXX"

Output :

PS G:\> Get-NcPerfData -Name volume -Counter "avg_latency" -Instance "XXXX"

Name                   Uuid                                     Counters

----                   ----                                     --------

XXXX              bf9e4ac9-dd8f-4a04-8f45-9ab47c6bc8b6     {avg_latency}

PS G:\> $g = Get-NcPerfData -Name volume -Counter "avg_latency" -Instance "XXXX"

PS G:\> $g.counters

Name                                    NcController                            Value

----                                    ------------                            -----

avg_latency                             XXXX                          20742561312


The latency shown by PowerShell is 20742.561312 s and by DFM is 293.443 s.

I would like to understand why this discrepancy exists and whether I am doing something wrong.

Thanks!

Re: Discrepancy in performance data reported by the Powershell Toolkit and DFM. Need urgent help to understand why?

This is cool. I'm a NetApp PowerShell junkie and I have never played with this cmdlet.

What I do is wrap the DFM data with PowerShell, which lets me predict storage growth based on the DFM numbers. I also pull in the latency numbers and put everything in Excel.

I'm interested in the answer to this as well.

Re: Discrepancy in performance data reported by the Powershell Toolkit and DFM. Need urgent help to understand why?

When collecting perf counter data in PowerShell, you need to be aware of the type of counters you are collecting data from.  Looking at the "avg_latency" counter, we see the BaseCounter is "total_ops".

[3.0] 10.61.167.254> Get-NcPerfCounter -Name volume | ? { $_.Name -eq "avg_latency" }

AggregationStyle        :

BaseCounter             : total_ops

Desc                    : Average latency in microseconds for the WAFL filesystem to process all the operations on the

                          volume; not including request processing or network communication time

IsKey                   :

Labels                  :

Name                    : avg_latency

NcController            : 10.61.167.254

PrivilegeLevel          : basic

Properties              : average

TranslationInputCounter :

Type                    :

Unit                    : microsec

IsKeySpecified          : False

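Because the counter's Properties value is "average" and its BaseCounter is "total_ops", the raw avg_latency value is a running total, not a latency you can read directly. With this information, we know that we can calculate the average latency (in microseconds) over an interval by reading the values of the avg_latency and total_ops counters twice (separated by some time), then doing the calculation: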

(avg_latency[1] - avg_latency[0]) / (total_ops[1] - total_ops[0])

Where avg_latency[0] and total_ops[0] are the values from the first reading and avg_latency[1] and total_ops[1] are the values from the second reading.

Here is some rough PowerShell code (you can probably clean up pulling the data values from the CounterData):

[3.0] 10.61.167.254> $perfdata1 = Get-NcPerfData -Name volume -InstanceUuid 3e964011-7259-11dc-a5ef-123478563412 -Counter avg_latency, total_ops

(wait a few seconds...)

[3.0] 10.61.167.254> $perfdata2 = Get-NcPerfData -Name volume -InstanceUuid 3e964011-7259-11dc-a5ef-123478563412 -Counter avg_latency, total_ops

[3.0] 10.61.167.254> $avg_latency = @( ($perfdata1.Counters | ? { $_.Name -eq 'avg_latency' }).Value, ($perfdata2.Counters | ? {$_.Name -eq 'avg_latency' } ).Value )

[3.0] 10.61.167.254> $total_ops = @( ($perfdata1.Counters | ? { $_.Name -eq 'total_ops'}).Value, ($perfdata2.Counters | ? {$_.Name -eq 'total_ops'} ).Value )

[3.0] 10.61.167.254> ($avg_latency[1] - $avg_latency[0]) / ($total_ops[1] - $total_ops[0])

48.2258064516129
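As a possible cleanup of the rough code above, the two-sample calculation could be wrapped in a small function. This is only a sketch: Get-CounterAverage is a hypothetical helper name, not a toolkit cmdlet, and the $t0/$t1 objects are made-up stand-ins shaped like Get-NcPerfData output (a Counters collection with Name and Value) so the arithmetic can be tried without a cluster.

```powershell
# Hypothetical helper (not part of the toolkit): average of an "average"-type
# counter over the interval between two raw samples.
function Get-CounterAverage {
    param(
        [Parameter(Mandatory)] $First,     # earlier sample
        [Parameter(Mandatory)] $Second,    # later sample
        [string] $Counter = 'avg_latency',
        [string] $BaseCounter = 'total_ops'
    )

    # Pull one named counter out of a sample and convert its value to a number.
    $read = {
        param($sample, $name)
        [double] ($sample.Counters | Where-Object { $_.Name -eq $name }).Value
    }

    $deltaCounter = (& $read $Second $Counter) - (& $read $First $Counter)
    $deltaBase    = (& $read $Second $BaseCounter) - (& $read $First $BaseCounter)

    # No operations in the interval means there is nothing to average.
    if ($deltaBase -le 0) { return $null }

    $deltaCounter / $deltaBase
}

# Made-up samples for illustration only (values are not from a real system).
$t0 = [pscustomobject]@{ Counters = @(
    [pscustomobject]@{ Name = 'avg_latency'; Value = '1000000' }
    [pscustomobject]@{ Name = 'total_ops';   Value = '2000' }
) }
$t1 = [pscustomobject]@{ Counters = @(
    [pscustomobject]@{ Name = 'avg_latency'; Value = '1060000' }
    [pscustomobject]@{ Name = 'total_ops';   Value = '3000' }
) }

Get-CounterAverage -First $t0 -Second $t1   # (1060000 - 1000000) / (3000 - 2000) = 60
```

With live data you would pass $perfdata1 and $perfdata2 from the commands above instead of the stand-in objects. The function returns $null when total_ops did not increase between the two samples, which avoids a divide-by-zero on an idle volume.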

Also, check out the Invoke-NcSysstat cmdlet, which reports the latency as a .NET TimeSpan object, so no manual calculation from avg_latency and total_ops is needed:

[3.0] 10.61.167.254> Invoke-NcSysstat -Volume clusterdisks -Count 5 | select Name, TotalLatency

Name                                                        TotalLatency

----                                                        ------------

clusterdisks                                                00:00:00.0001083

clusterdisks                                                00:00:00.0001850

clusterdisks                                                00:00:00.0000703

clusterdisks                                                00:00:00.0000268

clusterdisks                                                00:00:00.0001520
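If you want to compare the TotalLatency values with the microsecond figure from the counter calculation, the TimeSpan can be converted. A minimal sketch, assuming only the standard .NET TimeSpan type; ConvertTo-Microsecond is a hypothetical helper name, not a toolkit cmdlet:

```powershell
# A .NET tick is 100 nanoseconds, so 10 ticks make up 1 microsecond.
function ConvertTo-Microsecond {
    param([Parameter(Mandatory)] [TimeSpan] $TimeSpan)
    $TimeSpan.Ticks / 10
}

# 1083 ticks is the 00:00:00.0001083 shown in the first row above.
$sample = [TimeSpan]::FromTicks(1083)
ConvertTo-Microsecond $sample   # 108.3
```

Against real output you could pipe the results through something like `select Name, @{ n = 'LatencyMicrosec'; e = { ConvertTo-Microsecond $_.TotalLatency } }`.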

-Steven

Re: Discrepancy in performance data reported by the Powershell Toolkit and DFM. Need urgent help to understand why?

Thanks for the explanation Steven! This works! Appreciate your help!