
cDOT Latency Counters Reporting Zero

richard5

Hi, when retrieving the "kernel:system" object, counters such as sys_write_latency_hist, sys_read_latency_hist, sys_read_latency, sys_avg_latency, and sys_write_latency are all zero, while others such as ops, cpu, and system_model have correct data.

Is this an API issue or an ONTAP counter issue?

 

thanks

5 REPLIES

ekashpureff

Richard -

I didn't take the time to look at the definitions of the counters, but - my best edumucated guess ...

The kernel internally doesn't have measurable latency.

It's handling all of the internal interrupts faster than the granularity of the defined instance counters.

You could look at network, disk, and protocol based latency counters?

I hope this response has been helpful to you.

At your service,

Eugene E. Kashpureff, Sr.

Independent NetApp Consultant, K&H Research http://www.linkedin.com/in/eugenekashpureff

Senior NetApp Instructor, IT Learning Solutions http://sg.itls.asia/netapp

(P.S. I appreciate points for helpful or correct answers.)

richard5

That would seem to be a useless set of counters. The catalog gives the following descriptions:

    sys_avg_latency             Average latency for all operations in the
                                system in milliseconds
    sys_read_latency            Average latency for all read operations in
                                the system in milliseconds
    sys_write_latency           Average latency for all write operations in
                                the system in milliseconds

which seem legitimate.

madden

Hi,

 

These system latency counters work in 7-Mode but do not work in cDOT right now. The reason is that in cDOT the 'frontend' protocol work and the 'backend' volume work can occur on different nodes, and these specific counters haven't been updated yet to aggregate and compute this latency info. The issue is tracked as bug ID 830366.

 

But in cDOT we have much more sophisticated counters as part of the QoS infrastructure, so I would recommend taking a look at those. After setting up policy groups (they can have no limit, i.e. a limit of INF), use the CLI command "qos statistics workload latency show" to see latency per layer: network/cluster/data/disk/QOS. I haven't used the APIs for these yet, but I think they are in object workload<something> family. Or, if you want a high-level view that is easy to implement, the protocol-specific counters for nfs/smb/iscsi/fcp, at the node and/or SVM level, are probably your best option.

 

Cheers,

Chris Madden
Storage Architect, NetApp EMEA

madden

Hi Richard,

 

I was reviewing work for a different customer and realized you can get the effect of these system counters in a different way. In cDOT we also have some aggregated counter objects, and there is one for volumes called "volume:node". It is the sum/average of all volumes on a given node and has the latency counters you are looking for: read_latency, write_latency, and avg_latency.

 

Please give these a try!

 

Cheers,

Chris Madden

Storage Architect, NetApp EMEA
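
For anyone polling the "volume:node" latency counters through the API rather than the CLI, here is a minimal sketch of how an average latency is usually derived from two raw samples. It assumes (not confirmed in this thread) that the raw latency counter is cumulative and is paired with a cumulative operation-count base counter, so the average over an interval is the latency delta divided by the ops delta. The Sample class, its field names, and the microsecond unit are hypothetical and only for illustration; check the counter catalog for the real base counter and unit.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One poll of a latency counter and its base counter (hypothetical names)."""
    latency_total: int  # cumulative latency, assumed to be in microseconds
    ops_total: int      # cumulative operation count (assumed base counter)


def average_latency(prev: Sample, curr: Sample) -> float:
    """Average latency per operation between two polls.

    Returns 0.0 when no operations completed in the interval, which
    avoids a division by zero on an idle node.
    """
    ops_delta = curr.ops_total - prev.ops_total
    if ops_delta <= 0:
        return 0.0
    return (curr.latency_total - prev.latency_total) / ops_delta


if __name__ == "__main__":
    # Made-up numbers purely to show the arithmetic.
    first = Sample(latency_total=1_000_000, ops_total=500)
    second = Sample(latency_total=1_600_000, ops_total=800)
    print(average_latency(first, second))  # 600000 / 300 = 2000.0
```

The same calculation would apply to read_latency, write_latency, and avg_latency, each with its own base counter.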

tflammger

@madden wrote:

 I haven't used the APIs for these yet, but I think they are in object workload<something> family.

 



 

 

Thanks, you're right... I've been looking everywhere for the "qos statistics performance *" in the ZAPI so I can get some long-run data for a node balancing exercise.

They're buried under perf as workload objects.

 

<object-info>
<description>The workload CM object that provides round-trip statistical information.</description>
<name>workload</name>
<privilege-level>basic</privilege-level>
</object-info>

<object-info>
<description>The workload CM object that provides round-trip statistical information.</description>
<name>workload:constituent</name>
<privilege-level>basic</privilege-level>
</object-info>

<object-info>
<description>The workload CM object that provides round-trip statistical information.</description>
<name>workload:policy_group</name>
<privilege-level>basic</privilege-level>
</object-info>

<object-info>
<description>The workload_detail CM object that provides service center-based statistical information. Note: this object returns a very large number of instances. Querying by instance name and using wild cards may improve response times.</description>
<name>workload_detail</name>
<privilege-level>advanced</privilege-level>
</object-info>
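
If you want to pick the workload objects out of a listing like the one above in a script, a rough sketch with Python's standard xml.etree.ElementTree could look like this. The enclosing <objects> root element and the function name workload_objects are assumptions added only so the fragment parses as one document; the real API response will have its own wrapper, and the descriptions are shortened here.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the listing above (descriptions shortened).
# The <objects> wrapper is hypothetical, added so the fragment is well-formed.
LISTING = """
<objects>
  <object-info>
    <description>Round-trip statistical information.</description>
    <name>workload</name>
    <privilege-level>basic</privilege-level>
  </object-info>
  <object-info>
    <description>Round-trip statistical information.</description>
    <name>workload:constituent</name>
    <privilege-level>basic</privilege-level>
  </object-info>
  <object-info>
    <description>Round-trip statistical information.</description>
    <name>workload:policy_group</name>
    <privilege-level>basic</privilege-level>
  </object-info>
  <object-info>
    <description>Service center-based statistical information.</description>
    <name>workload_detail</name>
    <privilege-level>advanced</privilege-level>
  </object-info>
</objects>
"""


def workload_objects(xml_text: str) -> list[tuple[str, str]]:
    """Return (name, privilege level) for every object whose name starts with 'workload'."""
    root = ET.fromstring(xml_text.strip())
    found = []
    for info in root.iter("object-info"):
        name = info.findtext("name") or ""
        if name.startswith("workload"):
            found.append((name, info.findtext("privilege-level") or ""))
    return found


if __name__ == "__main__":
    for name, level in workload_objects(LISTING):
        print(f"{name:25} {level}")
```

Note that workload_detail is listed at the advanced privilege level, so a script like this is also a quick way to see which objects need elevated privileges.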

Through perf --> workload:policy_group, the counters give you the same data as a "qos statistics performance show -iterations 1" CLI call.

 

 
