Active IQ Unified Manager Discussions
While reviewing some Performance Advisor stats with a customer, we noticed a counter called other_latency. What does this counter mean, and how do we explain it to our customers?
That is latency for operations other than reads or writes. For example, in an NFS environment, metadata operations such as GETATTR and ACCESS calls are counted as other_ops, and their response time is measured by other_latency.
Mike
Basically, these are operations done by the system, not by the user.
Regards
adai
No, other_latency and other_ops are not related to system work. They are related to protocol operations initiated by a client, but they are not read or write operations.
Mike
Thanks Mike for correcting me. So what are they, if not reads or writes?
Regards
adai
I don't have a full list, but there are a number of different types of operations in an NFS environment that are not reads or writes. Examples would be GETATTR, ACCESS, and LOOKUP calls.
Mike
The nfsv3 counter object in Data ONTAP tells you what all of these 'other' operations are for NFS. I'm pretty sure there is a similar object for CIFS. Here is a list of the NFSv3 operations other than read and write:
null, getattr, setattr, lookup, access, readlink, create, mkdir, symlink, mknod, remove, rmdir, rename, link, readdir, readdirplus, fsstat, fsinfo, pathconf, commit
There are three counters in the nfsv3 counter object that are extremely useful for monitoring these "other" operations if you need to. You can create custom views in Performance Advisor that will show each of these counters over time. Very useful for troubleshooting "chatty" NFS applications that generate a lot of 'other' NFS operations.
Name: nfsv3_op_count
Description: Array of select NFS v3 operation counts
Properties: delta
Unit: none
Size: 22 column array
Column names: null, getattr, setattr, lookup, access, readlink, read, write, create, mkdir, symlink, mknod, remove, rmdir, rename, link, readdir, readdirplus, fsstat, fsinfo, pathconf, commit
Name: nfsv3_op_percent
Description: Array of select NFS v3 operations as a percentage of total NFS v3 operations
Properties: percent
Unit: percent
Size: 22 column array
Column names: null, getattr, setattr, lookup, access, readlink, read, write, create, mkdir, symlink, mknod, remove, rmdir, rename, link, readdir, readdirplus, fsstat, fsinfo, pathconf, commit
Base Name: nfsv3_ops
Base Description: Total number of NFS v3 operations per second
Base Properties: rate
Base Unit: per_sec
Name: nfsv3_op_latency
Description: Array of latencies of select NFS v3 operations
Properties: average
Unit: microsec
Size: 22 column array
Column names: null, getattr, setattr, lookup, access, readlink, read, write, create, mkdir, symlink, mknod, remove, rmdir, rename, link, readdir, readdirplus, fsstat, fsinfo, pathconf, commit
Base Name: nfsv3_op_latency_base
Base Description: Array of select NFS v3 operation counts for latency calculation
Base Properties: delta,no-display
Base Unit: none
Base Size: 22 column array
Base Column names: null, getattr, setattr, lookup, access, readlink, read, write, create, mkdir, symlink, mknod, remove, rmdir, rename, link, readdir, readdirplus, fsstat, fsinfo, pathconf, commit
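If you just want a quick snapshot of these counters outside of Performance Advisor, they can also be read at the controller CLI with the stats command. This is from a 7-Mode system and the exact syntax may vary by Data ONTAP release, so treat it as a rough sketch rather than verified output:
stats explain counters nfsv3    <- prints counter descriptions like the ones above
stats show nfsv3                <- dumps the current values, one entry per operation
The per-operation values come back in the same 22-column order listed above, so it is easy to spot which metadata call is dominating on a "chatty" client.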
Hey Keahey,
We had the exact same issue. Other_ops were showing up in Performance Advisor and completely throwing off our graphs. After unsuccessful troubleshooting with both NetApp and VMware, we finally determined that the Veeam monitoring platform was constantly enumerating our NFS datastores. After shutting off the Veeam collector service, other_ops went away. I believe Veeam released an update that fixes this behavior, but I wouldn't doubt that other monitoring platforms could cause the same issue.
Hope that helps.
Thanks,
Does anyone know what kinds of operations would be counted as other_ops and included in other_latency for a volume that is accessed by FCP only? I have a customer who is seeing sub-5ms latency for both reads and writes, but he is concerned about other_latency spikes of 40ms. Any assistance would be appreciated.
Thanks,
Mike
I am running into this problem as well. I'm seeing 600-millisecond other_latency spikes in the Volume Latency view in Performance Advisor. This is a Fibre Channel LUN that is storing Oracle data files.
My experience has changed since the comments I added to this thread last June. Back in the Data ONTAP 7.3 days, I typically saw only protocol-based traffic and latency data in the volume-based counters. With 8.0, I have confirmation that some system operations can show up in these counters, and I've seen that on the systems I've looked at as well.
There is a way, though, to narrow this down and determine whether system operations or protocol operations are causing the high volume latency. There is a set of per-protocol volume counters that you can enable with the "dfm options set perfAdvisorShowDiagCounters=Enabled" command. For example, instead of using counters like this:
volume:other_ops
volume:other_latency
volume:read_ops
volume:read_latency
...etc...
Use the protocol-based counters that look like this:
volume:fcp_other_ops
volume:fcp_other_latency
volume:fcp_read_ops
volume:fcp_read_latency
...etc...
Those per-protocol volume counters exist for all protocols (fcp, iscsi, cifs, and nfs). Compare them to the default volume counters and see whether the protocol counters are more in line with what you are expecting. Hope that helps,
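To make the comparison concrete, after enabling the diag counters you can build a custom view that charts the default counter next to its per-protocol equivalents for the affected volume. The counter names below just follow the pattern Mike describes; I haven't verified every one on every release:
dfm options set perfAdvisorShowDiagCounters=Enabled
volume:other_latency         <- all other ops, client plus system work
volume:fcp_other_latency     <- FCP client other ops only
volume:iscsi_other_latency   <- iSCSI client other ops only
volume:cifs_other_latency    <- CIFS client other ops only
volume:nfs_other_latency     <- NFS client other ops only
If volume:other_latency spikes while all of the per-protocol other_latency counters stay flat, the extra latency is coming from system (non-client) work on the volume rather than from client protocol operations.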
Mike
Is it possible that LUN misalignment could cause other_latency to go through the roof like this?
I don't believe so. Also, I have seen this increase in non-protocol other_latency on volumes where there was no misalignment (and not even any LUNs).
Is other I/O prioritized differently than reads and writes? The reason I ask is that the latency is just off the charts and doesn't seem to coincide with how busy the disks are.
System-level work would be prioritized lower than protocol-level work, which is likely why you will see high volume other_latency values that don't line up with the protocol-specific other_latency counters if you look at each one individually.
Hello,
I am using Performance Advisor to investigate performance issues on our FAS3270s. I have pinpointed several volumes with high latency; however, in several cases it is due to "other" latency. This is true for several NFS, CIFS, and iSCSI volumes. I'm not seeing a clear-cut resolution in this thread to the question of how to determine exactly what "other" latency consists of on a volume. Does anyone else have any insight on this? Is the only option to open a case with NetApp Support to determine the cause on an individual basis?