ONTAP Discussions

Latency difference between LIFs on same port

MaximRonsse

Hello!

 

I've been racking my brain over an issue I'm having with a 4-node MetroCluster FC (so 2 FAS8200 nodes per site) running in a production environment.

 

So I have two LIFs on the same controller, hosted on the same LACP group (tested with both one and two member ports). Accessing volumes through the first LIF, I saw latencies of up to 23ms. Accessing these same volumes on the second LIF, I saw latencies of up to 1ms. I'm measuring against volumes hosted on the same controller, as well as on the other controller, which gives me the same results.

 

I've done the usual things:

  • Checked CPU/disk utilization (sysstat -x on all controllers): CPU usage stayed below 25% over a few minutes, disk usage below 70%.
  • Checked volume latency (qos statistics volume performance): 50µs latency.
  • Checked port usage through AIQUM: one physical port was using 9% of 10Gbps.
  • There is no duplicate IP in the subnet

 

Any help/suggestions are greatly appreciated! If you need the output of any commands or logs, I'll provide them. I already have quite a bit of output, but I left it out to keep the post short.

 

Thanks!

5 REPLIES

Re: Latency difference between LIFs on same port

paul_stejskal

Where are you measuring the latency, and with which protocol? The port shouldn't affect the internal ONTAP latency, so I'd suggest capturing packet traces and reviewing them to make sure there isn't packet loss or something else going on.

Re: Latency difference between LIFs on same port

MaximRonsse

Thanks for the reply, Paul!

 

All good questions and suggestions indeed. 

 

Sorry for not mentioning it; I'm seeing the latencies over both NFS (over TCP) and CIFS. Actually, even a simple ping to the LIF shows the higher latencies.
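For anyone who wants to reproduce the ping comparison, here is a minimal Python sketch. It assumes a Linux client whose ping prints per-reply lines in the usual iputils form (`time=0.412 ms`); the function names and the default packet count are hypothetical choices for illustration, not something from my actual setup.

```python
import re
import statistics
import subprocess

# Matches the per-reply round-trip time in Linux iputils ping output,
# e.g. "time=0.412 ms" (assumed output format; other ping builds may differ).
RTT_PATTERN = re.compile(r"time[=<](\d+(?:\.\d+)?)\s*ms")


def parse_rtts(ping_output: str) -> list[float]:
    """Return every per-reply RTT (in milliseconds) found in ping's text output."""
    return [float(value) for value in RTT_PATTERN.findall(ping_output)]


def summarize(rtts: list[float]) -> dict[str, float]:
    """Reduce a list of RTT samples to min / median / max."""
    return {
        "min": min(rtts),
        "median": statistics.median(rtts),
        "max": max(rtts),
    }


def ping_lif(address: str, count: int = 20) -> dict[str, float]:
    """Ping one LIF address `count` times and summarize the RTTs.

    `address` is whatever IP the LIF uses; nothing here is ONTAP-specific.
    """
    result = subprocess.run(
        ["ping", "-c", str(count), address],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit (packet loss) is handled below
    )
    rtts = parse_rtts(result.stdout)
    if not rtts:
        raise RuntimeError(f"no ping replies parsed for {address}")
    return summarize(rtts)
```

Calling `ping_lif()` once per LIF address from the same client and comparing the two medians should show whether the gap is already visible at the ICMP level, before NFS or CIFS is involved at all.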

 

At this point the "how did you measure latency over NFS/CIFS" question is probably not relevant anymore, but I used both AIQUM and strace on Linux (which shows the time it takes to open/list files).
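As a rough alternative to reading strace timings by hand, the same open/list measurement can be sketched in Python. This is only an illustration: it assumes the same volume is mounted twice on the client, once through each LIF, and the function names are hypothetical. Client-side attribute and directory caching can hide the real round trip, so treat the numbers as indicative only.

```python
import os
import statistics
import time


def time_listing(path: str, repeats: int = 10) -> list[float]:
    """Return the wall-clock time (in ms) of `repeats` directory listings of `path`."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        os.listdir(path)  # the listing itself is discarded; only the timing matters
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def compare_mounts(path_a: str, path_b: str, repeats: int = 10) -> dict[str, float]:
    """Return the median listing time in ms for each of the two mount points."""
    return {
        path: statistics.median(time_listing(path, repeats))
        for path in (path_a, path_b)
    }
```

Passing the two mount points to `compare_mounts()` gives one median per LIF, which is easier to compare across runs than individual strace lines.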

 

I already did a tcpdump on the client's end, which shows no TCP retransmissions occurring. Or would you suggest doing the same on the controller's end?

 

My money is on the switch in between, so I'm also investigating that in parallel.

 

By the way, I'm running 9.7P6 at the moment.

Re: Latency difference between LIFs on same port

paul_stejskal

Is a case open? Honestly I'd need to look at the data to know why.

Re: Latency difference between LIFs on same port

MaximRonsse

No case has been opened yet. I guess I'll do that then.

 

Thanks for the help on here anyway!

Re: Latency difference between LIFs on same port

paul_stejskal
Maxim,

Sorry man! Without data it's hard to say what is going on, so I can't confirm. A case will let me or someone on my team (Perf L2, or probably more likely NAS L2) review the data.