ONTAP Discussions
Hello!
I've been racking my brain over an issue with a 4-node MetroCluster FC (two FAS8200 nodes per site) running in a production environment.
I have two LIFs on the same controller, hosted on the same LACP group (tested with both one and two member ports). Accessing volumes through the first LIF, I see latencies of up to 23 ms. Accessing the same volumes through the second LIF, I see latencies of up to 1 ms. The results are the same whether the volumes are hosted on that controller or on the other controller.
I've done the usual things:
Any help or suggestions are greatly appreciated! If you need the output of any commands or logs, I'll provide them. I already have quite a lot of output, but I left it out to keep the post short.
Thanks!
Where are you measuring the latency, and with which protocol? The port shouldn't affect the internal ONTAP latency, so I'd say get packet traces and review them to make sure there isn't packet loss or something else going on.
Thanks for the reply, Paul!
All good questions and suggestions indeed.
Sorry for not mentioning it; I'm seeing the latencies over both NFS (over TCP) and CIFS. Actually, even a simple ping to the LIF shows the higher latencies.
At this point the "how did you measure latency over NFS/CIFS" question is probably not relevant anymore, but I used both AIQUM and strace on Linux (which shows the time it takes to open/list files).
I already did a tcpdump on the client's end, which shows no TCP retransmissions occurring. Or would you suggest doing the same on the controller's end?
My suspicion is the switch in between, so I'm also investigating that in parallel.
By the way, I'm running 9.7P6 at the moment.
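For anyone wanting to reproduce the per-LIF comparison from a client, here is a minimal Python sketch. It times TCP handshakes to each LIF and prints min/median/max per LIF. The LIF names and addresses are hypothetical placeholders, and port 2049 (NFS) is an assumption; use 445 for CIFS/SMB. Note that this only measures network round-trip plus connection setup from the client's point of view, not ONTAP's internal latency, so it complements rather than replaces packet traces.

```python
import socket
import statistics
import time


def tcp_connect_ms(host, port, timeout=2.0):
    """Return the time in milliseconds to complete one TCP handshake."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection is closed again as soon as it is established
    return (time.perf_counter() - start) * 1000.0


def summarize(samples_ms):
    """Return min, median and max of a list of latency samples (ms)."""
    return {
        "min": min(samples_ms),
        "median": statistics.median(samples_ms),
        "max": max(samples_ms),
    }


def compare_lifs(lifs, port=2049, count=20, probe=tcp_connect_ms):
    """Probe every LIF `count` times and return a summary per LIF.

    `probe` is injectable so the logic can be tested without a network.
    """
    results = {}
    for name, address in lifs.items():
        samples = [probe(address, port) for _ in range(count)]
        results[name] = summarize(samples)
    return results


if __name__ == "__main__":
    # Hypothetical addresses; replace with the two LIFs being compared.
    lifs = {"lif1": "192.0.2.10", "lif2": "192.0.2.11"}
    for name, stats in compare_lifs(lifs).items():
        print(f"{name}: min={stats['min']:.2f} ms "
              f"median={stats['median']:.2f} ms max={stats['max']:.2f} ms")
```

If one LIF consistently shows a much higher median than the other from the same client, that points at the network path (switch, port channel hashing, routing) rather than at the volumes.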
Is a case open? Honestly I'd need to look at the data to know why.
No case has been opened yet. I guess I'll do that then.
Thanks for the help on here anyway!