ONTAP Discussions

Any limit on the number of NFS mounts on a LIF?

netappmagic

Is there a recommended number, or a hard limit, for how many NFS mounts a LIF associated with a node should have?

 

 

parisi

Recommendation is to spread mounts across multiple nodes/LIFs in the cluster. While you can establish up to 100,000 NAS connections on a single node, there are other resource limits you might hit, such as memory, CPU or exec contexts.

 

TR-4067 covers it in networking considerations.

@parisi , 

Thanks!

 

NFS datastores (layer-2) and volumes (layer-3) are heavily used here. NFS mounts of both types do not seem to be evenly distributed across LIFs/nodes. Each node carries 1,200-1,600 mounts.

I have also noticed packet discards on some ports within the ifgrps (4x10GbE or 4x40GbE). On some nodes, the discard rate on some ports has reached as high as 1.x% or 2.x%. As I understand it, 0.1% or below could be treated as acceptable.

Should packet discards be used as the measurement to determine whether LIFs/nodes are overloaded with NFS mounts and whether the mounts need to be redistributed? Are there any other measurements?

 

Should we upgrade the ifgrp from 4x10GbE to 4x40GbE on the nodes whose 10GbE ports show packet discards?

Packet discards aren't generally caused by "too many NFS mounts."

 

ONTAP won't discard packets if it's overloaded; instead it uses flow control mechanisms to tell the clients to wait until resources are freed. Packet discards are usually network related issues - bad cables, ifgrp config issues, etc. 

 

I'd suggest opening up a support case to try to narrow down why those are happening.

 

What issue prompted you to look at packet discards? Where are you seeing those accumulate? Ifstat on the cluster? Packet traces?

We have sometimes experienced performance issues with batch jobs running on Linux VMs, even though latency on the datastore or volume is only 1-2 ms. We couldn't identify the root cause. That led me to look at the ifgrps and ports, where I found the packet discards.

 

I ran "node run -node node-name ifstat port-name" and found that the discards/total frames ratio is higher than expected. Since you point out that too many NFS mounts won't cause discards, would increasing the bandwidth from 4x10GbE to 4x40GbE improve things? I ask because one of the HA pairs has 4x40GbE and its discard rate is very low, < 0.1%.
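For reference, the discards/total frames comparison above can be sketched in a few lines of Python. This is only an illustration: the counter values are made up, the function names are hypothetical, and the 0.1% threshold is the rule of thumb mentioned in this thread, not an official limit.

```python
# Rule-of-thumb threshold quoted in this thread (an assumption, not an official limit).
ACCEPTABLE_PERCENT = 0.1


def discard_rate_percent(discards: int, total_frames: int) -> float:
    """Return discards as a percentage of total frames (0.0 when no frames were seen)."""
    if total_frames <= 0:
        return 0.0
    return 100.0 * discards / total_frames


def exceeds_threshold(discards: int, total_frames: int,
                      threshold: float = ACCEPTABLE_PERCENT) -> bool:
    """True when the discard percentage is above the chosen threshold."""
    return discard_rate_percent(discards, total_frames) > threshold


# Made-up counters for two ports, for illustration only.
for port, discards, frames in [("port-a", 15_000, 1_000_000), ("port-b", 500, 1_000_000)]:
    rate = discard_rate_percent(discards, frames)
    print(port, f"{rate:.2f}%", "high" if exceeds_threshold(discards, frames) else "ok")
```

The numbers would come from the discards and total frames fields reported by ifstat for each port.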

 

 

I'd suggest having a look at the section in TR-4067 on RPC slot tables/exec context blocking starting on page 111.

 

https://www.netapp.com/us/media/tr-4067.pdf

 

See if your cluster nodes are getting excess blocked execs as per page 112. That may be the source of your latency. Remediation is to configure the clients' slot tables lower or to use nconnect (if your client OS supports it). That would be done on the NFS client VMs.
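As a rough client-side sketch of the slot table remediation, the Python below reads the current sunrpc TCP slot table maximum on a Linux client and renders the module option line that would lower it. The sysctl path and the sunrpc module parameter are standard Linux names; the helper function names and the target value of 128 (the per-connection limit quoted elsewhere in this thread) are assumptions for illustration, so check TR-4067 for the recommended value in your environment.

```python
from pathlib import Path

# Standard Linux sysctl file for the sunrpc TCP slot table maximum
# (only present when the sunrpc module is loaded).
SLOT_TABLE_SYSCTL = Path("/proc/sys/sunrpc/tcp_max_slot_table_entries")


def read_max_slot_entries(path: Path = SLOT_TABLE_SYSCTL) -> int:
    """Return the slot table maximum currently in effect on this client."""
    return int(path.read_text().strip())


def modprobe_option_line(entries: int) -> str:
    """Render the sunrpc option line, e.g. for a file under /etc/modprobe.d/."""
    return f"options sunrpc tcp_max_slot_table_entries={entries}"


# Only touch /proc when the file actually exists on this machine.
if SLOT_TABLE_SYSCTL.exists():
    print("current maximum:", read_max_slot_entries())
print(modprobe_option_line(128))  # 128 is an assumed target value
```

A module option like this normally takes effect only after the sunrpc module is reloaded or the client is rebooted.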

 

As for discards, you could try to chase that rabbit if you want, and it likely does need to be resolved, but I don't know if it solves your latency problem.

I have read the section starting on page 111. There is obviously exec context blocking on some storage nodes, because the counter increases by at least 10k, and often much more, within a 1-minute period. We are running ONTAP 9.7P11, and most of the Linux VM servers run Red Hat 7.9, all using the default RPC slot tables.

As I said previously, each storage node has 1,200-1,600 NFS mounts, so my questions are:
1.  Should we reduce the workload (NFS mounts) on the nodes with more exec context blocking by moving workloads to other nodes, thus redistributing load among the nodes?

2.  For the server in question, because we are all using the default RPC slot tables, there seems to be no way to identify whether RPC slot tables/exec context blocking is the issue here. What can I do in this situation?

 

Thank you for your valuable message!

  

ONTAP 9.8 and later has an EMS message that tells you which client is overrunning the exec contexts. ONTAP 9.9.1 and later introduced exec context throttling for some node types.

 

Spreading the workload across nodes may help, but only if a single node’s resources are being overrun. If the resource issue is constrained to single TCP connections, spreading the workload won’t necessarily help there - only reducing the client-driven slot tables or using nconnect will.

I have some follow-up questions below, if you can please help me out:

 

1.  Based on the large increase in execs_blocked_on_cid reported by that "statistics" command (> 10k in a minute), it looks like ONTAP is pushing back client requests on some nodes in the cluster. Can we then conclude that these nodes have performance issues caused by too many concurrent NFS operation requests?

 

2. In your document, you used the workload "creating lots of files or directories" as the example to illustrate the concepts. What about Oracle database files? Could a workload like this also generate a large number of simultaneous NFS requests or overrun exec contexts? I can see about 300 Oracle dbf datafiles on one NFS file system and, based on the timestamps, all of these files keep changing from time to time (I know they are not OLTP-type applications). To tell accurately whether this client really overruns exec contexts, we would have to upgrade ONTAP to 9.8 first. Correct?

 

3. Would it be okay to configure nconnect without knowing whether the client really is overrunning exec contexts?

 

4.  On page 113, you stated that 3ms of latency was added, but Figure 19 showed ~9ms latency. Why?

5.  Will the overrun issue also apply to an NFS datastore mounted on an ESXi host as the client?
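To make the "> 10k in a minute" observation concrete, here is a minimal sketch of turning two samples of the execs_blocked_on_cid counter into a per-minute rate. The sample values and the function name are hypothetical; only the counter name comes from this thread.

```python
def blocked_per_minute(first_count: int, second_count: int, seconds_between: float) -> float:
    """Rate of increase of a cumulative counter, normalized to one minute."""
    if seconds_between <= 0:
        raise ValueError("seconds_between must be positive")
    return (second_count - first_count) * 60.0 / seconds_between


# Hypothetical samples of execs_blocked_on_cid taken 60 seconds apart on one node.
rate = blocked_per_minute(1_200_000, 1_212_500, 60)
print(f"{rate:.0f} blocked execs per minute")
```

A high rate on its own only shows that blocking is happening; it does not say whether the cause is per-connection or per-node exhaustion.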

 

 

1. No, we can't make that conclusion. The exec blocking can happen due to TCP resource exhaustion (slot tables per TCP connection) *or* node level exhaustion. It is not clear here which one is causing the issue. ONTAP 9.8 and/or 9.9.1 could help some there.

 

2. We do see Oracle databases causing slot table exhaustion on occasion, which is why it's covered in the Oracle Best Practice guide on page 36. dNFS can also help there.

 

https://www.netapp.com/pdf.html?item=/media/8744-tr3633pdf.pdf

 

3. nconnect won't hurt you (other than using up more TCP connections - there is a 100K connection limit per node); the issue is whether the client OS supports it. RHEL 7.9 does not, as far as I am aware, but RHEL 8.3 and later does.

 

4. 65536 slot figure had 9ms latency. 128 slot figure had 6ms latency. 9ms - 6ms = 3ms

 

5. ESXi can also overrun the slot tables, yes. But again, we don't have a clear picture on which clients are possibly overrunning the TCP connections. ONTAP 9.8 has an EMS that gives more detail. For example (also seen on page 113):

 

cluster::*> event log show -node tme-a300-efs01-0* -message-name nblade.execsOverLimit
Time                Node             Severity      Event
------------------- ---------------- ------------- ---------------------------
4/8/2021 17:01:30   node1            ERROR         nblade.execsOverLimit: The number of in-flight requests from client with source IP x.x.x.x to destination LIF x.x.x.x (Vserver 20) is greater than the maximum number of in-flight requests allowed (128). The client might see degraded performance due to request throttling.
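If you collect these EMS events into a log, a small parser can pull out which client and LIF are involved. The sketch below is an assumption-laden example: the regular expression is written against the message text shown above only, the function name is hypothetical, and other ONTAP releases may word the event differently.

```python
import re

# Pattern written against the sample nblade.execsOverLimit text quoted above.
EMS_PATTERN = re.compile(
    r"source IP (?P<client>\S+) to destination LIF (?P<lif>\S+) "
    r"\(Vserver (?P<vserver>\d+)\).*?allowed \((?P<limit>\d+)\)"
)


def parse_execs_over_limit(message: str):
    """Return the client, LIF, Vserver and limit from one event, or None if no match."""
    normalized = " ".join(message.split())  # tolerate wrapped lines and extra spaces
    match = EMS_PATTERN.search(normalized)
    return match.groupdict() if match else None


sample = (
    "nblade.execsOverLimit: The number of in-flight requests from client with "
    "source IP x.x.x.x to destination LIF x.x.x.x (Vserver 20) is greater than "
    "the maximum number of in-flight requests allowed (128). The client might "
    "see degraded performance due to request throttling."
)
print(parse_execs_over_limit(sample))
```

Counting the parsed events per client IP would show which clients overrun their connections most often.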

Appreciate your help!

 

1.  Each NFS mount corresponds to a CID, and for each CID ONTAP allows a maximum of 128 execs. If I have 2 NFS mounts on a client, ONTAP would then allow 2x128 = 256 execs. If nconnect is not used, there is one TCP connection per mount, so 2 NFS mounts means 2 TCP connections. ONTAP supports only 128 slot tables per TCP connection. Are these statements all correct?

2.  For clients with the default of 65536 slot tables, should we change it to 128 as the maximum value, because too many slot tables on these clients may exhaust exec contexts on the nodes? Does that make sense?

3.  A node can have 100K TCP connections, and each TCP connection allows 128 exec contexts, so the node can allow 12,800K exec contexts. Correct?

4.  ONTAP enables NAS flow control, so for a cluster that mainly uses NAS protocols, we don't need to worry about whether to disable or enable flow control on the NICs/switches. Correct?

 

 

 

1. Correct. Plus, there are per-node exec context limits, so if you have 1000 mounts and each uses 128 execs per CID, you could potentially run out of execs per node. TR-4067 has a section on how to see the max execs per node allowed and to see if you're approaching that limit on page 109.

 

2. Yes, that would help reduce the overruns. However, I have seen in large environments with lots of clients where the value had to be lowered even further (16 in some cases) due to the per-node limits. ONTAP 9.9.1 has exec context throttling (page 109) that helps mitigate that issue.

 

3. No. See page 109 of the TR - different node types have different exec limits depending on RAM. And execs aren't always allocated; once the operation is done, ONTAP releases the exec back to the system for a new operation.

 

4. NAS flow control is specific to CIFS/NFS operations. Switch flow control is for all Ethernet operations. They are not really related to one another, so it's ultimately your choice whether to enable or disable it on the switch.
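The arithmetic in answers 1 to 3 can be sketched as follows. Treat it as an illustration only: the 128-per-connection figure and the 1000-mount example come from this thread, but the function names are hypothetical, the per-node exec limit is a made-up placeholder (the real values depend on node type and are on page 109 of TR-4067), and counting nconnect as extra connections per mount is an assumption based on the nconnect answer earlier in the thread.

```python
PER_CONNECTION_EXEC_LIMIT = 128  # per-CID limit quoted in this thread


def potential_exec_demand(mounts: int, connections_per_mount: int = 1,
                          per_connection_limit: int = PER_CONNECTION_EXEC_LIMIT) -> int:
    """Worst-case in-flight requests if every connection filled its per-CID limit."""
    return mounts * connections_per_mount * per_connection_limit


def could_exhaust_node(mounts: int, node_exec_limit: int,
                       connections_per_mount: int = 1) -> bool:
    """True when worst-case demand is above the node-wide exec limit."""
    return potential_exec_demand(mounts, connections_per_mount) > node_exec_limit


# 1000 mounts at 128 execs each, as in answer 1.
print(potential_exec_demand(1000))  # 128000

# PLACEHOLDER node limit for illustration only; look up the real value for your node type.
HYPOTHETICAL_NODE_EXEC_LIMIT = 10_000
print(could_exhaust_node(1000, HYPOTHETICAL_NODE_EXEC_LIMIT))
```

This is why the 100K x 128 figure is only a theoretical ceiling: the node-wide limit is reached long before every connection can use its full 128 execs, and execs are released as soon as each operation completes.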
