Re: 7-mode to C-mode TCP timeout

Franky_GT · 2016-10-21

OK, we've found a change in behaviour between 7-mode and C-mode.

In 7-mode filers the following is implemented:

rpcmod:svc_idle_timeout

Description: Controls the duration of time on the server that a connection between the client and server is allowed to remain idle before being closed.

Data Type: Long integer (32 bits on 32-bit platforms and 64 bits on 64-bit platforms)

Default: 360,000 milliseconds (6 minutes)

This means that NFS sessions over TCP are closed after 6 minutes if no traffic is seen. When this happens, the server shuts the session down gracefully by starting a FIN handshake. The client automatically starts a new session once the NFS mount is used again.

In C-mode this doesn't happen! Instead, C-mode has a keepalive mechanism that polls the client after 2 hours of idle time, and if it doesn't receive an answer it removes the session from its session table.

How can we get the old behaviour back? We need it when there are firewalls between the clients and the NetApp.
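
To make the firewall concern concrete, here is a minimal Python sketch. The two server timers are the ones described above; the one-hour firewall idle timeout and the function name are assumptions for illustration only, not values from any particular product.

```python
# Idle timers in seconds. The two server values are taken from this post;
# the firewall value is a hypothetical example.
SEVEN_MODE_IDLE_CLOSE = 6 * 60        # 7-mode sends FIN after 6 minutes idle
CMODE_KEEPALIVE_IDLE = 2 * 60 * 60    # C-mode first probes after 2 hours idle
EXAMPLE_FIREWALL_IDLE = 60 * 60       # assumed stateful-firewall idle timeout


def firewall_forgets_first(server_idle_timer, firewall_idle_timeout):
    """Return True if the firewall drops its state for an idle connection
    before the server closes or probes that connection."""
    return firewall_idle_timeout < server_idle_timer


for name, timer in [("7-mode", SEVEN_MODE_IDLE_CLOSE),
                    ("C-mode", CMODE_KEEPALIVE_IDLE)]:
    stale = firewall_forgets_first(timer, EXAMPLE_FIREWALL_IDLE)
    print(name, "leaves a stale session behind the firewall:", stale)
```

With these example numbers, 7-mode closes the connection well before the firewall forgets it, while C-mode only probes after the firewall state is already gone.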

Thanks in advance!

Frank

hariprak · 2016-10-25

Hi,

I hope this KB article helps: https://kb.netapp.com/support/s/article/troubleshooting-workflow-specified-network-resource-or-device-is-no-longer-available

Thanks


Franky_GT · 2016-10-27

Unfortunately this is unrelated.

We have also had a support call open for 2 months now, and they keep pointing me to the parameter "rpcsec-ctx-idle", which is also unrelated. It can't be the case that nobody within NetApp knows what is implemented in the kernel!

The code I'm talking about was implemented in the Linux kernel by an ex-NetApp employee, as seen here: https://oss.oracle.com/~cel/linux-2.6/2.6.11/36-xprt-timeouts.patch

In Solaris you can find the code here: http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/rpc/rpcmod.c

You can test for yourself whether this is implemented by sniffing an NFS-over-TCP session and hard powering down the client. (Normally the client would shut down the connection itself after 5 minutes of idle time.)

Then, after 6 minutes, you will see a FIN packet being sent from the NetApp filer.

On C-mode this doesn't occur; after 2 hours it sends keepalive messages.
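
If a packet capture is not practical, a rough approximation of this test is the Python sketch below: it opens a TCP connection, sends nothing, and times how long it takes for the peer to close it. This is only a sketch under assumptions: the function name is made up, and a bare TCP connection with no NFS traffic on it may not be treated the same way as an idle mounted session, so the sniffed session remains the authoritative check.

```python
import socket
import time


def seconds_until_peer_close(host, port, max_wait):
    """Connect, stay idle, and return the seconds until the peer closes
    the connection, or None if it is still open after max_wait seconds."""
    with socket.create_connection((host, port), timeout=10) as sock:
        start = time.monotonic()
        deadline = start + max_wait
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return None
            sock.settimeout(remaining)
            try:
                data = sock.recv(4096)
            except socket.timeout:
                return None                       # still open at the deadline
            except ConnectionResetError:
                return time.monotonic() - start   # peer sent RST
            if not data:
                return time.monotonic() - start   # peer sent FIN
            # Unexpected data from the peer: ignore it and keep waiting.


# Example call (hypothetical host name; 2049 is the standard NFS port):
# print(seconds_until_peer_close("filer.example.com", 2049, max_wait=600))
```

Going by the behaviour described above, a 7-mode filer would be expected to give a value near 360, and a C-mode filer None for any max_wait below 2 hours.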

Gr, Frank

parisi · 2017-02-22

I believe you may be looking for the following advanced privilege options:

v3-connection-drop

enable-ejukebox

This is how the options should ideally behave:

1) -enable-ejukebox true, -v3-connection-drop true
2) -enable-ejukebox true, -v3-connection-drop false

These two will exhibit the same behavior when rewinds are exhausted: the filer will return EJUKEBOX.

3) -enable-ejukebox false, -v3-connection-drop true

With this set of options, the filer will drop the response and close the connection. Clients will reconnect after a few seconds and retry the request.

4) -enable-ejukebox false, -v3-connection-drop false

With this set of options, the filer will drop the response, but will not drop the connection. This will cause the clients to time out on the RPC request and retransmit. Sometimes this will take a few minutes, depending on the client.

Option #3 or #4 sounds like what you might be looking for, but you could always test out each combination to see what you prefer.
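
As a quick summary of the four combinations, here is a small Python sketch of the decision logic as described in this post. It is not ONTAP code; the function and parameter names are made up for illustration, and it reads the fourth combination as both options disabled, which is what its description implies.

```python
def nfsv3_behaviour_when_rewinds_exhausted(ejukebox_enabled, connection_drop):
    """Return what the filer does with the request, per the description
    in this post (hypothetical helper, not a NetApp API)."""
    if ejukebox_enabled:
        # Combinations 1 and 2: the connection-drop setting makes no difference.
        return "return EJUKEBOX"
    if connection_drop:
        # Combination 3: client reconnects after a few seconds and retries.
        return "drop response and close connection"
    # Combination 4: client waits for its RPC timeout, then retransmits.
    return "drop response, keep connection open"


for ejukebox in (True, False):
    for drop in (True, False):
        print(ejukebox, drop, "->",
              nfsv3_behaviour_when_rewinds_exhausted(ejukebox, drop))
```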