NFS hung mounts

kodiak_f · 2021-01-08

Hi Folks,

Apologies if this isn't the right venue since this is probably more about OS admin than NetApp specifically, but we are using NetApp ONTAP for NFS NAS so I figured I'd try because the quality of response here has been great.

We've just had a minor planned network "blip" (5s-15s) that couldn't be avoided due to where the change was happening on the network core. I didn't think it would be a big deal but we still ended up having a ton of hung NFSv3 hard mounts that couldn't be resolved without rebooting the affected Linux systems.

The systems ranged from Debian 8 / CentOS 6 to Ubuntu 20.04. Mounts have been defined in fstab with varying options, from simply 'defaults' to '_netdev,nofail'. Some hosts weren't affected at all.

It's 2021 and to me it seems wild that a modern Linux OS can't gracefully recover from a 5-15s outage. Am I missing some tricks for getting more graceful recovery on these systems? Was there something I could do from the client end to get the mount to drop completely and remount?

Longer term I'm hoping to move clients over to a new SVM that enforces NFSv4.1+, but that's still a way off and I'll have to contend with NFSv3 for some time in this infra. Then again I'm not sure that changing the protocol version will help much - I'm not an NFS guru.

Thanks everyone!

jcolonfzenpr · 2021-01-09

Hi,

You can try the 'retrans' mount option.

There are also other options you can test:

https://linux.die.net/man/5/nfs
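
Before changing anything, it may help to see which timeout options each client actually ended up with, since the fstab entries differ between hosts. Below is a rough sketch in Python, not a tested tool: it parses text in the /proc/mounts format and pulls out 'timeo' (in tenths of a second) and 'retrans' for NFS mounts. The function name parse_nfs_mounts, the sample line and the filer/path names are made up for illustration.

```python
NFS_TYPES = {"nfs", "nfs4"}


def parse_nfs_mounts(mounts_text):
    """Return one dict per NFS entry found in /proc/mounts-style text."""
    results = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in NFS_TYPES:
            continue
        device, mount_point, fs_type, raw_opts = fields[:4]

        # Turn "rw,timeo=600,retrans=2" into {"rw": "", "timeo": "600", ...}
        opts = {}
        for item in raw_opts.split(","):
            key, _, value = item.partition("=")
            opts[key] = value

        # Assume 'hard' when no recovery option is listed
        recovery = next(
            (k for k in ("hard", "soft", "softerr") if k in opts), "hard"
        )
        results.append({
            "device": device,
            "mount_point": mount_point,
            "fs_type": fs_type,
            "recovery": recovery,
            "timeo": int(opts["timeo"]) if "timeo" in opts else None,
            "retrans": int(opts["retrans"]) if "retrans" in opts else None,
        })
    return results


# Hypothetical sample input; on a real client, pass the contents of /proc/mounts.
sample = (
    "filer:/vol/data /mnt/data nfs "
    "rw,relatime,vers=3,hard,proto=tcp,timeo=600,retrans=2 0 0\n"
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
)

for mount in parse_nfs_mounts(sample):
    print(mount)
```

Running this on an affected host and on one that recovered, then comparing the output, might show whether the option differences line up with which hosts hung.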

good luck!

Jonathan Colón