Network and Storage Protocols

server NFS not responding, still trying

zizibagnon

Hello,

Please help me. I use clustered ONTAP (cDOT), and for the past few days I have been getting many error messages on my Red Hat clients.

Have you faced this issue? If so, how did you fix it?

 

[Sat Apr 9 22:24:28 2022] nfs: server NAS1 not responding, still trying
[Sat Apr 9 22:24:30 2022] nfs: server NAS1 not responding, still trying
[Sat Apr 9 22:24:35 2022] nfs: server NAS1 not responding, still trying
[Sat Apr 9 22:24:42 2022] nfs: server NAS1 not responding, still trying
[Sat Apr 9 22:24:46 2022] nfs: server NAS1 not responding, still trying
[Sat Apr 9 22:24:55 2022] nfs: server NAS1 OK
[Sat Apr 9 22:24:55 2022] nfs: server NAS1 OK
[Sat Apr 9 22:24:55 2022] nfs: server NAS1 OK
[Sat Apr 9 22:24:55 2022] nfs: server NAS1 OK
[Sat Apr 9 22:24:55 2022] nfs: server NAS1 OK


DarrenJ

Which version of Red Hat are you running? This KB might apply to you.

 

https://kb.netapp.com/Advice_and_Troubleshooting/Data_Storage_Software/ONTAP_OS/Intermittently_receiving_%22NFS_not_responding%22_errors_on_the_unix_c...

 

Does the event log contain any NFS-related messages, or anything else from around the time you observed the issue?

Run the command below from the cluster shell and share the output:

>> event log show -message-name *nfs*

NetApp_SR

There are some best practices in the guide below that may help your system to run better.

 

NFS in NetApp ONTAP

https://www.netapp.com/pdf.html?item=/media/10720-tr-4067.pdf

zizibagnon

Thank you.

But do you think creating multiple LIFs for the SVM would solve the problem?

zizibagnon

Hello DarrenJ,

I use RHEL 7.3.

How should I interpret these values? (see below)

ALLCA::*> systemshell -node * sysctl sysvar.nblade | grep -i cid
(system node systemshell)

Node: ALLNA1A
sysvar.nblade.debug.core.cid_in_use: 226
sysvar.nblade.debug.core.cid_max: 115911
sysvar.nblade.debug.core.cid_reserved: 10526
sysvar.nblade.debug.core.cid_allocs: 33452
sysvar.nblade.debug.core.total_execs_blocked_on_per_cid_limit: 6858371
sysvar.nblade.ngprocess.rewind.PerCIDRewindContextCount: 11578

Node: ALLNA1B
sysvar.nblade.debug.core.cid_in_use: 230
sysvar.nblade.debug.core.cid_max: 115911
sysvar.nblade.debug.core.cid_reserved: 10526
sysvar.nblade.debug.core.cid_allocs: 101805
sysvar.nblade.debug.core.total_execs_blocked_on_per_cid_limit: 2844707
sysvar.nblade.ngprocess.rewind.PerCIDRewindContextCount: 11578
2 entries were acted on.

 

 

ALLCA::> event log show -message-name *nfs*
There are no entries matching your query.

DarrenJ

If you have a support contract I would suggest opening a case for this. 

 

What prompted you to check those values? Looks like those are allocations within nblade, and you aren't hitting any limits from what I can see. 
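To make the limit check concrete, here is a minimal Python sketch that parses the sysctl output posted above and reports cid_in_use as a fraction of cid_max for each node. The function names are hypothetical, and it assumes cid_in_use and cid_max are directly comparable. It does not try to interpret total_execs_blocked_on_per_cid_limit, since nothing in this thread gives a baseline for that counter.

```python
# Sketch only: parse_counters and cid_utilization are hypothetical helper names.
SAMPLE = """\
Node: ALLNA1A
sysvar.nblade.debug.core.cid_in_use: 226
sysvar.nblade.debug.core.cid_max: 115911
sysvar.nblade.debug.core.total_execs_blocked_on_per_cid_limit: 6858371

Node: ALLNA1B
sysvar.nblade.debug.core.cid_in_use: 230
sysvar.nblade.debug.core.cid_max: 115911
sysvar.nblade.debug.core.total_execs_blocked_on_per_cid_limit: 2844707
2 entries were acted on.
"""


def parse_counters(text):
    """Group 'sysvar.<...>.<name>: <int>' lines under the preceding 'Node:' line."""
    nodes = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Node:"):
            current = line.split(":", 1)[1].strip()
            nodes[current] = {}
        elif current and line.startswith("sysvar.") and ":" in line:
            key, value = line.rsplit(":", 1)
            # Keep only the last dotted component, e.g. "cid_in_use".
            nodes[current][key.rsplit(".", 1)[1]] = int(value)
    return nodes


def cid_utilization(counters):
    """Fraction of cid_max currently in use (assumes both counters are present)."""
    return counters["cid_in_use"] / counters["cid_max"]


if __name__ == "__main__":
    for node, counters in parse_counters(SAMPLE).items():
        print(f"{node}: {cid_utilization(counters):.4%} of cid_max in use")
```

On the posted numbers, both nodes come out at roughly 0.2% of cid_max.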

 

Do you have any more context on the issue itself? Multiple users? Multiple shares? Intermittent or constant? Does it self resolve on its own? Any detectable pattern? What NFS version? 
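If it helps with the pattern questions, the client log can be summarized by pairing the first "not responding" line with the next "OK" line for the same server. A minimal Python sketch, assuming the timestamp format shown in the original post; the function names are hypothetical, and the sample lines are copied from that post.

```python
import re
from datetime import datetime

# Matches the two kernel messages seen in the original post.
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+nfs: server (?P<server>\S+) "
    r"(?P<state>not responding, still trying|OK)\s*$"
)

SAMPLE = """\
[Sat Apr 9 22:24:28 2022] nfs: server NAS1 not responding, still trying
[Sat Apr 9 22:24:30 2022] nfs: server NAS1 not responding, still trying
[Sat Apr 9 22:24:46 2022] nfs: server NAS1 not responding, still trying
[Sat Apr 9 22:24:55 2022] nfs: server NAS1 OK
[Sat Apr 9 22:24:55 2022] nfs: server NAS1 OK
"""


def parse_ts(text):
    """Parse a timestamp such as 'Sat Apr 9 22:24:28 2022'."""
    return datetime.strptime(" ".join(text.split()), "%a %b %d %H:%M:%S %Y")


def stall_windows(lines):
    """Return (server, start, end, seconds) for each not-responding window."""
    open_since = {}
    windows = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts = parse_ts(m.group("ts"))
        server = m.group("server")
        if m.group("state") == "OK":
            # Only the first OK after a stall closes the window; repeats are ignored.
            start = open_since.pop(server, None)
            if start is not None:
                windows.append((server, start, ts, (ts - start).total_seconds()))
        else:
            # Remember only the first "not responding" of a window.
            open_since.setdefault(server, ts)
    return windows


if __name__ == "__main__":
    for server, start, end, seconds in stall_windows(SAMPLE.splitlines()):
        print(f"{server}: stalled from {start} to {end} ({seconds:.0f} s)")
```

For the excerpt in the original post this gives a single window of 27 seconds for NAS1. Running it over a longer log would show how often the stalls occur and whether they cluster at particular times.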

 

Also, I linked the wrong KB in my last response. I meant to link this one, which might affect you. See the Red Hat KB as well.

 

https://kb.netapp.com/Advice_and_Troubleshooting/Data_Storage_Software/ONTAP_OS/NFS_Server_server-name_not_responding%2C_still_trying_in_RHEL_7.6

https://access.redhat.com/solutions/3765711

Adding LIFs will likely not help, since you're only mounting through a single LIF/IP.
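One way to confirm which server address each client actually mounts is to read the addr= option from /proc/mounts on the Red Hat side. A minimal Python sketch, assuming the usual Linux /proc/mounts layout (device, mount point, type, options); nfs_mounts is a hypothetical helper name, and the sample line (export path, mount point, IP address) is made up for illustration.

```python
# Hypothetical sample in /proc/mounts layout; the values are not from this thread.
SAMPLE = (
    "rootfs / rootfs rw 0 0\n"
    "NAS1:/vol_data /mnt/data nfs "
    "rw,relatime,vers=3,proto=tcp,mountaddr=192.0.2.10,addr=192.0.2.10 0 0\n"
)


def nfs_mounts(text):
    """Return (source, mountpoint, fstype, server_addr) for each NFS mount."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        source, mountpoint, fstype, options = fields[:4]
        addr = None
        for opt in options.split(","):
            # Match "addr=" exactly, not "mountaddr=" or "clientaddr=".
            if opt.startswith("addr="):
                addr = opt[len("addr="):]
        mounts.append((source, mountpoint, fstype, addr))
    return mounts


if __name__ == "__main__":
    # On a real client, pass open("/proc/mounts").read() instead of SAMPLE.
    for source, mountpoint, fstype, addr in nfs_mounts(SAMPLE):
        print(f"{mountpoint}: {source} ({fstype}) via {addr}")
```

If every client reports the same address, all traffic is going through that one LIF.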

zizibagnon

Hello,

The problem resolves itself after about 1 minute.
I am using NFSv3.
We have about twenty client servers, and they all use the same single LIF.
Look at this value: we have blocked sessions, hence the importance of creating additional LIFs.
sysvar.nblade.debug.core.total_execs_blocked_on_per_cid_limit: 6858371
