ONTAP Discussions

Namestring collision occurs between NFS clients.

Tzach
3,760 Views

Hello, 

 

A NetApp system running ONTAP 9.0 has an SVM exporting an NFS share (v3, v4, and v4.1 enabled).

In an automation CI job we mount the NFS share on a VM running on top of a physical server.

Both the VM and the server are RHEL based, and for the most part the automation and the share mounting work fine.

 

However, if we run multiple automation jobs simultaneously,

only one job manages to mount the share on its VM; the other job(s) fail to mount the NFS share.

 

The root cause is that the VMs all use the same hostname (controller-2) while possibly running simultaneously on different servers/IPs,

so the NetApp reports this understandable error:

Node: node-a
Time: Tue, Jun 22 15:54:49 2021 +0300
Severity: ALERT
Message: Nblade.NewClientIdMismatch: NFSv4 name string "Linux NFSv4.1 controller-2" collision between clients 10.46.w.y and 10.35.x.z.
Description: This message occurs when a namestring collision occurs between NFS clients.
 
The VM on host 10.46.w.y fails to mount the NFS share with this error:
mount.nfs: timeout set for Thu Oct  7 05:15:52 2021
mount.nfs: trying text-based options 'vers=4.2,addr=10.46.w.y,clientaddr=172.16.0.45'
mount.nfs: mount(2): Protocol not supported
mount.nfs: trying text-based options 'vers=4,minorversion=1,addr=10.46.w.y,clientaddr=172.16.0.45'
mount.nfs: mount(2): Operation not permitted
mount.nfs: trying text-based options 'addr=10.46.w.y'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 10.46.w.y prog 100003 vers 3 prot TCP port 2049
mount.nfs: prog 100005, trying vers=3, prot=17
mount.nfs: trying 10.46.w.y prog 100005 vers 3 prot UDP port 635
mount.nfs: mount(2): Permission denied
mount.nfs: Operation not permitted
 
  1. If I unmount the NFS share on the VM running on host 10.35.x.z and then re-mount the same share on the VM running on 10.46.w.y, it mounts fine this time, while the first VM now fails to mount with the same error.
  2. I can't change the VM names; I'm bound to have duplicate names.
  3. Automation jobs run randomly in parallel; I can't schedule them to run serially.
  4. I have to use NFSv4 or above.
  5. Each VM uses random file names on the share, so they'll never overwrite each other's data.
 
Is there anything I can do to bypass this duplicate name collision?
The only solution I have found so far is to add multiple IP addresses to the SVM/NFS,
so that each job's mount uses a unique IP from the pool of addresses assigned to the SVM/NFS.
With that, the jobs are able to mount the share simultaneously from two (or more) VMs running on different hosts.
 
The only drawback of this solution is that it complicates my automation CI job configuration,
as I have to assign and manage a unique IP for each job.
 
My question is simple: is there any NetApp setting/config I could change so that it ignores the client collision,
so that I could revert my automation jobs to a single IP address?
 
Thanks 
 
 
 
1 ACCEPTED SOLUTION

SeanLuce
3,677 Views

Check out the following documentation: https://docs.microsoft.com/en-us/azure/azure-netapp-files/configure-nfs-clients#configure-two-vms-with-the-same-hostname-to-access-nfsv41-volumes

 

It is Azure NetApp Files documentation, but the same solution applies: a unique client-side identifier needs to be set on each VM.
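
For readers who want a starting point, the client-side identifier in question is the `nfs4_unique_id` parameter of the Linux `nfs` kernel module, which the client can include in the identifier it presents to the server. Below is a minimal, untested sketch of persisting it through a modprobe.d options file. The function name `write_nfs4_unique_id`, the file name `nfsclient.conf`, and the fallback to `uuidgen` are assumptions for illustration; check the linked documentation for the exact steps on your distribution.

```shell
#!/usr/bin/env bash
# Sketch only: persist a unique NFSv4 client identifier for this VM.
# Function name and file layout are illustrative assumptions.

# write_nfs4_unique_id CONF_FILE [ID]
# Writes an "options nfs nfs4_unique_id=<ID>" line to CONF_FILE.
# If ID is omitted, a random UUID from uuidgen is used instead.
write_nfs4_unique_id() {
    local conf_file="$1"
    local id="${2:-$(uuidgen)}"
    printf 'options nfs nfs4_unique_id=%s\n' "$id" > "$conf_file"
}

# Example (as root; the nfs module picks up the option the next time it loads):
#   write_nfs4_unique_id /etc/modprobe.d/nfsclient.conf
```

Since the option is read when the module loads, it would have to be written before the first NFS mount on the VM, or be followed by a module reload or reboot.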


3 REPLIES

Tzach
3,622 Views

You rock SeanLuce!

  

I haven't tested it yet, but it sounds like another possible solution.

However, I suspect that assigning a unique IP to each automation job/share mount may be easier to implement than automating the generation of an nfs4_unique_id per VM.

 

Anyway thank you very much for the tip!
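
On the automation concern above, generating an ID per VM may be less work than it sounds. The following is a rough, untested sketch that builds a random ID and writes it to the module parameter at runtime. The function names, the `controller-2` prefix, and the choice of the kernel's `/proc/sys/kernel/random/uuid` as the random source are assumptions for illustration. A random value is used rather than the hostname because the VMs in this thread deliberately share a name.

```shell
#!/usr/bin/env bash
# Sketch only: generate a per-VM NFSv4 unique ID and apply it at runtime.
# Function names are illustrative assumptions.

# make_unique_id PREFIX
# Prints "<PREFIX>-<random uuid>", reading the UUID from the Linux kernel.
make_unique_id() {
    local prefix="$1"
    printf '%s-%s\n' "$prefix" "$(cat /proc/sys/kernel/random/uuid)"
}

# apply_unique_id ID [PARAM_FILE]
# Writes ID to the nfs module parameter file (root required for the default
# path, and the nfs module must already be loaded for that path to exist).
apply_unique_id() {
    local id="$1"
    local param_file="${2:-/sys/module/nfs/parameters/nfs4_unique_id}"
    printf '%s\n' "$id" > "$param_file"
}

# Example, run in the CI job before the first NFSv4 mount on the VM:
#   apply_unique_id "$(make_unique_id controller-2)"
```

As far as I understand, the value has to be in place before the VM's first NFSv4 mount to take effect for that mount, so the call belongs early in the job.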

parisi
3,602 Views

This is also covered in this TR:

https://www.netapp.com/pdf.html?item=/media/10720-tr-4067.pdf page 124
