ONTAP Discussions
Hi,
During the RC stage of ONTAP 9.12.1 it was mentioned that it finally supported NFS 4.1 multipathing. However, now that 9.12.1 has been released, I can't find NFS 4.1 multipathing support in the release notes.
Am I missing something?
Please file doc feedback. It's here in 9.12: https://docs.netapp.com/us-en/ontap-cli-9121//vserver-nfs-modify.html#parameters. Not sure why it's not in the release notes.
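For anyone landing here later, enabling it looks roughly like this on the CLI, based on the -v4.1-trunking parameter documented on that page (a sketch; vs1 is a placeholder SVM name, so check the man page for your release):

    # Enable NFSv4.1 and session trunking on the SVM
    cluster1::> vserver nfs modify -vserver vs1 -v4.1 enabled -v4.1-trunking enabled

    # Confirm the setting
    cluster1::> vserver nfs show -vserver vs1 -fields v4.1-trunking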
@FrankWest it is OFFICIALLY in the ONTAP 9.14.1 release notes: https://library.netapp.com/ecm/ecm_download_file/ECMLP2492508
Also, here is a KB article that explains it: https://kb.netapp.com/onprem/ontap/da/NAS/Does_ONTAP_support_NFSv4.1_Session_Trunking
Ok, thanks for the update!
👍
Now we need NFS Session Trunking across controllers... ONTAP X?
So the feature GA'd in 9.14.1 RC1 is only for session trunking on one controller?
Then it is useless for production...
@ADVUNIBN1, yes, AFAIK. There is no sharing of nblade across controllers in ONTAP.
I can't comment on roadmap, but usually when we release features we do it in staggered phases. I would expect this to be the same. If you want to have this prioritized (which I would logically expect it to be), please reach out to your account representative from NetApp Sales.
Hi Paul and others reading,
From my perspective, NFS Session Trunking across controllers would require a much larger architectural change within ONTAP, and such a large investment from NetApp, that talking to an account representative would be futile unless it came with a $200M PO dependent on this feature.
That being said, maybe nblade sharing across controllers is in the roadmap.
@FrankWest Depending on your use case you may very well still want to keep iSCSI MPIO, as you can have it configured with a full mesh across the NetApp controllers. I am sure there will be a limitation with NFS multipathing where it can only multipath across ports within the same controller (single node); reference: https://whyistheinternetbroken.wordpress.com/2022/11/11/behind-the-scenes-episode-347-netapp-ontap-9-12-1-overview/
So if you are looking at NFS multipathing for fast NFS failover between controllers, I am not sure this solution will meet your requirements.
Maybe @ChanceBingen can shed some more light on the operation.
In the NFS v4.1 protocol, we are implementing session trunking as defined in RFC 8881. The RFC also defines client ID trunking, which neither we nor VMware are supporting at this time. Here, we have the concept of a tuple for each connection consisting of the server scope, the server major owner, and the server minor owner.
In order to deliver the fastest performance possible, we define the node as the server major owner, which tells the session trunking client which connections can be trunked together to form a trunk group.
By keeping the session node scoped, we minimize the performance overhead of replicating some of the stateful nature of the NFS v4.1 protocol across the cluster.
In my personal lab testing, I can say that it is extremely fast in IO handling, although I haven't tried to put a stopwatch on failover times.
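To make the trunk group idea concrete, here is a sketch from the Linux client side (assuming a kernel whose NFS client supports the max_connect mount option; the addresses 192.168.1.11 and 192.168.1.12 are made-up LIFs assumed to live on the same node):

    # The first mount establishes the NFSv4.1 session; max_connect permits extra transports
    mount -t nfs -o vers=4.1,max_connect=16 192.168.1.11:/vol1 /mnt/vol1

    # Mounting the same export through a second LIF on the same node does not create a
    # new session: the client sees the same server major owner and adds the connection
    # to the existing trunk group
    mount -t nfs -o vers=4.1,max_connect=16 192.168.1.12:/vol1 /mnt/vol1

    # Both TCP connections to port 2049 should now carry IO
    ss -tn 'dport = :2049'

Because the major owner is node scoped, a LIF homed on the partner controller reports a different major owner, so the client will not trunk it into the same group.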
1. If session trunking is only on a single node, although through multiple LIFs, will sessions or datastores all stay alive and continue to work when they are failed over to the other node?
2. In that case, is it worth implementing in an NFS & VMware environment?
1) Yes, sessions and datastores will still be alive and continue to work when LIFs are failed over to the alternate controller. The issue is that you do not get path resiliency like iSCSI when controllers fail over; that would require NFS Session Trunking across controllers.
2) Yes, it is worth implementing, because if you have a full-mesh network scenario where the controllers are connected to an HA ToR switch cluster (e.g. vPC, SMLT, MC-LAG, MLAG), your network team can upgrade switches without impacting the NFS storage. So half of your problems are resolved.
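For reference, the per-node plumbing is just extra data LIFs homed on the same node, something like this (a sketch; the LIF name, node, port, and address are hypothetical, and https://docs.netapp.com/us-en/ontap/nfs-trunking/ has the full procedure):

    cluster1::> network interface create -vserver vs1 -lif vs1_data2 -service-policy default-data-files -home-node node1 -home-port e0e -address 192.168.1.12 -netmask 255.255.255.0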
1. Datastores or sessions will continue to work, but without resiliency. Could you please explain in more detail?
2. I don't quite understand. It sounds like only the networking team can benefit from implementing it, and not so much the storage side?
1) If a LIF migrates to another controller there is an IO pause, which is not the case for iSCSI MPIO (block LIFs don't move anyway). While I have not tested NFS Session Trunking on NetApp, if there is a port failure or link-down event on a controller with NFS Session Trunking, you should not see an IO pause, because the other trunked LIFs on that controller keep serving IO.
2) Well, it benefits the users of the NFS storage: a network change will not impact access to the storage, which means users are not affected, and that makes for a happy storage admin.
Excuse my persistence.
1. When we perform ONTAP upgrades, we don't experience any NFS pause. We are using NFSv3, though. When you say there is an IO pause, are you referring to NFSv3 or v4?
2. As for switch redundancy, it won't have an impact on the storage; we have not been impacted by any switch maintenance.
So, I don't see much real benefit in implementing it.
No problem, it is best to ask questions rather than assume.
1) I am referring to NFSv4.1, as NFS Session Trunking in ONTAP is an NFSv4.1 feature.
2) You are best to read the NetApp docs about NFS 4.1 Session Trunking: https://docs.netapp.com/us-en/ontap/nfs-trunking/. Resiliency is one aspect, but performance is another.
3) What about nconnect? Read the docs in the link above.
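For what it's worth, the client-side difference is easy to see in the mount options (a sketch; the address and export are made up). nconnect opens multiple TCP connections to a single LIF, while trunking spreads connections across LIFs:

    # nconnect: four TCP connections to one LIF/IP; no extra LIFs needed, but all
    # traffic still lands on the ports behind that single address
    mount -t nfs -o vers=4.1,nconnect=4 192.168.1.11:/vol1 /mnt/vol1

Trunking (the max_connect example earlier in the thread) instead adds whole additional LIFs/ports to the same session, which is where both the resiliency and the extra bandwidth come from.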
Yes, I am sure session trunking is great for resiliency and performance.
But I am just not sure it is worth all the effort to implement for NFS VMware datastores when it works on a single node only, not across all nodes. How much performance improvement can we get from a single node?
We have a large NFS VMware environment, consisting of >2,000 VMs and 8 storage nodes. To utilize session trunking on a single node, we would need to add multiple additional LIFs on every node, since we currently have only one LIF per node per SVM. There is also the uncertainty of how it is going to work out in a production environment.