Tech ONTAP Blogs

How to Configure NVMe/TCP with vSphere 8.0 Update 1 and ONTAP 9.13.1 for VMFS Datastores


Now that ONTAP 9.13.1RC1 and vSphere 8.0 Update 1 have been out for a month or two, I felt it was time to do a deep dive on configuring NVMe-oF (Non-Volatile Memory Express over Fabrics) for VMware vSphere datastores. NVMe-oF is a generic term used to describe NVMe connected across a network. As you may already be aware, NVMe is the next-generation data protocol designed from the ground up for all-flash storage, and NetApp is the current chair of the NVMe Fabric and Multi-Domain Subsystem Task Group.


In this blog post, I want to specifically focus on NVMe/TCP (NVMe-oF over TCP/IP networks), since there isn’t a lot of content out there about it yet. And I’ll also be focusing on VMFS. NVMe-oF and vVols will be covered in a separate post.


In many ways, using NVMe/TCP with vSphere is very similar to using iSCSI. For example, you enable software-based initiators, just like with iSCSI, and you configure your array connections in the software initiator, much as you do with iSCSI. Likewise, the same best practices around VMkernel port allocation, IP subnetting, VLAN configuration, and network setup apply identically to iSCSI and NVMe/TCP. In fact, if you take any network diagram or solution guide based on using vSphere with iSCSI, you can use the exact same references for NVMe/TCP.


However, once you dig a little deeper, there are some significant differences. Not just in the fact that NVMe-based protocols and SCSI-based protocols are different (NVMe-based protocols are significantly more efficient), but also in the configuration steps. Hence this blog post.


What’s new

As you may already know, ONTAP 9.12.1 added support for secure authentication over NVMe/TCP as well as increased NVMe limits (viewable in the NetApp Hardware Universe [HWU]). ONTAP 9.13.1 is the first non-patch release for FAS and AFF systems to support VMware’s maximum VMFS datastore size of 64TB on SAN protocols (both SCSI and NVMe based), as well as maximum-size vVols (NFS, SCSI, and NVMe) and RDMs (SCSI only). Of course, you can also connect even larger LUNs and namespaces to in-guest initiators, subject to whatever limits those operating systems impose: up to 128TB for LUNs and namespaces, and up to 300TB for the underlying FlexVol volumes. NetApp’s ASA systems have supported these larger limits for some time now.


For customers using software defined ONTAP, starting in ONTAP 9.13.1, ONTAP Select now officially supports NVMe/TCP as a data protocol. Cloud Volumes ONTAP already supports NVMe/TCP in the cloud. Amazon FSx for NetApp ONTAP users who are interested in NVMe/TCP should watch for the upcoming release of ONTAP 9.13.1.


Having said all that, what’s truly special with this combination is that with vSphere 8.0 update 1, VMware has completed their journey to a completely native end-to-end NVMe storage stack. Prior to 8.0U1, there was a SCSI translation layer which added some complexity to the stack and slightly decreased some of the efficiencies inherent in the NVMe protocol. Also new in update 1 are improvements in scalability limits that make it much more feasible to use in large environments. You can read more about these and other improvements in this blog post by Jason Massae @jbmassae.


Configuration steps

The following can be used as a step-by-step guide to configure VMFS datastores over NVMe/TCP. I will, however, try to avoid duplication where possible. An overview of the process in the official ONTAP docs can be found here. You can also refer to the official NetApp NVMe-oF Host Configuration for ESXi 8.x with ONTAP guide here.


At a high level, the process looks like this:

  1. Create or configure an SVM.
  2. Create or configure LIFs.
  3. Enable VMkernel port(s) for NVMe/TCP.
  4. Add NVMe/TCP software adapter(s).
  5. Add an NVMe subsystem.
  6. Add NVMe controller(s).
  7. Create NVMe namespace(s).
  8. Create datastore(s).
  9. Modify any multipathing settings if desired.

As I said, I will try to avoid duplication where possible. That means, first, if you haven’t already done so, you need to create an SVM (storage virtual machine) configured for NVMe/TCP. Instructions for doing so can be found here.
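If you prefer the CLI over System Manager, the equivalent looks roughly like this. The SVM and aggregate names below are placeholders, not values from this walkthrough:

```shell
# Create an SVM (names and aggregate are examples; adjust to your environment)
vserver create -vserver svm_nvme -rootvolume svm_nvme_root \
    -aggregate aggr1 -rootvolume-security-style unix

# Start the NVMe service on the SVM
vserver nvme create -vserver svm_nvme
```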


Likewise, that also means if you haven’t already done so, you will need to create LIFs (logical interfaces) for use with NVMe/TCP. Simply log into System Manager, go to Network -> Overview, and click Add as shown below. Additional CLI instructions can be found here.




Then leave Data selected and choose NVMe/TCP as the protocol. Specify the SVM you created previously and fill in the rest of the networking details. Repeat this step as needed, but you should have at least two LIFs per node on every HA pair where namespaces will exist.
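From the ONTAP CLI, the same LIFs can be created with something along these lines (node names, home ports, and addresses are examples):

```shell
# One NVMe/TCP data LIF per node; repeat for additional ports/subnets as needed
network interface create -vserver svm_nvme -lif nvme_tcp_01a \
    -service-policy default-data-nvme-tcp -home-node cluster1-01 \
    -home-port e0d -address 192.168.100.11 -netmask 255.255.255.0

network interface create -vserver svm_nvme -lif nvme_tcp_02a \
    -service-policy default-data-nvme-tcp -home-node cluster1-02 \
    -home-port e0d -address 192.168.100.12 -netmask 255.255.255.0
```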




Now the fun begins. You can refer to VMware’s documentation here for further details of what we are about to do.

First, you must enable NVMe/TCP on the desired VMkernel adapters. Go to your host’s Configure tab, select the adapter listed under Networking\VMkernel adapters, and click the Edit action. You may also create a new VMkernel adapter instead.
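If you’d rather script this step, ESXi exposes it through esxcli (vmk1 is an example; substitute your VMkernel adapter):

```shell
# Tag the VMkernel adapter for NVMe/TCP traffic
esxcli network ip interface tag add -i vmk1 -t NVMeTCP

# Confirm the tag was applied
esxcli network ip interface tag get -i vmk1
```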






NOTE: Be aware of which uplinks are available to the selected VMkernel adapter’s port group, as you will need to reference them again in the next step.




Next, you will enable the NVMe/TCP storage adapter. While still on the Configure tab, click Storage\Storage Adapters.

Click ADD SOFTWARE ADAPTER and then click on Add NVMe over TCP adapter as shown below.




Select a vmnic that is available to the previously edited VMkernel adapter.





Repeat as necessary.
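The same adapters can be created from the ESXi CLI (vmnic1 is an example uplink):

```shell
# Create a software NVMe/TCP adapter bound to a specific uplink
esxcli nvme fabrics enable --protocol TCP --device vmnic1

# Verify the new vmhba appears
esxcli nvme adapter list
```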


Next, we’ll add the NVMe controller for the array in vSphere and the corresponding subsystem in ONTAP. To do that, we’ll first select the adapter in the upper right pane, then click the Controllers tab and the ADD CONTROLLER button.





Click the copy button next to the NQN (NVMe Qualified Name).
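If you’re working from the CLI instead, the host NQN is also available via esxcli:

```shell
# The output includes the Host NQN needed for the ONTAP subsystem
esxcli nvme info get
```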




Switch back over to NetApp System Manager, expand HOSTS, and select NVMe Subsystems.






Click the add button, provide a friendly but unique name for the subsystem, change the OS to VMware, and paste in the NQN from the host.
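The CLI equivalent looks something like this. The subsystem name and host NQN below are placeholders; paste in the actual NQN you copied from the host:

```shell
# Create the subsystem with the VMware OS type
vserver nvme subsystem create -vserver svm_nvme -subsystem esx01 -ostype vmware

# Add the host NQN copied from vSphere (placeholder value shown)
vserver nvme subsystem host add -vserver svm_nvme -subsystem esx01 \
    -host-nqn nqn.2014-08.com.example:nvme:esx01
```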





Back in the controller menu, add the IP address of one of the NVMe/TCP LIFs you created, enter port 8009, and click Discover.

Once discovery is complete, check the box for each discovered LIF you want to use over this adapter, keeping in mind which vmnic this adapter is bound to. Then click OK.
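Discovery and connection can also be scripted with esxcli. The adapter name, LIF address, and subsystem NQN below are placeholders; discovery uses port 8009, while data connections use the standard NVMe/TCP port 4420:

```shell
# Discover subsystems available through one of the NVMe/TCP LIFs
esxcli nvme fabrics discover -a vmhba65 -i 192.168.100.11 -p 8009

# Connect to the subsystem NQN returned by discovery
esxcli nvme fabrics connect -a vmhba65 -i 192.168.100.11 -p 4420 \
    -s nqn.1992-08.com.netapp:sn.example:subsystem.esx01
```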




Repeat as necessary if you have additional software adapters.

Now let’s create an NVMe namespace. Back in System Manager, expand Storage and select NVMe Namespaces, then click Add.




After that, you simply supply the information for one or more namespaces you want to create, and System Manager will create them for you. Note that you must change the OS to VMware and select the subsystem we created, as shown below. Click Save when done.
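Or, from the ONTAP CLI (the volume path and size are examples):

```shell
# Create a namespace with the VMware OS type
vserver nvme namespace create -vserver svm_nvme -path /vol/nvme_vol01/ns01 \
    -size 4T -ostype vmware

# Map it to the subsystem so the host can see it
vserver nvme subsystem map add -vserver svm_nvme -subsystem esx01 \
    -path /vol/nvme_vol01/ns01
```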





You will notice that the device shows up under the adapter in vSphere without a rescan, as you might need to do with iSCSI.
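You can verify this from the ESXi CLI as well:

```shell
# List connected controllers and visible namespaces
esxcli nvme controller list
esxcli nvme namespace list
```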




Next, we’ll put a datastore on the device. Right-click the host and select Storage -> New Datastore.




Select VMFS and click Next.




Provide a name and select the NVMe namespace we just created and click Next.




Select VMFS6 and click Next.




Click Next again to use the entire namespace.




Review your inputs and select Finish.




Next, review the results on the Summary tab of the datastore.




Next, review the multipathing configuration and adjust as needed.




Click ACTIONS and then Edit Multipathing to make changes.




I typically recommend changing the IOPS limit to 1. If this were a SCSI LUN on a NetApp ASA system, I would also recommend setting the PSP (Path Selection Policy) to LB-Latency as shown below. Since this is NVMe and not SCSI, I won’t change the PSP.
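The same change can be made with esxcli against the HPP (the device identifier below is a placeholder; use the eui value shown by the list command):

```shell
# Find the NVMe device identifier
esxcli storage hpp device list

# Keep the default round-robin scheme but switch paths after every I/O
esxcli storage hpp device set --device=eui.placeholder --pss=LB-RR --iops=1
```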




And that’s it. You can now start deploying VMs and testing the solution!