Tech ONTAP Blogs

Introducing NVMe/TCP Support in Google Cloud NetApp Volumes

sajith
NetApp

Overview

 

We're excited to announce that Google Cloud NetApp Volumes now supports NVMe/TCP through the new Flex Unified service level on ONTAP-mode storage pools. This marks a significant milestone for customers running high-performance block storage workloads on Google Cloud, combining the operational simplicity of a fully managed NetApp service with the low-latency, highly parallel I/O capabilities of NVMe over TCP.

With Flex Unified, you can now serve NFS, SMB, iSCSI, and NVMe/TCP from a single pool, share block volumes across multiple hosts without performance degradation, and tap directly into the full ONTAP control plane via Google Cloud APIs and the gcloud CLI — authenticated with your Google Cloud IAM identity, with no separate management endpoints or credentials to manage.

 

📘 New to ONTAP-mode? We recommend starting with Introducing ONTAP-mode for NetApp Volumes on the NetApp Community. It covers the underlying architecture and explains why ONTAP-mode is the foundation that makes the capabilities described here possible.

 

Why NVMe/TCP

 

1. Built for Parallelism (Multi-core Optimized)

Modern workloads — distributed databases, large-scale virtualization, AI/ML pipelines — are no longer bottlenecked by raw bandwidth. They're bottlenecked by I/O parallelism and tail latency.

iSCSI was designed in the SCSI era. It carries one command queue per session. A single LUN, no matter how powerful the underlying storage, becomes a serialization point for every thread on every core trying to do I/O.

NVMe/TCP was built for the multi-core era: many parallel I/O queues, each operating independently without a global lock.

With Google Cloud NetApp Volumes, I/O queue count and queue depth are configurable at the subsystem and host level, with defaults tuned by ONTAP based on the backend system. ONTAP additionally supports a per-host QoS priority — regular or high — that determines how many queue slots and how much depth each host is granted. A latency-sensitive database can be assigned higher priority and given deeper, more parallel queues than other hosts sharing the same subsystem (see NVMe Host QoS on the NetApp Community for details).

The deeper the workload's concurrency, the larger the gap between NVMe/TCP and iSCSI becomes. For a database VM running hundreds of concurrent worker threads, this isn't a tuning improvement — it's a protocol fundamentally suited to the workload, with controls that let you guarantee resources to the hosts that need them most.
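Once a host is connected, you can check which queue settings it actually negotiated. The sketch below reads them from sysfs. It assumes the kernel exposes queue_count and sqsize under /sys/class/nvme/<controller>/, which may vary by kernel version. The function name nvme_queue_summary is illustrative, not part of nvme-cli.

```shell
# Print the queue settings the Linux NVMe driver negotiated for one controller.
# Assumption: queue_count and sqsize exist under /sys/class/nvme/<ctrl>/.
# The optional second argument overrides the sysfs root (useful for testing).
nvme_queue_summary() {
  local ctrl root dir
  ctrl="$1"
  root="${2:-/sys/class/nvme}"
  dir="$root/$ctrl"
  if [ ! -r "$dir/queue_count" ] || [ ! -r "$dir/sqsize" ]; then
    echo "queue attributes not found for $ctrl under $root" >&2
    return 1
  fi
  # Values are printed exactly as the kernel reports them. On many kernels
  # queue_count includes the admin queue and sqsize is zero-based.
  printf '%s queue_count=%s sqsize=%s\n' \
    "$ctrl" "$(cat "$dir/queue_count")" "$(cat "$dir/sqsize")"
}
```

For example, `nvme_queue_summary nvme0` prints one line for the controller nvme0. Comparing that line across hosts registered with regular and high priority is one way to see the effect of the host QoS setting.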

 

2. Runs on Standard Ethernet (No Specialized Network Required)

For decades, enterprise SAN meant dedicated infrastructure: Fibre Channel HBAs, FC switches, FCoE fabrics, or RDMA-capable NICs on lossless Ethernet. Specialized hardware, specialized skills, specialized cost.

NVMe/TCP runs on the network you already have.

  • No FC, no FCoE, no RoCE, no InfiniBand
  • No dedicated storage NICs or specialized switches
  • Standard Google Cloud VPC, standard subnets, standard firewall rules
  • Standard observability — VPC Flow Logs, Network Intelligence Center, your existing monitoring

This is more than convenience. It collapses an entire tier of cost and operational complexity:

  • For greenfield cloud deployments: No "storage network" to design, build, or operate separately from the rest of your VPC.
  • For lift-and-shift migrations from on-prem SAN: Keep your storage protocol (block), keep your data services (ONTAP), drop the fabric. Cloud-native networking carries the I/O.
  • For cost-sensitive workloads that previously couldn't justify a dedicated SAN: Get SAN-class capabilities at standard VPC economics.
  • For platform and DevOps teams: No new networking skills. Whoever runs your VPC already has everything they need to run your block storage network.

The same Ethernet that carries your application traffic now carries your storage traffic — securely isolated by VPC, IAM, and firewall policy, but operationally unified.

 

3. Enterprise-Grade Shared Block Storage at Scale

This is the single biggest differentiator of NVMe/TCP on Google Cloud NetApp Volumes:

A single NVMe/TCP volume can be shared across multiple Linux hosts simultaneously — with no performance penalty for sharing.

Most cloud block storage forces a hard trade-off. Either a disk is attached to a single VM with full performance, or it can be multi-attached with restrictions — limited read-write semantics, capped throughput, reduced features, or all of the above. Sharing block storage in the cloud has historically meant giving something up.

Google Cloud NetApp Volumes delivers what enterprise SAN admins have always expected: a shared block tier where every attached host gets the full performance of the volume, every time.

That unlocks workloads that simply don't work well on single-attach block storage:

  • VMware on Google Cloud: Shared VMFS datastores across every ESXi host in the cluster, enabling vMotion, HA, and DRS the way they're designed to work.
  • Database clusters with shared storage: SQL Server Failover Cluster Instances and similar architectures that require multiple nodes to see the same block device.
  • Kubernetes and container platforms: ReadWriteMany block volumes for stateful workloads, shared scratch space, and high-performance CI/CD artifacts.
  • High-availability application failover: When the active node fails, the standby already sees the namespace: no detach, no re-attach, no remount window.
  • Multi-reader analytics: Multiple compute instances scanning the same dataset in parallel without copying it.

Shared block, full performance, full data services. That's the enterprise SAN model — delivered as a managed Google Cloud service.

Choose NVMe/TCP on Flex Unified when you need any of the following: shared block access across multiple hosts at full performance, enterprise SAN data services (snapshots, clones, replication, efficiencies) applied to block, a single platform that serves both file and block, or a clean migration path from an existing on-prem ONTAP SAN.

 

A Real Workload: Why NVMe/TCP Matters for Databases

 

Take a real example — a production OLTP database serving thousands of concurrent transactions per second on a 32-vCPU Google Cloud VM.

The I/O profile:

  • Data files: random 8 KB reads and writes, deep concurrency across many worker threads
  • Transaction log: small, sequential, latency-critical synchronous writes — every commit waits on the log write to durable storage
  • Background tasks: checkpoints, backups, index rebuilds — bursty, large-block, parallel

On iSCSI: With a single command queue per session, every thread on every core funnels its I/O through one serialization point. At low concurrency, performance is fine. As the database scales up — more sessions, more parallel queries, larger checkpoints — queue contention becomes the bottleneck. P99 commit latency degrades faster than throughput grows, because the log writes (the most latency-sensitive operations) get stuck behind queued data I/O.

 

On NVMe/TCP: The database host is granted many parallel I/O queues, with queue count and depth provisioned by ONTAP. By assigning the database host a higher QoS priority, you can guarantee it deeper queues and more parallelism than less-critical hosts sharing the same subsystem. Concurrent log writes and data-file I/O no longer serialize behind each other. The result, on the same VM and same network, is typically:

  • Higher sustained TPS at high concurrency
  • Lower commit latency, including at the P99 tail
  • More predictable performance under bursty load (checkpoints, backups)

Add NetApp Volumes data services on top, and the workload picture shifts further:

  • Consistency Group snapshots capture data and log volumes atomically — application-consistent, recoverable, taken in seconds regardless of database size.
  • FlexClone spins up a full writable copy of the production database for dev, test, QA, or reporting in seconds, with zero capacity overhead until divergence.
  • Shared NVMe/TCP volumes enable HA failover topologies and database clustering architectures that single-attach disks simply cannot support.
  • SnapMirror replicates the entire database (and its CG) to another region for DR — without involving the database engine or burning host CPU.

The protocol gives you the performance. The platform gives you everything around it.

 

How-To: Provisioning NVMe/TCP Storage to a Linux Host

 

This walkthrough shows how to provision an NVMe/TCP namespace from an ONTAP-mode storage pool and connect it to a Linux VM using the gcloud CLI.

Prerequisites

  • A Google Cloud NetApp Volumes ONTAP-mode storage pool with the Flex Unified service level. See Introducing ONTAP-mode for NetApp Volumes for setup guidance.
  • A Linux VM in the same VPC with the nvme-cli package installed.
  • The gcloud CLI installed and authenticated.

 

Step 1: Verify the ONTAP Environment

Start by validating connectivity and inspecting the ONTAP cluster behind your storage pool.

$ gcloud beta netapp storage-pools execute demo-ontapmodepool --location=us-central1-a "vserver show -fields aggregate"
output: |+
  vserver                     aggregate 
  --------------------------- --------- 
  gcnv-a617dd66a2a7176-svm-01 aggr1     
$

Take note of the SVM name (e.g., gcnv-a617dd66a2a7176-svm-01) and the aggregate name (e.g., aggr1). You'll use them in the next steps.
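If you script these steps, the two names can be pulled out of the command output instead of copied by hand. This is a minimal sketch that assumes the output keeps the layout shown above (a header row, a row of dashes, then one data row per SVM). The function name parse_svm_aggr is hypothetical.

```shell
# Extract "<svm> <aggregate>" from "vserver show -fields aggregate" output
# read on stdin.
# Assumption: header row, then a row of dashes, then the data rows.
parse_svm_aggr() {
  awk '
    seen_rule && NF >= 2 { print $1, $2; exit }   # first data row after the rule
    $1 ~ /^-+$/          { seen_rule = 1 }        # row of dashes ends the header
  '
}
```

Piping the output of the vserver show command from this step into parse_svm_aggr should print the SVM and aggregate on one line, provided gcloud writes the result to standard output in your environment.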

 

Step 2: Create a Volume to Host the NVMe Namespace

$ gcloud beta netapp storage-pools execute demo-ontapmodepool --location=us-central1-a "vol create -volume vol_nvme -size 200G -vserver gcnv-a617dd66a2a7176-svm-01 -aggregate aggr1"
output: ''

$ gcloud beta netapp storage-pools execute demo-ontapmodepool --location=us-central1-a "vol show"
output: |+
  Vserver   Volume       Aggregate    State      Type       Size  Available Used%
  --------- ------------ ------------ ---------- ---- ---------- ---------- -----
  gcnv-a617dd66a2a7176-svm-01 gcnv_a617dd66a2a7176_svm_01_root aggr1 online RW 1GB 961.7MB  1%
  gcnv-a617dd66a2a7176-svm-01 vol_nvme aggr1 online RW     200GB    190.0GB    0%
  2 entries were displayed.
$
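Every step in this walkthrough repeats the same gcloud prefix, so a small wrapper can reduce typing and copy-paste mistakes. The sketch below is one way to do that. The function name ontap_exec and the POOL, LOCATION, and DRY_RUN variables are assumptions made for illustration, with defaults taken from the example values in this post.

```shell
# Wrap the gcloud invocation used throughout this walkthrough.
# POOL and LOCATION default to the example values from this post.
# With DRY_RUN=1 (the default here) the command is printed, not executed.
POOL="${POOL:-demo-ontapmodepool}"
LOCATION="${LOCATION:-us-central1-a}"
DRY_RUN="${DRY_RUN:-1}"

ontap_exec() {
  # $1 is the whole ONTAP CLI command, passed as a single argument.
  if [ "$DRY_RUN" = "1" ]; then
    printf 'gcloud beta netapp storage-pools execute %s --location=%s "%s"\n' \
      "$POOL" "$LOCATION" "$1"
  else
    gcloud beta netapp storage-pools execute "$POOL" --location="$LOCATION" "$1"
  fi
}
```

With the defaults, `ontap_exec "vol show"` prints the full command so you can review it. Setting DRY_RUN=0 runs it for real.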

 

Step 3: Create the NVMe Namespace

The namespace is the block device that will be exposed to the Linux host.

$ gcloud beta netapp storage-pools execute demo-ontapmodepool --location=us-central1-a "nvme namespace create -path /vol/vol_nvme/ns01 -size 100G -ostype linux"
output: |+
  (vserver nvme namespace create)

  Created a namespace of size 100GB (107374182400).

$ gcloud beta netapp storage-pools execute demo-ontapmodepool --location=us-central1-a "nvme namespace show"
output: |+
  (vserver nvme namespace show)
  Vserver Path                             State      Size Subsystem       NSID
  ------- -------------------------------- ------- ------- ---------- ---------
  gcnv-a617dd66a2a7176-svm-01
          /vol/vol_nvme/ns01               online    100GB -                  -
$ 
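The byte count in the create message shows that the size suffix is binary: 100G is 100 × 1024³ bytes. The helper below is a small sketch of that conversion, useful when checking that a namespace fits inside its volume. It assumes whole numbers with a K, M, G, or T suffix only, and the name size_to_bytes is illustrative.

```shell
# Convert a size such as 100G or 200G into bytes using binary units
# (1G = 1024^3), matching the 107374182400 bytes reported above.
# Assumption: whole numbers with a K, M, G, or T suffix only.
size_to_bytes() {
  local num unit
  num="${1%[KMGTkmgt]}"      # strip one trailing unit letter, if present
  unit="${1#"$num"}"         # whatever was stripped is the unit
  case "$num" in
    ''|*[!0-9]*) echo "unsupported size: $1" >&2; return 1 ;;
  esac
  case "$unit" in
    K|k) echo $(( $num * 1024 )) ;;
    M|m) echo $(( $num * 1024 * 1024 )) ;;
    G|g) echo $(( $num * 1024 * 1024 * 1024 )) ;;
    T|t) echo $(( $num * 1024 * 1024 * 1024 * 1024 )) ;;
    *)   echo "unsupported size: $1" >&2; return 1 ;;
  esac
}
```

The same figure can be cross-checked later from the host: in the Step 9 output, MaximumLBA (26214400) multiplied by SectorSize (4096) gives the same 107374182400 bytes.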

 

Step 4: Create the NVMe Subsystem

The subsystem groups namespaces and controls host access — analogous to an iSCSI igroup.

$ gcloud beta netapp storage-pools execute demo-ontapmodepool --location=us-central1-a "nvme subsystem create -subsystem nvme_ss1 -ostype linux \
  -vserver gcnv-a617dd66a2a7176-svm-01"
output: |+
  (vserver nvme subsystem create)


$ gcloud beta netapp storage-pools execute demo-ontapmodepool --location=us-central1-a "nvme subsystem show -instance"
output: |+
  (vserver nvme subsystem show)

                 Vserver Name: gcnv-a617dd66a2a7176-svm-01
                    Subsystem: nvme_ss1
                      OS Type: linux
                      Comment: 
                   Target NQN: nqn.1992-08.com.netapp:sn.6a35a4025ce911f1b9eebb9b97d72a16:subsystem.nvme_ss1
                Serial Number: lW800J/BQvnWAAAAAAAB
                         UUID: 1410d6ac-631d-11f1-b9ee-bb9b97d72a16
            Peer Vserver Name: -
            Replication Error: -
  Replication Error Subsystem: -

$ 
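The Target NQN is needed again when connecting from the Linux host, so it is worth capturing rather than retyping. A minimal sketch, assuming the field label is printed as "Target NQN:" exactly as above (parse_target_nqn is a hypothetical helper name):

```shell
# Pull the Target NQN out of "nvme subsystem show -instance" output
# read on stdin.
# Assumption: the label is "Target NQN:" and the value has no spaces.
parse_target_nqn() {
  sed -n 's/^[[:space:]]*Target NQN:[[:space:]]*\([^[:space:]]*\).*$/\1/p' | head -n 1
}
```

sed is used here rather than splitting on colons because the NQN itself contains colons.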

 

Step 5: Register the Linux Host's NQN

First, retrieve the Linux host's NQN:

# cat /etc/nvme/hostnqn 
nqn.2014-08.org.nvmexpress:uuid:82fdf0dd-7a4f-4dce-8d59-8b2182a7b1e7
# 
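A truncated or mistyped NQN registers a host that can never connect, so a quick format check before registering it can save a debugging round. The sketch below only accepts the UUID-based form shown above. Other valid NQN forms exist and would be rejected by it, and the function name is_uuid_hostnqn is an assumption.

```shell
# Check that a host NQN uses the UUID-based form shown above:
#   nqn.2014-08.org.nvmexpress:uuid:<8-4-4-4-12 hex digits>
# Assumption: your hosts use this form; other valid NQNs are rejected.
is_uuid_hostnqn() {
  printf '%s\n' "$1" | grep -Eq \
    '^nqn\.2014-08\.org\.nvmexpress:uuid:[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}
```

For example, `is_uuid_hostnqn "$(cat /etc/nvme/hostnqn)" || echo "unexpected NQN format"` flags a value that does not match.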

Then add it to the subsystem:

$ gcloud beta netapp storage-pools execute demo-ontapmodepool --location=us-central1-a "nvme subsystem host add -subsystem nvme_ss1 -host-nqn nqn.2014-08.org.nvmexpress:uuid:82fdf0dd-7a4f-4dce-8d59-8b2182a7b1e7 -priority high"
output: |+
  (vserver nvme subsystem host add)

$ gcloud beta netapp storage-pools execute demo-ontapmodepool --location=us-central1-a "nvme subsystem host show -instance"
output: |+
  (vserver nvme subsystem host show)


                         Vserver Name: gcnv-a617dd66a2a7176-svm-01
                            Subsystem: nvme_ss1
                             Host NQN: nqn.2014-08.org.nvmexpress:uuid:82fdf0dd-7a4f-4dce-8d59-8b2182a7b1e7
                        Host Priority: high
            Number of I/O Queue Pairs: -
                      I/O Queue Depth: -
         Authentication Hash Function: -
  Authentication Diffie-Hellman Group: -
                  Authentication Mode: none
                         TLS Key Type: none
               Proximal Vserver Names: gcnv-a617dd66a2a7176-svm-01

$

 

Step 6: Map the Namespace to the Subsystem

$ gcloud beta netapp storage-pools execute demo-ontapmodepool --location=us-central1-a "nvme subsystem map add -subsystem nvme_ss1 -path /vol/vol_nvme/ns01"
output: |+
  (vserver nvme subsystem map add)

$ gcloud beta netapp storage-pools execute demo-ontapmodepool --location=us-central1-a "nvme subsystem map show -instance"
output: |+
  (vserver nvme subsystem map show)


    Vserver Name: gcnv-a617dd66a2a7176-svm-01
       Subsystem: nvme_ss1
            NSID: 00000001h
  Namespace Path: /vol/vol_nvme/ns01
  Namespace UUID: df5a7d81-31b5-4e8f-b6cc-63272aa6ae80

$

 

Step 7: Identify the NVMe/TCP Data LIFs

Get the IP addresses your Linux host will connect to:

$ gcloud beta netapp storage-pools execute demo-ontapmodepool --location=us-central1-a "net int show -data-protocol nvme"
output: |+
  (network interface show)
              Logical    Status     Network            Current       Current Is
  Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home
  ----------- ---------- ---------- ------------------ ------------- ------- ----
  gcnv-a617dd66a2a7176-svm-01
              gcnv-a617dd66a2a7176-svm-01-san-1 up/up 10.165.128.235/32 gcnv-a617dd66a2a7176-01 e0e true
              gcnv-a617dd66a2a7176-svm-01-san-2 up/up 10.165.128.232/32 gcnv-a617dd66a2a7176-02 e0e true
  2 entries were displayed.

$ 

In this example, the NVMe/TCP data LIFs are 10.165.128.232 and 10.165.128.235.
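To feed these addresses into later steps, they can be extracted from the command output. This sketch assumes each LIF row contains the address in address/mask form, as in the output above. The helper name parse_lif_ips is illustrative.

```shell
# Print one data LIF IP address per line from "net int show" output
# read on stdin.
# Assumption: addresses appear as a.b.c.d/mask in their own column.
parse_lif_ips() {
  awk '{
    for (i = 1; i <= NF; i++)
      if ($i ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\/[0-9]+$/) {
        sub(/\/.*/, "", $i)   # drop the /mask suffix
        print $i
      }
  }'
}
```

Addresses are printed in the order they appear in the output, so for the example above 10.165.128.235 comes first.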

 

Step 8: Connect from the Linux Host

On your Linux VM, make sure nvme-cli is installed and the NVMe/TCP kernel module is loaded:

sudo modprobe nvme-tcp

Run discovery against one of the data LIFs. The discovery log lists the subsystem on both LIFs, which gives you the two paths used for multipath redundancy:

# nvme discover -t tcp -a 10.165.128.232 -s 4420
Discovery Log Number of Records 4, Generation counter 4
=====Discovery Log Entry 0======
trtype:  tcp
adrfam:  ipv4
subtype: current discovery subsystem
treq:    not specified
portid:  2
trsvcid: 8009
subnqn:  nqn.1992-08.com.netapp:sn.6a35a4025ce911f1b9eebb9b97d72a16:discovery
traddr:  10.165.128.232
eflags:  explicit discovery connections, duplicate discovery information
sectype: none
=====Discovery Log Entry 1======
trtype:  tcp
adrfam:  ipv4
subtype: current discovery subsystem
treq:    not specified
portid:  1
trsvcid: 8009
subnqn:  nqn.1992-08.com.netapp:sn.6a35a4025ce911f1b9eebb9b97d72a16:discovery
traddr:  10.165.128.235
eflags:  explicit discovery connections, duplicate discovery information
sectype: none
=====Discovery Log Entry 2======
trtype:  tcp
adrfam:  ipv4
subtype: nvme subsystem
treq:    not specified
portid:  2
trsvcid: 4420
subnqn:  nqn.1992-08.com.netapp:sn.6a35a4025ce911f1b9eebb9b97d72a16:subsystem.nvme_ss1
traddr:  10.165.128.232
eflags:  none
sectype: none
=====Discovery Log Entry 3======
trtype:  tcp
adrfam:  ipv4
subtype: nvme subsystem
treq:    not specified
portid:  1
trsvcid: 4420
subnqn:  nqn.1992-08.com.netapp:sn.6a35a4025ce911f1b9eebb9b97d72a16:subsystem.nvme_ss1
traddr:  10.165.128.235
eflags:  none
sectype: none
# 

Connect to the subsystem through each data LIF:

# nvme connect -t tcp -a 10.165.128.232 \
 -n nqn.1992-08.com.netapp:sn.6a35a4025ce911f1b9eebb9b97d72a16:subsystem.nvme_ss1
connecting to device: nvme0

# nvme connect -t tcp -a 10.165.128.235  -n nqn.1992-08.com.netapp:sn.6a35a4025ce911f1b9eebb9b97d72a16:subsystem.nvme_ss1
connecting to device: nvme1
#
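With more than a couple of LIFs or hosts, the connect commands are easier to generate than to type. The sketch below prints one nvme connect command per LIF. The function name connect_each_lif and the DRY_RUN variable are assumptions for illustration, and port 4420 is taken from the trsvcid values in the discovery log above.

```shell
# Print (or run) one "nvme connect" per data LIF so every path is set up.
# Usage: connect_each_lif <subsystem-nqn> <lif-ip> [<lif-ip> ...]
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

connect_each_lif() {
  local subnqn lif
  subnqn="$1"
  shift
  for lif in "$@"; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "nvme connect -t tcp -a $lif -s 4420 -n $subnqn"
    else
      # Stop at the first failure so a missing path is noticed early.
      nvme connect -t tcp -a "$lif" -s 4420 -n "$subnqn" || return 1
    fi
  done
}
```

Review the printed commands first, then rerun as root with DRY_RUN=0 to establish the connections.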

 

Step 9: Verify and Use the Device

List connected NVMe devices:

# nvme list -o json
{
  "Devices":[
    {
      "NameSpace":1,
      "DevicePath":"/dev/nvme0n1",
      "GenericPath":"/dev/ng0n1",
      "Firmware":"9.18.1",
      "ModelNumber":"NetApp ONTAP Controller",
      "SerialNumber":"lW800J/BQvnWAAAAAAAB",
      "UsedBytes":0,
      "MaximumLBA":26214400,
      "PhysicalSize":107374182400,
      "SectorSize":4096
    }
  ]
}

# nvme list-subsys /dev/nvme0n1
nvme-subsys0 - NQN=nqn.1992-08.com.netapp:sn.6a35a4025ce911f1b9eebb9b97d72a16:subsystem.nvme_ss1
               hostnqn=nqn.2014-08.org.nvmexpress:uuid:82fdf0dd-7a4f-4dce-8d59-8b2182a7b1e7
               iopolicy=queue-depth
\
 +- nvme0 tcp traddr=10.165.128.232,trsvcid=4420,src_addr=10.70.10.33 live non-optimized
 +- nvme1 tcp traddr=10.165.128.235,trsvcid=4420,src_addr=10.70.10.33 live optimized
# nvme netapp ontapdevices  -o json
{
  "ONTAPdevices":[
    {
      "Device":"/dev/nvme0n1",
      "Vserver":"gcnv-a617dd66a2a7176-svm-01",
      "Subsystem":"nvme_ss1",
      "Namespace_Path":"/vol/vol_nvme/ns01",
      "NSID":1,
      "UUID":"df5a7d81-31b5-4e8f-b6cc-63272aa6ae80",
      "LBA_Size":4096,
      "Namespace_Size":107374182400,
      "UsedBytes":0,
      "Version":"9.18.1"
    }
  ]
}
# 
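Before putting data on the device, it helps to confirm that both paths are live. The sketch below counts path states in the nvme list-subsys output. It assumes each path line starts with "+-" and ends with the connection state and ANA state, as in the output above. The helper name count_paths is hypothetical.

```shell
# Summarize path health from "nvme list-subsys" output read on stdin.
# Assumption: path lines start with "+-" and carry the words
# "live" and "optimized" / "non-optimized" as separate fields.
count_paths() {
  awk '
    $1 == "+-" {
      for (i = 2; i <= NF; i++) {
        if ($i == "live")      live++
        if ($i == "optimized") opt++    # exact match, so non-optimized is skipped
      }
    }
    END { printf "live=%d optimized=%d\n", live, opt }
  '
}
```

For the two paths shown above, `nvme list-subsys /dev/nvme0n1 | count_paths` would report two live paths, one of them optimized. Fewer live paths than data LIFs suggests a connect step was missed.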

You should now see a new block device (e.g., /dev/nvme0n1). Format and mount it like any block device:

sudo mkfs.xfs /dev/nvme0n1
sudo mkdir -p /mnt/nvme01
sudo mount /dev/nvme0n1 /mnt/nvme01

 

Why This Matters

 

With NVMe/TCP on Google Cloud NetApp Volumes, you get the performance characteristics of dedicated NVMe SAN arrays delivered as a fully managed Google Cloud service, with the operational consistency of ONTAP that thousands of enterprises already rely on. Whether you're migrating SAN workloads from on-premises or building cloud-native high-performance applications, the Flex Unified service level offers the flexibility to mix and match protocols without rearchitecting your storage.

 

Further Reading & Resources

 

 

Get Started Today

 

  • Explore the Google Cloud NetApp Volumes documentation for setup guides and best practices.
  • Provision your first ONTAP-mode storage pool with the Flex Unified service level in the Google Cloud Console.
  • Reach out to your NetApp or Google Cloud account team to discuss workload fit and sizing.

We can't wait to see what you build with NVMe/TCP on Google Cloud NetApp Volumes!
