Tech ONTAP Blogs

SnapMirror between ONTAP and Google Cloud NetApp Volumes

okrause
NetApp

NetApp® SnapMirror® is the technology of choice when it comes to replicating volumes between NetApp ONTAP® based storage systems. And now SnapMirror can be used to replicate volumes between ONTAP and Google Cloud NetApp Volumes.

 

Seasoned ONTAP admins deploy SnapMirror primarily for two use cases:

 

  • Migration. Moving your volume data between systems is hassle-free. All the data and metadata are transferred. Don’t worry about permissions and access control lists (ACLs). It just works. Don’t worry about missing files and retries. It just works. And don’t worry about the copy performance of millions of small files. SnapMirror has you covered.
  • Storage replication as a foundation for a disaster recovery (DR) setup. Spinning up new VMs, containers, and apps in a remote data center is easy and quick if you’re prepared—and as long as these components don’t need data. Moving large amounts of data between your data centers can be time-consuming. And to enable DR in a remote location, you must constantly ship data changes to that location so that the data is quickly available to your applications if disaster strikes. With SnapMirror, your data is quickly, efficiently, and securely replicated to your DR site.

 

For the migration use case, Google Cloud NetApp Volumes already offers the volume migration feature, which uses SnapMirror underneath. NetApp has now added bidirectional SnapMirror support, calling the feature “external replication.” In this blog post, I discuss this feature in detail.

 

Feature genealogy

Our modern-technology world can be confusing. I use simple mental models to keep my thoughts sorted. If a model is good, I can derive the properties and behaviors of a system from it, without having to remember all the details. NetApp Volumes uses SnapMirror for a few different features that are somehow related, but that solve different use cases. Let me share my NetApp Volumes mental SnapMirror model.

 

[Image: Venn diagram showing the overlap of volume replication, external replication, and volume migration]

 

As my Venn diagram shows, the following three features are related:

 

  • Volume replication, also known as cross-region replication (CRR). You can replicate a source volume from one Google Cloud region to a destination volume in a different Google Cloud region on a 10-minute, hourly, or daily schedule. This feature supports stopping and resuming replication, and the replication direction can be reversed, meaning that the source and destination volumes change roles. The important thing to remember is that the source and destination volumes are Google Cloud NetApp Volumes.
  • External replication, sometimes referred to as “bidirectional SnapMirror support.” This feature is basically the same as CRR, but the initial source volume is an ONTAP-based volume that lives outside of the NetApp Volumes service. After the replication has been established, it offers the same operations (stop, resume, reverse) as CRR.
  • Volume migration, which is basically an external replication in which you can’t reverse the direction. However, it offers a synchronization feature to enable quick cutovers with no data loss and with minimal downtime. The other two features are intended for continuous data replication for DR use cases, but volume migration is built to support fast, simple, and reliable one-time volume migrations from ONTAP to NetApp Volumes.
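To make the CRR case concrete, here is a hedged sketch of creating a replication with the gcloud CLI. All resource names (my-project, us-west1, us-east4, src-vol, dr-pool, dr-vol, dr-replication) are hypothetical placeholders, and you should confirm the current flag set with `gcloud netapp volumes replications create --help`.

```shell
# Sketch only: every resource name below is a hypothetical placeholder.
# The destination pool lives in a different region than the source volume.
gcloud netapp volumes replications create dr-replication \
  --project=my-project \
  --location=us-west1 \
  --volume=src-vol \
  --replication-schedule=EVERY_10_MINUTES \
  --destination-volume-parameters=storage-pool=projects/my-project/locations/us-east4/storagePools/dr-pool,volume-id=dr-vol,share-name=dr-vol
```

The `--replication-schedule` value corresponds to the 10-minute, hourly, or daily schedules described above.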

 

All three features overlap considerably in their APIs and in gcloud CLI usage. The UI workflows are optimized to support an individual use case while reusing as much common functionality as possible. And the underlying replication technology is always SnapMirror.
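That shared surface is visible in the CLI: the same `gcloud netapp volumes replications` command group drives all three features. A sketch, with hypothetical names:

```shell
# Inspect all replications attached to a volume (names are hypothetical).
gcloud netapp volumes replications list \
  --location=us-east4 --volume=dr-vol

# Show the state and transfer details of one replication.
gcloud netapp volumes replications describe dr-replication \
  --location=us-east4 --volume=dr-vol
```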

 

External replication

Now that we have established a mental model of the commonalities of the SnapMirror-backed features of NetApp Volumes, let’s dive into the details of external replication. (Remember, it’s replication between an ONTAP-based volume and NetApp Volumes.) The lifecycle of every replication goes through multiple phases, which I explain next.

 

Authentication

As in the volume migration workflow, you must first establish a connection between external ONTAP volumes and NetApp Volumes.


In this phase, administrators of the source ONTAP system need to grant NetApp Volumes permission to fetch volumes from a storage VM (SVM) by setting up cluster and SVM peering.
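On the ONTAP side, this peering is the standard SnapMirror prerequisite. A hedged sketch of what the source administrator might run in the ONTAP CLI; cluster names, SVM names, and IP addresses are all hypothetical, and the actual peer addresses come from the NetApp Volumes setup workflow:

```shell
# On the source ONTAP cluster; all names and addresses are hypothetical.
# 1. Peer the clusters, using the intercluster LIF addresses provided
#    by the NetApp Volumes peering workflow.
cluster peer create -address-family ipv4 -peer-addrs 10.10.1.5,10.10.1.6

# 2. Peer the source SVM with the NetApp Volumes SVM for SnapMirror use.
vserver peer create -vserver src-svm -peer-vserver gcnv-svm \
  -peer-cluster gcnv-cluster -applications snapmirror
```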

 

Baseline transfer

A baseline transfer creates a NetApp SnapMirror Snapshot™ copy on your source system and replicates all the used data, including all prior Snapshot copies, to the destination volume. Depending on the network speed between the source and the destination and the amount of data, this process can take hours or days. In the meantime, your source volume stays online and can be used to read and write data.
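To get a feel for the timescale, here is a back-of-the-envelope estimate with numbers of my own choosing (not from the service): 10 TiB of used data over a link with an effective throughput of 1 Gbit/s, ignoring compression and protocol overhead.

```shell
# Rough estimate only; assumes a sustained effective 1 Gbit/s link.
data_bytes=$((10 * 1024 ** 4))   # 10 TiB of used data
rate=125000000                   # 1 Gbit/s ≈ 125 MB/s
seconds=$((data_bytes / rate))
hours=$((seconds / 3600))
echo "Baseline transfer: roughly ${hours} hours"
```

At that rate the baseline alone takes about a day, which is why it matters that the source volume stays writable during the transfer.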

 

After the baseline transfer is finished, the destination volume becomes accessible as read-only, containing the data of the SnapMirror Snapshot copy.

 

Incremental transfers

While the baseline transfer is in progress, a lot of time may pass, and a lot of data may be modified on your source system. But don’t worry: SnapMirror supports incremental transfers. Based on your specified replication schedule, a new SnapMirror Snapshot copy is created on the source system, and the changes between the new and the previous Snapshot copy are calculated. Then only the changed data is transferred during an incremental transfer. Because only the data that changed since the previous Snapshot copy needs to move, an incremental transfer is typically much faster than the baseline transfer.
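Continuing the same back-of-the-envelope math: if only 200 GiB changed between two Snapshot copies, the incremental transfer over the same assumed 1 Gbit/s link drops from hours to minutes.

```shell
# Rough estimate only; same assumed 1 Gbit/s effective throughput.
changed_bytes=$((200 * 1024 ** 3))  # 200 GiB changed since the last Snapshot
rate=125000000                      # 1 Gbit/s ≈ 125 MB/s
minutes=$((changed_bytes / rate / 60))
echo "Incremental transfer: roughly ${minutes} minutes"
```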

 

After an incremental transfer is complete, the read-only destination volume reflects the data of the last replication Snapshot copy. The replication process then sits idle until the next scheduled replication event triggers the next incremental transfer.

 

This process goes on until the replication is stopped by the operator.

 

Switchover

As mentioned, the destination volume reflects the data of the last successful source Snapshot transfer and makes it available as read-only for clients. All is well.

 

But let’s say that disaster strikes. Your source site is no longer available, and there’s a high likelihood that it won’t be back in the next few minutes or hours. Production is down, and your company is at risk of going out of business. This is the day that you have been preparing for. Now you can put your carefully crafted DR plans into action.

 

Your data is already sitting in the destination region, waiting to be used. The first action is to stop the replication at the destination site. This step makes your destination volume read/write and ready to use. You can now start your VMs, containers, and applications on the destination side, using the destination volume as the source of truth.
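In gcloud terms, that first action might look like the sketch below. Names are hypothetical, and the command runs against the destination volume, which is still reachable even though the source site is down; check `gcloud netapp volumes replications stop --help` for the exact flags.

```shell
# Sketch with hypothetical names; run against the destination side.
gcloud netapp volumes replications stop dr-replication \
  --location=us-east4 \
  --volume=dr-vol \
  --force   # assumed flag to break the mirror when the source is unreachable
```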

 

Depending on how well you prepared your deployment procedures for your workloads, this process can take minutes or hours.

 

Also note that external replication is asynchronous. The destination always lags behind the source volume, so any data that was written to the source but hasn’t been replicated to the destination yet is lost.

 

If you can plan for your disaster, as in a DR test to confirm that your procedures work, there’s a better approach. First stop all workloads on your source volume, trigger a manual synchronization of the replication, then stop the replication and start your workloads on the destination volume. With this approach, you know that the latest data is on the destination volume.

 

You are now running on the destination volume, and your source volume is dormant.

 

Recovery

How you clean up after a disaster depends on what kind of disaster it was.

 

If you just stopped the replication to see whether it works, but production on the source side continued, you can simply resume the replication. The destination volume discards all the changes that you made to it, and the source volume starts incremental transfers again.

 

If your production moved to the destination side, the data on the destination volume is now more current, and the source volume is outdated. You can reverse the replication direction to make the destination the new source and vice versa. The replication then runs in the opposite direction.
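Both recovery paths map to single operations in the gcloud sketch below (hypothetical names again): a resume for the DR-test case, which discards the destination changes, and a direction reversal for the real-failover case.

```shell
# Hypothetical names; run against the destination-side replication.
# DR test finished, production stayed on the source:
gcloud netapp volumes replications resume dr-replication \
  --location=us-east4 --volume=dr-vol

# Production really moved: make the destination the new source.
gcloud netapp volumes replications reverse-direction dr-replication \
  --location=us-east4 --volume=dr-vol
```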

 

If you want to reestablish the original direction, do another switchover after you make sure that all your latest data was replicated.

 

If your former source is now a big crater where a structure used to be, it will not come back. To protect your valuable data, you may want to establish a new replication to a NetApp Volumes volume in a different region. For that, you need to delete the old replication and create a new one with your production volume as the source. If your production volume is on ONTAP (but, oh no, the Google Cloud data center is now a big crater), use external replication to replicate to a different Google Cloud region. If your production volume is on NetApp Volumes (let’s hope nobody was injured when your data center transformed into a crater), use volume replication (CRR) to replicate to a different Google Cloud region.

 

Feature availability

[Edit 2025-08-28]: External replication is now available in allow-listed GA. Find out how this feature can enhance your DR strategy to help you stay competitive. To learn more, read the documentation. To test it, contact our Google Cloud specialists.

 
