
BlueXP disaster recovery service for BC with VMware Cloud on AWS and FSx for ONTAP

banko
NetApp

As VMware Cloud on AWS customers expand their storage requirements, or as new customers adopt VMware Cloud on AWS, Amazon FSx for NetApp ONTAP has become a popular storage service. FSx for ONTAP simplifies deployment whether you are migrating from on premises (VMware on any storage vendor) to VMware Cloud on AWS or expanding an existing VMware Cloud on AWS SDDC. Keep in mind that your SDDC version must be 1.20 or later. For a detailed step-by-step approach to provisioning the datastore, see the resources “VMware Cloud on AWS integration with Amazon FSx for NetApp ONTAP Deployment Guide” and “TR-4938: Mount Amazon FSx for ONTAP as a NFS datastore with VMware Cloud on AWS.”

 

Disaster recovery using block-level replication from on premises to the AWS Cloud, or between regions within the cloud, is a resilient and cost-effective way of protecting workloads against site outages and data corruption events such as ransomware attacks. With NetApp SnapMirror replication, VMware workloads running on on-premises ONTAP systems using NFS datastores can be replicated to FSx for ONTAP in a designated AWS target recovery region where the VMware Cloud on AWS SDDC resides. Based on an AWS blog post that covered the benefits of using SnapMirror for disaster recovery, a community-supported scripted solution, Disaster Recovery Orchestrator (DRO), was released to overcome the siloed scripts approach. Now, based on huge customer demand, we are happy to introduce the same functionality as NetApp BlueXP disaster recovery, a service added to the BlueXP control plane to simplify business continuity.

 

Using the BlueXP disaster recovery service, which is integrated into the NetApp BlueXP console, customers can discover their on-premises VMware vCenter and AWS SDDC vCenter along with FSx for ONTAP, create resource groupings, create a disaster recovery plan, associate it with resource groups, and test or execute failover and failback. SnapMirror provides storage-level block replication to keep the two sites up to date with incremental changes, resulting in an RPO of up to 5 minutes. It is also possible to simulate DR procedures as a regular drill without impacting the production and replicated datastores or incurring additional storage costs. To do this, BlueXP disaster recovery takes advantage of ONTAP’s FlexClone technology to create a space-efficient copy of the NFS datastore from the last replicated Snapshot copy on the DR site. Once the DR test is complete, customers can simply delete the test environment, again without any impact on the actual replicated production resources. When there is a need (planned or unplanned) for an actual failover, the BlueXP disaster recovery service orchestrates, with a few clicks, all the steps needed to automatically bring up the protected virtual machines on the VMware Cloud on AWS SDDC. The service also reverses the SnapMirror relationship to the primary site and replicates any changes from secondary to primary for a failback operation, when needed.

 

banko_0-1692532273373.png

 

In this blog, we demonstrate how BlueXP disaster recovery enables administrators to easily set up sites, resource groups, disaster recovery replication plans, and simulate or perform a failover.

 

banko_1-1692374651166.png

 

Getting started

To get started with BlueXP disaster recovery, log in to the BlueXP console and then access the service.

  1. Log in to BlueXP.
  2. From the BlueXP left navigation, select Protection > Disaster recovery.
  3. The BlueXP disaster recovery Dashboard appears.

banko_2-1692374718704.png

 

Before configuring a disaster recovery plan, ensure that the following prerequisites are met:

  • The BlueXP connector is set up in NetApp BlueXP. The connector should be deployed in an AWS VPC.
  • The BlueXP connector instance has connectivity to the source and destination vCenter and storage systems.
  • The on-premises NetApp storage hosting NFS datastores for VMware is added in BlueXP.
  • Amazon FSx for NetApp ONTAP and AWS credentials are added to the BlueXP working environment.
  • DNS resolution is in place when using DNS names. Otherwise, use IP addresses for the vCenter.
  • SnapMirror replication is configured for the designated NFS-based datastore volumes.
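
The DNS and connectivity prerequisites above can be sanity-checked from the connector host before any configuration starts. The following is a minimal sketch, not part of BlueXP: it uses only the Python standard library, assumes the vCenter and storage management endpoints listen on HTTPS port 443, and the host names in the usage comment are hypothetical placeholders.

```python
import socket


def resolves(host: str) -> bool:
    """Return True if the host name (or IP literal) resolves to an address."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


def tcp_reachable(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_endpoints(endpoints: dict[str, str], port: int = 443) -> dict[str, str]:
    """Classify each labeled endpoint as 'ok', 'dns-failure', or 'unreachable'."""
    results = {}
    for label, host in endpoints.items():
        if not resolves(host):
            results[label] = "dns-failure"
        elif not tcp_reachable(host, port):
            results[label] = "unreachable"
        else:
            results[label] = "ok"
    return results


# Example usage (hypothetical host names, replace with your own):
# print(check_endpoints({
#     "source vCenter": "vcenter.onprem.example",
#     "destination vCenter": "vcenter.sddc.example",
# }))
```

A "dns-failure" result is the case in which the prerequisite list suggests falling back to IP addresses for the vCenter.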

Once connectivity is established between the source and destination sites (through AWS Direct Connect, AWS Transit Gateway, or a VPN connection), proceed with the configuration steps, which should take 3 to 5 minutes.

 

Note: NetApp recommends deploying the BlueXP connector in AWS, in the same VPC where FSx for ONTAP is deployed (a peered VPC also works), so that the BlueXP connector can communicate through the network with the on-premises components as well as with the FSx for ONTAP and VMC resources.

 

banko_3-1692374791851.png

 

Note: BlueXP disaster recovery is currently in public preview.

 

BlueXP disaster recovery configuration

The first step in preparing for disaster recovery is to discover and add the on-premises and cloud SDDC resources (both vCenter and storage) to BlueXP disaster recovery.

Open the BlueXP console and select Protection > Disaster recovery from the left navigation. Select Discover vCenter servers, or from the top menu select Sites > Add > Add vCenter.

 

banko_4-1692374851529.png

 

Add the following platforms:

  • Source. On-premises vCenter.

banko_5-1692374898310.png

 

  • Destination. VMC SDDC vCenter.

banko_6-1692374934665.png

 

In this blog post, I cover disaster recovery between an on-premises site using NFS datastores and VMware Cloud on AWS SDDCs using FSx for ONTAP.

 

What can BlueXP disaster recovery do for you

After the source and destination sites are added, BlueXP disaster recovery performs automatic deep discovery and displays the VMs along with associated metadata. BlueXP disaster recovery also automatically detects the networks and port groups used by the VMs and populates them.

 

banko_7-1692374980729.png

 

After the sites have been added, VMs can be grouped into resource groups. A BlueXP disaster recovery resource group lets you collect a set of dependent VMs into a logical group that holds their boot order and boot delays, which are applied upon recovery. To start creating resource groups, navigate to Resource Groups and click Create New Resource Group.

 

banko_8-1692375014690.png

Note: The resource group can also be created while creating a replication plan.
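
To make the idea of boot order and boot delay concrete, here is a small illustrative model of a resource group. This is a sketch only and not how BlueXP stores its data: the class names, fields, and VM names are hypothetical, and it assumes that a boot delay means "wait this long after a tier powers on before starting the next tier."

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    boot_order: int        # lower values power on first
    boot_delay_s: int = 0  # assumed: wait after this VM's tier before the next tier


@dataclass
class ResourceGroup:
    name: str
    vms: list[VM] = field(default_factory=list)

    def power_on_schedule(self) -> list[tuple[int, str]]:
        """Return (start offset in seconds, VM name) pairs in power-on order."""
        schedule = []
        offset = 0
        for order in sorted({vm.boot_order for vm in self.vms}):
            # All VMs sharing a boot order form one tier and start together.
            tier = sorted(
                (vm for vm in self.vms if vm.boot_order == order),
                key=lambda vm: vm.name,
            )
            for vm in tier:
                schedule.append((offset, vm.name))
            # The next tier waits for the longest delay in this tier.
            offset += max(vm.boot_delay_s for vm in tier)
        return schedule


group = ResourceGroup(
    "erp",
    [VM("web01", 3), VM("db01", 1, boot_delay_s=60), VM("app01", 2, boot_delay_s=30)],
)
print(group.power_on_schedule())
```

In this example the database starts first, the application server 60 seconds later, and the web server 30 seconds after that, which is the kind of dependency ordering a resource group is meant to capture.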

 

The next step is to create the execution blueprint or a plan to recover virtual machines and applications in the event of a disaster. As mentioned in the prerequisites, SnapMirror replication should be configured before creating the replication plan.

 

banko_9-1692375057881.png

 

banko_10-1692375084498.png

 

After SnapMirror is in place, configure the replication plan: select the source and destination vCenter platforms from the drop-down menus, pick the resource groups to include in the plan, define the order in which applications should be restored and powered on, and map the clusters and networks. To define the recovery plan, navigate to the Replication Plan tab and click Add Plan.

 

banko_11-1692375124892.png

 

banko_12-1692375153270.png

banko_13-1692375187477.png

 

After you create the replication plan, you can run the failover, test failover, or migrate option, depending on your requirements. For failover and test failover, you can use the most recent SnapMirror Snapshot copy, or you can select a specific point-in-time Snapshot copy (subject to the SnapMirror retention policy). The point-in-time option can be very helpful after a corruption event such as ransomware, where the most recent replicas are already compromised or encrypted. BlueXP disaster recovery shows all available recovery points. To trigger a failover or test failover with the configuration specified in the replication plan, click Failover or Test failover.
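
The choice between "latest" and "point in time" can be illustrated with a few lines of Python. This is a sketch of the selection logic only, under the assumption that you know roughly when the corruption began; the function name and Snapshot copy names are hypothetical, and BlueXP itself presents the available recovery points in the UI.

```python
from datetime import datetime
from typing import Optional


def pick_recovery_point(
    snapshots: dict[str, datetime],
    compromised_at: Optional[datetime] = None,
) -> str:
    """Pick the newest Snapshot copy, or the newest one taken before compromised_at."""
    candidates = {
        name: taken
        for name, taken in snapshots.items()
        if compromised_at is None or taken < compromised_at
    }
    if not candidates:
        raise ValueError("no Snapshot copy predates the compromise time")
    return max(candidates, key=candidates.get)


# Hypothetical Snapshot copy names and timestamps.
available = {
    "snap_0900": datetime(2023, 8, 18, 9, 0),
    "snap_0905": datetime(2023, 8, 18, 9, 5),
    "snap_0910": datetime(2023, 8, 18, 9, 10),
}
print(pick_recovery_point(available))                                 # snap_0910
print(pick_recovery_point(available, datetime(2023, 8, 18, 9, 7)))    # snap_0905
```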

 

banko_14-1692375223692.png

 

What happens during a failover or test failover operation

During a test failover operation, BlueXP disaster recovery creates a FlexClone volume on the FSx for ONTAP file system from either the latest Snapshot copy or a selected Snapshot copy of the SnapMirror destination (recovery) volume.

Note: A test failover operation creates a cloned volume on the FSx for ONTAP file system.

Note: Running a test recovery operation does not affect the SnapMirror replication.

 

banko_15-1692375279008.png

banko_16-1692375297621.png

 

When the test failover operation completes, the cleanup operation can be triggered using “Clean Up failover test”. During this operation, BlueXP disaster recovery destroys the FlexClone volume that was used in the operation.

If a real disaster occurs, BlueXP disaster recovery performs the following steps:

  1. Breaks the SnapMirror relationship between the sites
  2. Mounts the volume at its defined junction path
  3. Registers the VMs
  4. Powers on the VMs

banko_17-1692375341855.png

 

Once the primary site is up and running, BlueXP disaster recovery enables reverse resync for SnapMirror and also enables failback, which again can be performed with the click of a button.

 

banko_18-1692375373420.png

 

If the migrate option is chosen, the operation is treated as a planned failover event. In this case, an additional step is triggered: shutting down the virtual machines at the source site. The remaining steps are the same as for a failover event.
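
The differences between the three operations can be summarized as ordered step lists. The sketch below is illustrative only and does not call any BlueXP or ONTAP API. The failover steps follow the list given earlier in this post; placing the source shutdown first for migrate, and reusing the mount, register, and power-on steps after the FlexClone creation for a test failover, are assumptions.

```python
from enum import Enum


class Operation(Enum):
    TEST_FAILOVER = "test failover"
    FAILOVER = "failover"
    MIGRATE = "migrate"


# Steps shared by every operation once a usable volume exists at the DR site.
COMMON_STEPS = [
    "mount the volume at its defined junction path",
    "register the VMs",
    "power on the VMs",
]


def workflow_steps(op: Operation) -> list[str]:
    """Return the ordered steps for an operation (step ordering partly assumed)."""
    if op is Operation.TEST_FAILOVER:
        # Assumed: the clone is mounted and VMs are registered like a failover.
        return ["create a FlexClone volume from the chosen Snapshot copy"] + COMMON_STEPS
    steps = ["break the SnapMirror relationship between the sites"] + COMMON_STEPS
    if op is Operation.MIGRATE:
        # Planned failover: the source VMs are shut down (assumed to happen first).
        steps = ["shut down the VMs at the source site"] + steps
    return steps


for operation in Operation:
    print(operation.value)
    for number, step in enumerate(workflow_steps(operation), start=1):
        print(f"  {number}. {step}")
```

Note that only the test failover path leaves the SnapMirror relationship untouched, which is why a test does not interrupt replication.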

 

From BlueXP or the ONTAP CLI, you can monitor the replication health status of the relevant datastore volumes. The status of a failover or test failover can be tracked through Job Monitoring.
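
If you export the time of the last successful transfer for each datastore volume (for example, from the replication status view), a simple lag check against the RPO target looks like the following. This is a sketch with made-up data: the volume names and timestamps are hypothetical, the 5-minute target is taken from the RPO figure mentioned earlier, and no BlueXP or ONTAP API is called.

```python
from datetime import datetime, timedelta

RPO_TARGET = timedelta(minutes=5)


def rpo_violations(
    last_transfer: dict[str, datetime],
    now: datetime,
    rpo: timedelta = RPO_TARGET,
) -> list[str]:
    """Return the volumes whose last successful transfer is older than the RPO."""
    return sorted(
        volume for volume, finished in last_transfer.items() if now - finished > rpo
    )


# Hypothetical volume names and transfer times.
now = datetime(2023, 8, 18, 12, 0)
status = {
    "nfs_ds01": datetime(2023, 8, 18, 11, 57),
    "nfs_ds02": datetime(2023, 8, 18, 11, 48),
}
print(rpo_violations(status, now))  # ['nfs_ds02']
```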

 

banko_19-1692375419823.png

 

To summarize, disaster recovery to cloud is a resilient and cost-effective way of protecting workloads against site outages and data corruption events (for example, ransomware). With NetApp SnapMirror technology, on-premises VMware workloads can be replicated to FSx for ONTAP running in AWS.

 

The benefits of this solution include the following:

 

  • Efficient and resilient SnapMirror replication, with recovery to any available point in time through Snapshot copy retention.
  • Full automation of all the steps required to recover hundreds to thousands of VMs across storage, compute, network, and workload recovery, using ONTAP FlexClone technology in a way that doesn’t alter the replicated volume.
  • No replication interruptions during disaster recovery test workflows.
  • Optimized VM resources that help lower compute requirements by allowing recovery to smaller compute clusters.

If you are using FSx for ONTAP with VMC SDDC or planning to migrate to VMC SDDC using FSx for ONTAP, BlueXP disaster recovery is here to help. Try it now for free, and follow the detailed step-by-step simulated guidance for configuration at BlueXP Disaster Recovery simulator.

 

 

 

 

 

 

 
