Microsoft Azure Site Recovery SAN Replication with SMI-S 5.2 Deep Dive

This week at TechEd Europe, NetApp unveiled support for a new feature that Microsoft announced today: Microsoft Azure Site Recovery SAN Replication. Microsoft Azure Site Recovery orchestrates protection of virtual machines running on Hyper-V host servers located in System Center Virtual Machine Manager (VMM) clouds. Using Azure Site Recovery with Hyper-V (VMware is also supported in some scenarios), customers can:

  • Replicate virtual machines from an on-premises site to Azure and fail over to Azure
  • Fail over from one on-premises site to another using host-based Hyper-V replication
  • Use SAN-based VM replication, which performs on-premises to on-premises failover by leveraging SMI-S and SAN-offloaded replication

This blog post focuses on the newly announced SAN-based VM replication.

 

Virtual machines hosted in SCVMM clouds can be protected with Azure Site Recovery (ASR), which orchestrates failover and automates replication across sites.

 

NetApp SnapMirror technology combined with Azure Site Recovery enables you to offload replication to the SAN devices that host the Virtual Machine Storage components. This method offers enhanced capabilities to protect your mission-critical applications hosted on Hyper-V Infrastructure.

NetApp SnapMirror

ASR SAN Replication leverages NetApp’s new SMI-S 5.2 agent in conjunction with NetApp SnapMirror to perform failover and replication of virtual machines from one on-premises site to another.

 

SAN replication supports guest clusters and ensures replication consistency across the different tiers of an application with synchronized replication. It also allows you to replicate guest-clustered virtual machines that use iSCSI or Fibre Channel storage, or shared virtual hard disks (sVHDx).

 

NetApp SnapMirror-based replication supports guest clusters and ensures replication consistency across the different tiers of an application by using asynchronous SnapMirror replication.

 

What follows are the technology details required to deploy Azure Site Recovery with NetApp Storage. It includes information about setting up your VMM infrastructure with SMI-S 5.2, and instructions for configuring settings in the Azure Site Recovery console.

 

Deployment prerequisites:  On-premises to On-premises SAN

Before you set up on-premises to on-premises protection using Azure Site Recovery with SAN replication, make sure the following prerequisites are met.

 

Make sure you have an Azure account that is enabled for the Azure Site Recovery feature. This account provides the cloud-orchestrated DR service, even though the failover outlined here covers only on-premises to on-premises DR and not failover from on-premises to Azure.

 

A Hyper-V host server cluster must be deployed in both the source and target sites, running at least Windows Server 2012 with the latest updates. Each cluster should be managed by the SCVMM 2012 R2 SAN DR preview build obtained from Microsoft.

 

At least one private cloud should be configured on the primary VMM server (the cloud to be protected) and one private cloud on the secondary VMM server for recovery.

For more details on deployment prerequisites, refer to the MSDN article Deploy Azure Site Recovery: On-Premises to On-Premises Protection.

 

Before you set up SAN arrays in SCVMM 2012 R2 in preparation for Azure Site Recovery protection, note the following:

  • Verify that virtual machines on the source site are connected to a VM network. This source VM network should be linked to a logical network that is associated with the cloud.
  • Ensure that the target NetApp storage array has one or more aggregates with sufficient free space available to use in this deployment.
  • The SMI-S Provider, provided by NetApp, should be installed and configured as per the steps outlined in the Installation and Configuration Guide, and the SAN arrays should be managed by the provider.
  • The VMM server at the primary site should be managing the primary NetApp storage array, and the secondary VMM server should be managing the secondary NetApp storage array.

Configuring Data ONTAP SMI-S Agent 5.2 on SCVMM 2012 R2

Once the prerequisites above are taken care of, configure the SMI-S provider to integrate with SCVMM and create storage classifications.

Configuration

Next, create LUNs on the primary storage using the SCVMM console and host them as cluster shared volumes on the primary site Hyper-V cluster.

Cluster Shared Volumes

Once the LUNs are created, create a replication group that includes all the LUNs that need to replicate together. You can do this in the Replication Groups tab of the storage array properties.

Replication groups

Next, log in to the Azure Management Portal, then create and configure an Azure Site Recovery vault. You’ll create the vault and generate a registration key, which is used to authenticate the VMM server with the vault and to verify that the Provider running on the VMM server responds only to Azure Site Recovery.

Recovery Services

For more details on creating a vault, refer to the MSDN article on how to Create an Azure Site Recovery vault.

 

After you’ve configured the Azure Site Recovery vault, install the Azure Site Recovery Provider on the Virtual Machine Manager (VMM) servers and register the servers in the vault.

Installation

For more details on Azure Site Recovery installation, refer to the MSDN article on how to Install the Azure Site Recovery Provider.

 

Next, create and configure the storage array mapping, which specifies the secondary storage pool that receives replication data from the primary pool.

Vault

For more details on configuring storage array mapping, refer to the MSDN article on Configure storage mapping.

 

Next, configure the cloud protection settings that will be applied to all virtual machines in the cloud.

 

Azure Site Recovery verifies that the clouds have access to SAN replication capable storage and that the NetApp storage arrays are peered. If verification succeeds, you’ll be able to select SAN as the replication type. Participating array peers are displayed.
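
The check described above can be pictured as a simple precondition test. The sketch below is an illustration only, not ASR code: the `Cloud` class, its fields, the array names, and the `san_replication_available` helper are all hypothetical, and array peering is modeled as a set of unordered array pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cloud:
    """Hypothetical model of a VMM cloud and the storage array behind it."""
    name: str
    array: str
    san_replication_capable: bool


def san_replication_available(primary: Cloud, secondary: Cloud, peered: set) -> bool:
    """SAN can be offered only if both clouds are capable and their arrays are peered."""
    both_capable = primary.san_replication_capable and secondary.san_replication_capable
    arrays_peered = frozenset((primary.array, secondary.array)) in peered
    return both_capable and arrays_peered


if __name__ == "__main__":
    # Assumed example data: one peered pair of arrays.
    peered_arrays = {frozenset(("array-primary", "array-secondary"))}
    prod = Cloud("PrimaryCloud", "array-primary", True)
    dr = Cloud("RecoveryCloud", "array-secondary", True)
    print(san_replication_available(prod, dr, peered_arrays))
```

Modeling each peering as a `frozenset` makes the check independent of which array is listed first.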

Step 1

..(Continued)

Step 2

Once you configure cloud protection, the array status appears as “Paired”.

 

After you configure cloud protection settings, you can map VM networks on the source VMM server to VM networks on the target servers, so that network settings fail over with the virtual machines. If you don’t configure network mapping, virtual machines won’t be connected to VM networks after failover. For more details on configuring network mapping, refer to the MSDN article on Configure network mapping.
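
The effect of a missing network mapping can be illustrated with a small sketch. This is not ASR or VMM code: the network names, the VM names, and the `target_network_for` and `unmapped_vms` helpers are hypothetical, and the mapping is modeled as a plain dictionary.

```python
from typing import Optional

# Hypothetical mapping of source VM networks to target VM networks.
NETWORK_MAPPING = {
    "Prod-VMNet": "DR-Prod-VMNet",
    "Test-VMNet": "DR-Test-VMNet",
}


def target_network_for(source_network: str) -> Optional[str]:
    """Return the mapped target VM network, or None if no mapping exists."""
    return NETWORK_MAPPING.get(source_network)


def unmapped_vms(vm_networks: dict) -> list:
    """List VMs whose source network has no mapping; these would come up disconnected."""
    return sorted(vm for vm, net in vm_networks.items() if target_network_for(net) is None)


if __name__ == "__main__":
    vms = {"sql01": "Prod-VMNet", "web01": "Prod-VMNet", "lab01": "Lab-VMNet"}
    print(unmapped_vms(vms))
```

Running a check like this before a failover makes it easy to spot virtual machines that would lose connectivity on the recovery site.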


Before you can enable protection for virtual machines, you’ll need to enable replication for storage replication groups, which means initiating a SnapMirror transfer across the volumes (that is, the replication groups) from the primary site to the disaster recovery site.

 

In the Azure Site Recovery portal, on the properties page of the primary cloud, open the Virtual Machines tab. Click Add Replication Group. Select one or more replication groups that are associated with the cloud, the source and target arrays, and the replication frequency.

replication

The RPO for the replication group is not shown in the Azure management portal; it is effectively the schedule specified when the SnapMirror relationship was created. NetApp supports an RPO of 15 minutes.
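
Because the RPO follows the SnapMirror schedule, the data at risk at any moment is whatever was written since the last completed transfer. The sketch below works through that arithmetic under simplifying assumptions: a 15-minute schedule, transfers treated as instantaneous, and hypothetical helper names `data_at_risk` and `within_rpo`.

```python
from datetime import datetime, timedelta

# Assumed schedule interval, matching the 15-minute RPO mentioned in the text.
SCHEDULE_INTERVAL = timedelta(minutes=15)


def data_at_risk(last_transfer: datetime, failure_time: datetime) -> timedelta:
    """Window of changes not yet on the secondary when the failure happens."""
    if failure_time < last_transfer:
        raise ValueError("failure_time precedes last_transfer")
    return failure_time - last_transfer


def within_rpo(last_transfer: datetime, failure_time: datetime,
               rpo: timedelta = SCHEDULE_INTERVAL) -> bool:
    """True if the unreplicated window does not exceed the target RPO."""
    return data_at_risk(last_transfer, failure_time) <= rpo


if __name__ == "__main__":
    last = datetime(2014, 10, 28, 10, 0)
    print(data_at_risk(last, datetime(2014, 10, 28, 10, 12)))
    print(within_rpo(last, datetime(2014, 10, 28, 10, 20)))
```

In practice the transfer duration adds to this window, so a missed or slow transfer shows up as a violation in a check like `within_rpo`.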

PRcloud

Once a storage replication group is replicating, you can enable protection for new or existing virtual machines on the primary cloud. VMM uses intelligent placement to place the virtual machine storage optimally on the LUNs of the replication group. In addition, Azure Site Recovery orchestrates the creation of a shadow virtual machine on the secondary site with similar capacity and compute configurations. ASR’s intelligent placement and capacity reservation ensure that the failed-over VMs connect to the correct CSVs and LUNs, which are replicated to the secondary site.

 

If you have an existing virtual machine, you can configure Azure Site Recovery protection by enabling it in the Virtual Machine properties tab or by using the “Manage protection” ribbon.


Once a virtual machine is created, VMM sends this information to ASR, which automatically starts replicating the VM compute information to the secondary site and stores it in the secondary VMM database. The replica virtual machines that are created are kept in a reserved state. No SnapMirror activity takes place at this point. Also note that a newly created VM’s storage is replicated to the secondary storage only when the SnapMirror schedule next runs and the delta changes are pushed across.


The number of replication groups corresponds directly to the number of volumes that host the virtual machines.

 

NetApp recommends grouping and tiering virtual machines to different volumes.

 

Volume creation must be performed through the OnCommand CLI or NetApp System Manager, because creating FlexVol volumes is not supported through the SCVMM 2012 R2 console.

 

Note: A NetApp C-Mode FlexVol volume is termed a replication group in ASR.

 

Virtual machines that are protected by Azure Site Recovery appear in the Azure Site Recovery portal. You can view properties, track replication health and perform failover of a replication group that contains multiple virtual machines.

 

Note that with NetApp SnapMirror replication, all virtual machines associated with a replication group must fail over together, because failover occurs at the storage layer first. It’s important to plan your replication groups carefully and place only related virtual machines together.
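
Since each FlexVol volume is one replication group, the placement of virtual machines on volumes decides which machines fail over together. The following sketch shows that grouping with made-up data; the VM names, volume names, and the helper functions are hypothetical and are not part of ASR or VMM.

```python
from collections import defaultdict

# Hypothetical placement of virtual machines on FlexVol volumes (replication groups).
VM_PLACEMENT = {
    "sql01": "vol_tier1_db",
    "sql02": "vol_tier1_db",
    "web01": "vol_tier2_web",
    "web02": "vol_tier2_web",
    "dev01": "vol_tier3_dev",
}


def replication_groups(placement: dict) -> dict:
    """Group VM names by the volume (replication group) that hosts them."""
    groups = defaultdict(list)
    for vm, volume in placement.items():
        groups[volume].append(vm)
    return {volume: sorted(vms) for volume, vms in groups.items()}


def fails_over_with(vm: str, placement: dict) -> list:
    """All VMs that share a replication group with vm, including vm itself."""
    volume = placement[vm]
    return sorted(other for other, v in placement.items() if v == volume)


if __name__ == "__main__":
    print(replication_groups(VM_PLACEMENT))
    print(fails_over_with("sql01", VM_PLACEMENT))
```

Listing the co-located machines this way is a quick sanity check that unrelated workloads have not ended up on the same volume.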

 

After you’ve enabled protection for virtual machines, you can configure recovery plans. A recovery plan groups virtual machines in selected replication groups for the purposes of failover and recovery, and it specifies the failover order. For more details on configuring recovery plans, refer to the MSDN article on Create and customize recovery plans.
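
One way to picture a recovery plan is as an ordered list of groups, where each group holds the virtual machines that come up at the same stage. The sketch below is a simplified model with hypothetical names (`RecoveryPlan`, `add_group`, `failover_order`); it does not reflect how ASR stores plans internally.

```python
from dataclasses import dataclass, field


@dataclass
class RecoveryPlan:
    """Hypothetical model: an ordered list of VM groups."""
    name: str
    groups: list = field(default_factory=list)  # each entry is a list of VM names

    def add_group(self, vms) -> None:
        """Append a group; groups fail over in the order they were added."""
        self.groups.append(list(vms))

    def failover_order(self) -> list:
        """Return (group_number, vm) pairs in the order they would be processed."""
        return [(number, vm)
                for number, group in enumerate(self.groups, start=1)
                for vm in group]


if __name__ == "__main__":
    plan = RecoveryPlan("three-tier-app")
    plan.add_group(["sql01"])           # database tier first
    plan.add_group(["web01", "web02"])  # then the web tier
    print(plan.failover_order())
```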

 

After a recovery plan has been created, it appears in the list on the Recovery Plans tab. The user can then fail over the virtual machines to a disaster recovery site, choosing a planned failover, an unplanned failover, or a test failover.

Planned Failover

When a user clicks Planned Failover, ASR powers off the VMs on the primary site, performs a SnapMirror resync, and breaks the SnapMirror relationship. It then discovers and correlates the LUN mapping on the primary server to the secondary server, masks the LUN on the secondary, and brings the VMs online on the secondary site.
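
The planned failover sequence is strictly ordered, and each step depends on the one before it. The sketch below models that ordering as a list of step names taken from the description above and a small runner that stops at the first failing step. The `run_workflow` function and its `execute` callback are hypothetical; real orchestration is performed by ASR, VMM, and the SMI-S agent.

```python
# Step names paraphrase the planned failover description; they are labels, not commands.
PLANNED_FAILOVER_STEPS = [
    "power off VMs on primary",
    "SnapMirror resync",
    "break SnapMirror relationship",
    "correlate LUN mapping from primary to secondary",
    "mask LUN on secondary",
    "bring VMs online on secondary",
]


def run_workflow(steps, execute):
    """Run steps in order; stop at the first failure.

    execute is a callable that takes a step name and returns True on success.
    Returns (completed_steps, failed_step), where failed_step is None on success.
    """
    completed = []
    for step in steps:
        if not execute(step):
            return completed, step
        completed.append(step)
    return completed, None


if __name__ == "__main__":
    done, failed = run_workflow(PLANNED_FAILOVER_STEPS, lambda step: True)
    print(len(done), failed)
```

Stopping at the first failure matters here: breaking the relationship before the resync completes, for example, would leave the secondary without the latest changes.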

 

To fail back to the primary site, the user needs to enable reverse replication, which transfers the delta changes from the recovery site to the primary site. This can be done via the Azure portal.

Reverse Replication

When a user clicks Reverse Replication, ASR establishes a reverse SnapMirror relationship and performs a reverse SnapMirror resync, so that all delta changes from the secondary site are pushed back to the primary site.
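
The change of direction during reverse replication can be sketched as a small state model. This is an illustration under stated assumptions, not NetApp or ASR code: the `Relationship` class and both helper functions are hypothetical, and the model assumes the relationship was already broken by the failover before it is reversed.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Relationship:
    """Hypothetical model of a mirror relationship between two sites."""
    source: str
    destination: str
    state: str  # "mirrored" or "broken"


def break_relationship(rel: Relationship) -> Relationship:
    """Failover breaks the mirror so the destination becomes writable."""
    return replace(rel, state="broken")


def reverse_resync(rel: Relationship) -> Relationship:
    """Swap source and destination, then resync so deltas flow back."""
    if rel.state != "broken":
        raise ValueError("relationship must be broken before it can be reversed")
    return Relationship(source=rel.destination, destination=rel.source, state="mirrored")


if __name__ == "__main__":
    original = Relationship("primary", "secondary", "mirrored")
    print(reverse_resync(break_relationship(original)))
```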

 

If you need to remove the protection for a cloud, you can do this by clicking on “Remove Protection”.

Remove Protection

Removing protection for a cloud requires the user to first unprotect the VMs hosted on the primary cloud. If required, the user can also make a backup copy of the target LUNs in the Azure portal, because they are deleted during the removal process. The Remove Protection workflow deletes the SnapMirror relationship between the primary site and recovery site volumes, and also deletes the LUNs residing on the secondary site volume.

 

ASR also allows you to run test failovers. During the test failover process, clones are created of the replication groups (the target FlexVol volumes participating in SnapMirror).

 

Stay tuned for an upcoming Technical Report with guidance and best practices outlined for ASR SAN Replication.

 

I hope that you have enjoyed this blog entry and have found this information helpful.

 


 

Thanks,

 

Vinith Menon

Reference Architect

MVP – System Center Cloud and Datacenter Management.