Hyper-V Disaster Recovery Using OCPM 4.0 (OnCommand Plug-in for Microsoft) DR PowerShell Cmdlets

The NetApp OnCommand plug-in for Microsoft enables the management and automation of storage and virtual machine provisioning with Microsoft System Center. The NetApp® OnCommand® plug-in for Microsoft® (OCPM) integrates with Microsoft Windows Server® with Hyper-V® and NetApp SnapMirror® software to provide disaster recovery of Hyper-V VMs.

 

 

In part 1 of this tutorial, I show you an end-to-end example of how you can perform DR by using the DR cmdlets that are included with OCPM 4.0 for a Hyper-V cluster hosted on Windows Server 2008 R2 SP1 and on Windows Server 2012. In part 2, I show you how you can use OCPM 4.0 Orchestrator Integration Packs (OIPs) to automate DR, and how you can further integrate with System Center technologies such as System Center Service Manager 2012 (SCSM) and System Center Orchestrator 2012 (SCORCH).

 

Let’s take an example of highly available (HA) and consistent VMs that are running on standalone or clustered Hyper-V hosts on Site A and are serving critical applications. When these VMs go down (complete site failure, both host and storage), in an instant the storage administrator can fail over all resources to Site B. The storage administrator initiates OCPM DR through a Windows PowerShell® cmdlet or SCORCH failover workflows (which are provided and supported by NetApp). These consistent VMs can then serve the mission-critical applications from a replicated or mirrored Site B almost instantaneously, with limited and reasonable downtime.

 

OCPM provides DR PowerShell cmdlets and Microsoft SCORCH 2012 with OIPs to automate disaster recovery, which can be integrated with SCSM.

 

SCSM provides an integrated platform for automating and adapting your organization’s IT service management best practices, such as those found in Microsoft Operations Framework and the Information Technology Infrastructure Library. SCSM provides built-in processes for incident and problem resolution, change control, and asset lifecycle management.

 

SCORCH is a workflow management solution for the data center. SCORCH helps you automate the creation, monitoring, and deployment of resources in your environment.

 

Note: OCPM 4.0 DR PowerShell cmdlets are supported only for NetApp Data ONTAP® operating in 7-Mode. They are not supported in clustered Data ONTAP environments.

Functional Overview

 

OCPM 4.0 provides two methods of implementing a DR workflow. Disaster recovery can be executed by using the OCPM DR PowerShell cmdlets or through the OCPM DR OIPs, which integrate with SCORCH.

 

OCPM 4.0 provides:

 

  • Live DR of Microsoft Hyper-V VMs that are hosted on NetApp LUNs connected to Windows Server 2012. (Users can fail over or fail back “running/on” Microsoft Hyper-V VMs without the need to “turn off/shutdown” VMs on the primary site.) NetApp replication (SnapMirror) is used to replicate the VMs and their data across the primary and secondary sites.

 

  • DR of Hyper-V clusters that are hosted on Windows Server 2008 R2 SP1. These clusters need the VMs to be turned off for failover and failback to work.

 

DR Setup and Prerequisites

 

For effective DR:

 

  • Hyper-V servers must be dedicated to DR because they need a clean Hyper-V environment.
  • Hosts must be indicated as clustered or standalone.
  • On a cluster, VMs must be highly available and reside on mirrored LUNs. VMs that are not highly available are not discovered by the New-OCDRPlan and Update-OCDRPlan cmdlets.
  • Volumes must have SnapMirror.
  • Volumes must be created with a similar size and on the same type of aggregate (large or not large aggregate).

 

For more detail, refer to “OnCommand Plug-in for Microsoft Installation and Administration Guide,” in the section on failover workflow phases.

 

Following is a brief overview of my setup for demonstration purposes.

 

Two Hyper-V clusters are hosted on Windows Server 2008 R2 SP1: clustera on Site A (the primary site) and clusterb on Site B (the secondary site). On Site A, two VMs are running, which would be failed over to Site B. Two Hyper-V clusters are also hosted on Windows Server 2012: win2k12clustera on Site A and win2k12clusterb on Site B. Similarly, on Site A, two VMs are running, which would be failed over to Site B.

 

Site A

 

 


 

Site B

 

 


 

 

Site A is connected to a storage controller named siteA7mode.

 

 

Site B is connected to a storage controller named siteB7mode.

 

 

Both controllers include a SnapMirror license.

 

Next, install and configure OCPM 4.0 on all Hyper-V hosts. Open the OnCommand Cmdlets PowerShell console on the Hyper-V host that is the current owner of the cluster on Site A (clustera or win2k12clustera).

 

In my case, it’s HYPERV2K8R2-2 for the Windows Server 2008 R2 SP1 cluster and HyperV-Win12k-1 for the Windows Server 2012 cluster.

 

Make sure that you have set the execution policy to remote signed across all Hyper-V hosts that are participating in DR.
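The following is a minimal sketch of one way to apply that policy to several hosts at once. It assumes that PowerShell remoting (WinRM) is already enabled on every host and that you run it with administrative rights; the function name Set-DRHostExecutionPolicy is my own and is not part of OCPM.

```powershell
# Hypothetical helper (not an OCPM cmdlet): sets the RemoteSigned policy on each
# host and returns the resulting policy so that you can verify it.
# Assumes PowerShell remoting is enabled on every host listed.
function Set-DRHostExecutionPolicy {
    param(
        [Parameter(Mandatory = $true)]
        [string[]] $ComputerName
    )

    foreach ($computer in $ComputerName) {
        Invoke-Command -ComputerName $computer -ScriptBlock {
            # Change the policy only when it is not already RemoteSigned.
            if ((Get-ExecutionPolicy -Scope LocalMachine) -ne 'RemoteSigned') {
                Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope LocalMachine -Force
            }
            # Emit the effective machine-level policy for this host.
            Get-ExecutionPolicy -Scope LocalMachine
        }
    }
}

# Example call, using two of the host names from my setup:
# Set-DRHostExecutionPolicy -ComputerName 'HYPERV2K8R2-2', 'HYPERV2K8R2-3'
```

If remoting is not available, running Set-ExecutionPolicy RemoteSigned locally in an elevated console on each host achieves the same result.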

 

 

Failover from Site A to Site B

 

Initialize a SnapMirror relationship between volume vol1 on Site A, which contains the HA VMs, and volume vol3 on Site B, where the VMs would eventually be failed over in the case of DR.

 

For the Windows Server 2008 R2 SP1 cluster, this step is:

 

PS C:\> Initialize-OCDRMirror -SourceLocation siteA7mode:vol1 -DestinationLocation siteB7mode:vol3 -Verbose

 

And for the Windows Server 2012 cluster, it is:

 

PS C:\> Initialize-OCDRMirror -SourceLocation siteA7mode:vol4 -DestinationLocation siteB7mode:vol5 -Verbose

 

 

 

You can check the status of SnapMirror by using Get-OCDRMirrorStatus.

 

For the Windows Server 2008 R2 SP1 cluster, this step is:

 

PS C:\> Get-OCDRMirrorStatus -Location siteB7mode:vol3

 

And for the Windows Server 2012 cluster, it is:


PS C:\> Get-OCDRMirrorStatus -Location siteB7mode:vol5

 

 

When you see that the SnapMirror transfer has been successfully completed, you need to create a new DR plan with the New-OCDRPlan cmdlet.

 

For the Windows Server 2008 R2 SP1 cluster, this step is:

 

 

New-OCDRPlan -drplanname DRPLAN -drplanfolder "C:\kbs" -primaryserver clustera -secondaryserver clusterb -verbose

 

And for the Windows Server 2012 cluster, it is:

 

New-OCDRPlan -drplanname DRPLAN -drplanfolder "C:\kbs" -primaryserver win2k12clustera -secondaryserver win2k12clusterb -verbose

 

 

This command creates a new DR plan. You can specify a file path and a file name for the plan. Also, make sure that you run this cmdlet from the Hyper-V host that owns the cluster. In this case, it is:

 

HYPERV2K8R2-2 for clustera and HyperV-Win12k-1 for win2k12clustera

 

If the file path is not specified, then the plan is created in the OCPM program data folder; for example, C:\ProgramData\OnCommand\MS_Plugin. If the file name of the plan is not specified, then a default plan name is created by using the following convention:

 

PrimaryServerOrCluster_SecondaryServerOrCluster_DRPlan.xml
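To make the naming convention concrete, here is a small sketch that builds the expected plan file name. The function Get-DRPlanFileName is purely illustrative and not part of OCPM. The explicit-name pattern (name followed by _DRPlan.xml) is inferred from the plan paths used in the commands later in this post, so treat it as an assumption.

```powershell
# Hypothetical helper (not an OCPM cmdlet): returns the plan file name that the
# post's convention implies for a given pair of servers or clusters.
function Get-DRPlanFileName {
    param(
        [Parameter(Mandatory = $true)] [string] $PrimaryServer,
        [Parameter(Mandatory = $true)] [string] $SecondaryServer,
        [string] $PlanName
    )

    if ($PlanName) {
        # Assumption: an explicit plan name is suffixed with _DRPlan.xml,
        # as in "DRPLAN_DRPlan.xml" used elsewhere in this post.
        return ('{0}_DRPlan.xml' -f $PlanName)
    }

    # Default convention: PrimaryServerOrCluster_SecondaryServerOrCluster_DRPlan.xml
    return ('{0}_{1}_DRPlan.xml' -f $PrimaryServer, $SecondaryServer)
}

# Get-DRPlanFileName -PrimaryServer clustera -SecondaryServer clusterb
# Get-DRPlanFileName -PrimaryServer clustera -SecondaryServer clusterb -PlanName DRPLAN
```

Knowing the file name in advance is handy because the later cmdlets take the full plan path through the -drplan parameter.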

 

On Windows Server 2012 platforms, there is no constraint to “turn off” the Hyper-V VMs. DR plans are generated with live or running VMs. With Windows Server 2012, the VMs are not exported and their VM configuration files are not generated. However, on Windows Server 2008 R2 SP1 platforms, the VMs are exported and their VM configuration files are generated along with the DR plan file.

 

In the following figure, you can see that I create the DR plan for the primary server site, clustera, and for the secondary site, clusterb, and I store the DR plan at a local share location, "C:\kbs".

 

 

After the cmdlet returns Success, you will see the DR plan in the specified folder. For Hyper-V clusters based on Windows Server 2008 R2 SP1, make sure to shut down the VMs on Site A (clustera) after the DR plan is created. This is not required for Windows Server 2012 Hyper-V clusters: the VMs need not be turned off, and everything is done live.

 

The following two figures show this step for the Windows Server 2008 R2 SP1 cluster:

 

 

 

As you can see in the following figure, when the same cmdlet was run for Windows Server 2012, it created only the DR plan and did not create the VM configuration export files.

 

 

Next, we must validate and confirm the newly created DR plan with the Confirm-OCDRPlan cmdlet:

 

 

Confirm-OCDRPlan -drplan "C:\kbs\DRPLAN_DRPlan.xml" -Verbose

 

 

This cmdlet validates the current state of either the primary storage system or the secondary storage system, based on the information in the specified DR plan.

 

After the cmdlet completes its task, you see an operation status of Success.

 

Now imagine that a disaster strikes Site A and you must migrate all your VMs immediately to Site B.

 

For this, you invoke your next cmdlet, which is Invoke-OCDRFailover.

 

This cmdlet should be invoked from one of the Hyper-V hosts on Site B, preferably the cluster owner. In my current setup, it is the node HYPERV2K8R2-3 from Site B.

 

For the Windows Server 2008 R2 SP1 cluster, this step is:

 

PS C:\> Invoke-OCDRFailover -drplan "\\hyperv2k8r2-2\C$\kbs\DRPLAN_DRPlan.xml" -Verbose

 

And for the Windows Server 2012 cluster, it is node HyperV-win12k-3 from Site B:

 

PS C:\> Invoke-OCDRFailover -drplan "\\hyperv-win12k-1\C$\kbs\DRPLAN_DRPlan.xml" -Verbose

 

 

On both the Windows Server 2008 R2 SP1 and Windows Server 2012 platforms, you can recover your primary site Hyper-V VMs and bring them online on your secondary site. You can also restore your primary online standalone or HA VMs on the secondary site with very limited downtime.

 

This cmdlet rebuilds the VM configuration on Site B, the secondary site for a failover. When this cmdlet is invoked, based on the DR plan, it breaks all the SnapMirror relationships on the secondary storage system; it connects all the LUNs on the secondary host; and it also restores all VMs on the secondary host.

 

 

After the cmdlet completes its activity, you will see that you have successfully failed over the storage and VMs from Site A to Site B. Now Site B is your primary site.

 

 

 

 

If you specify the -OnlineVM parameter with this cmdlet, the VMs are turned on after they are restored. In the preceding step, I did not specify this parameter, so the VMs are off.

 

Failback from Site B to Site A

 

Now let’s assume that Site A is up and you need to fail back your resources from Site B to Site A. You must update the DR plan by reversing the move from the primary site to the secondary site. To attain this, you must run Update-OCDRplan with a -Failback parameter.

 

For the Windows Server 2008 R2 SP1 cluster, this step is:

 

 

PS C:\> Update-OCDRplan -Failback -drplan "\\hyperv2k8r2-2\C$\kbs\DRPLAN_DRPlan.xml" -PrimaryServer clusterb -SecondaryServer clustera -Verbose

 

 

And for the Windows Server 2012 cluster, it is:

 

 

PS C:\> Update-OCDRplan -Failback -drplan "\\hyperv-win12k-1\C$\kbs\DRPLAN_DRPlan.xml" -PrimaryServer win2k12clusterb -SecondaryServer win2k12clustera -Verbose

 

 

Whenever the primary or secondary site configuration changes, you must update and validate the DR plan. If the validation fails, that means that the plan is not up to date and must be refreshed. This cmdlet refreshes the plan on the primary or secondary site with the latest configuration information.

 

 


 

 

After this cmdlet has successfully completed its task, you must establish a reverse resynchronization of the SnapMirror relationship.

 

For the Windows Server 2008 R2 SP1 cluster, this step is:

 

The reverse resynchronization runs from siteB7mode:vol3 to siteA7mode:vol1 by using Invoke-OCDRMirrorReverseResync:

 

 

PS C:\> Invoke-OCDRMirrorReverseResync -drplan "\\hyperv2k8r2-2\C$\kbs\DRPLAN_DRPlan.xml" -Verbose

 

And for the Windows Server 2012 cluster, it is:

 

The reverse resynchronization runs from siteB7mode:vol5 to siteA7mode:vol4 by using Invoke-OCDRMirrorReverseResync:

 

 

PS C:\> Invoke-OCDRMirrorReverseResync -drplan "\\hyperv-win12k-1\C$\kbs\DRPLAN_DRPlan.xml" -Verbose

 

 

This cmdlet reverses the initial resynchronization process and resynchronizes the mirror relationships from the secondary to the primary storage system based on information in the specified DR plan.

 

Alternatively, you can also specify destination and source locations for resynchronization. If SnapMirror configurations from the secondary site to the primary site existed before the failover, this cmdlet reestablishes the SnapMirror copy configurations after the failover is finished. If SnapMirror copy configurations from the secondary site to the primary site did not exist before the failover, this cmdlet creates them. The original production site then becomes the active production site again.

 

This cmdlet requires that you specify either a plan or destination and source locations for executing the resynchronization operation, and the cmdlet must be issued on the Hyper-V host. The reverse resynchronization transfer is handled asynchronously; therefore, you must wait until the transfer is finished before you execute any additional operations.
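Because the transfer is asynchronous, a script needs some way to wait before it moves on. The following is a rough sketch of a generic polling helper. The name Wait-ForCondition is my own and is not part of OCPM. I have deliberately left the actual completion check to a script block that you supply, because this post does not show the exact output of Get-OCDRMirrorStatus; inspect that output in your environment first and write the check accordingly.

```powershell
# Hypothetical helper (not an OCPM cmdlet): polls a caller-supplied check until
# it returns $true or the timeout expires. Returns $true on success and $false
# on timeout.
function Wait-ForCondition {
    param(
        [Parameter(Mandatory = $true)]
        [scriptblock] $Condition,
        [int] $TimeoutSeconds = 3600,
        [int] $IntervalSeconds = 30
    )

    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ($true) {
        # Run the caller's check; any truthy result ends the wait.
        if (& $Condition) { return $true }
        # Give up once the deadline has passed.
        if ((Get-Date) -ge $deadline) { return $false }
        Start-Sleep -Seconds $IntervalSeconds
    }
}

# Usage idea: inside the script block, call
#   Get-OCDRMirrorStatus -Location siteA7mode:vol1
# and return $true once the output in your environment shows that the
# transfer has finished.
```

Only continue with the cleanup and failback steps after the helper returns $true; a $false result means that the timeout expired and the mirror needs a manual look.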

 

 

 

Next you must clean up any conflicting or stale entries existing on Site A (clustera) so that you can fail back the resources to Site A. For this step, you must run Reset-OCDRSite.

 

For the Windows Server 2008 R2 SP1 cluster, it is run from HYPERV2K8R2-2 (clustera):

 

 

PS C:\> Reset-OCDRSite -drplan "C:\kbs\DRPLAN_DRPlan.xml" -Verbose -Full -Force

 

For the Windows Server 2012 cluster, it is run from HYPERV-WIN12K-1 (win2k12clustera):

 

PS C:\> Reset-OCDRSite -drplan "C:\kbs\DRPLANwin2k12_DRPlan.xml" -Verbose -Full -Force

 

This cmdlet deletes or disconnects all the conflicting cluster resources or LUNs that are on the secondary site based on information in the DR plan.

 


 

Now when you head over to Site A, you see that the VMs and storage have been cleaned up.

 

 

 

Next, run Confirm-OCDRPlan before invoking a failback to make sure that your DR plan is set right:

 

PS C:\> Confirm-OCDRPlan -drplan "C:\kbs\DRPLAN_DRPlan.xml" -Verbose

 

Now that Site A is clean, you should invoke failback from Site B to Site A. For this step, you use the Invoke-OCDRFailback cmdlet. In this example, I invoked the failback cmdlet with the -OnlineVM parameter, which turned on the VMs after the import.

 

You must run this cmdlet from the node HYPERV2K8R2-2 from Site A for the Windows Server 2008 R2 SP1 cluster:

 

 

PS C:\> Invoke-OCDRFailback -drplan "C:\kbs\DRPLAN_DRPlan.xml" -OnlineVM -Verbose -Force

 

For the Windows Server 2012 cluster, this cmdlet is run from the node HyperV-Win12k-1 from Site A:

 

PS C:\> Invoke-OCDRFailback -drplan "C:\kbs\DRPLANwin2k12_DRPlan.xml" -OnlineVM -Verbose -Force

 

 

 

After the cmdlet completes its task, you can see the failed-over VMs from Site B on Site A.

 

 

 

 

If you specify the -OnlineVM parameter with this cmdlet, you can see that the VMs are turned on.

 

I hope that you have enjoyed this blog entry on how to perform DR with OCPM PowerShell cmdlets and have found this information helpful.

 

Good luck!

 

Thanks,

Vinith

Comments

Hi, great post, been looking to do something like this. However I have an issue.

When I create the DRplan on the primary node using the command "New-OCDRPlan -drplanname DRPLAN -drplanfolder "C:\kbs" -primaryserver hv-cluster -secondaryserver drhv-cluster -verbose" I get the following error:


The virtual machine <crg-sql> is on <hv1> and cannot be exported for creating the Disaster Recovery plan. This operation must be invoked from the node that owns all virtual machines and the cluster group (if FOC) and must have OnCommand Plugin for Microsoft installed. New-OCDRPlan and Update-OCDRPlan must be invoked on the primary site A. Update-OCDRPlan -Failback must be invoked on site B, the original secondary site.

If I run this on the other node, I get the same error with different VMs.

What am I doing wrong?

Thanks

From cluster manager, highlight your cluster, then determine which node is the "owner node" or "Current Host Server".

Then live migrate all VMs to that node and try running the command again.
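If you prefer to do this from PowerShell, here is a rough sketch that uses the FailoverClusters module that ships with Windows failover clustering. The function name Move-VMGroupToClusterOwner is made up for illustration. It assumes that the core cluster group still has its default name, "Cluster Group", and that you pass the cluster group names of your VMs, which may differ from the VM names.

```powershell
# Hypothetical helper: moves the named VM cluster groups to the node that
# currently owns the core cluster group, then returns that node's name.
# Assumption: the core cluster group uses the default name 'Cluster Group'.
function Move-VMGroupToClusterOwner {
    param(
        [Parameter(Mandatory = $true)]
        [string[]] $VMGroupName
    )

    # Load the clustering cmdlets when the module is installed on this host.
    if (Get-Module -ListAvailable -Name FailoverClusters) {
        Import-Module FailoverClusters
    }

    $ownerNode = (Get-ClusterGroup -Name 'Cluster Group').OwnerNode.Name

    foreach ($name in $VMGroupName) {
        $group = Get-ClusterGroup -Name $name
        # Move only the VMs that are not already on the owner node.
        if ($group.OwnerNode.Name -ne $ownerNode) {
            Move-ClusterVirtualMachineRole -Name $name -Node $ownerNode | Out-Null
        }
    }

    $ownerNode
}

# Example, using the VM name from the error message above as the group name:
# Move-VMGroupToClusterOwner -VMGroupName 'crg-sql'
```

After the VMs are on the owner node, run New-OCDRPlan again from that node.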