NetApp Unveils Support for Microsoft Azure SAN Replication with SMI-S and SnapMirror

This week at Microsoft's TechEd Europe 2014 in Barcelona, NetApp unveiled support for Microsoft’s newly announced Azure Site Recovery (ASR) SAN Replication feature. The new feature provides a cloud-orchestrated disaster recovery service for site-to-site failover, leveraging NetApp Clustered Data ONTAP SAN along with our SnapMirror replication and FlexClone and Snapshot capabilities. ASR SAN Replication is enabled by NetApp’s upcoming release of Data ONTAP SMI-S Agent 5.2 (target ship date is in a few days).


Read Microsoft's case study on NetApp support for Azure SAN Replication.


NetApp + Microsoft


But wait, you're thinking, NetApp has had DR capability for Hyper-V before this. What exactly is new? Here’s some context.


Disasters affecting IT systems happen. Physical hardware such as servers and network switches can fail at any time, bringing critical IT infrastructure down and severely affecting businesses. It’s not only individual system components that are vulnerable; bigger events can bring entire sites down. Historically, IT departments have dealt with this challenge using traditional approaches such as backup and restore. With the amount of data companies deal with today, this approach might not be ideal: backups tend to run relatively infrequently, and the restore itself might take significant time. That effectively puts RPO (recovery point objective) and RTO (recovery time objective) outside a company's desired data-loss and downtime constraints. Business-critical applications and systems that are down for extended periods of time can bring companies to their knees.
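To make the RPO/RTO point concrete, here is a minimal sketch that compares two protection schemes against a recovery target. Every number and name in it (the 24-hour backup interval, the 15-minute replication interval, the one-hour targets, the ProtectionScheme class) is a hypothetical chosen for illustration, not a measurement or a NetApp or Microsoft figure.

```python
from dataclasses import dataclass


@dataclass
class ProtectionScheme:
    """Hypothetical description of a data protection scheme."""
    name: str
    interval_hours: float   # time between recovery points (backups or replica updates)
    recovery_hours: float   # time needed to bring the service back online

    def worst_case_data_loss_hours(self) -> float:
        # A failure just before the next recovery point loses a full interval of changes.
        return self.interval_hours

    def meets(self, rpo_target_hours: float, rto_target_hours: float) -> bool:
        # The scheme is acceptable only if both objectives are satisfied.
        return (self.worst_case_data_loss_hours() <= rpo_target_hours
                and self.recovery_hours <= rto_target_hours)


# Illustrative values only (assumptions, not vendor data).
nightly_backup = ProtectionScheme("nightly backup and restore", 24.0, 8.0)
replication = ProtectionScheme("frequent replication to a remote site", 0.25, 0.5)

for scheme in (nightly_backup, replication):
    ok = scheme.meets(rpo_target_hours=1.0, rto_target_hours=1.0)
    print(f"{scheme.name}: meets 1h RPO / 1h RTO target = {ok}")
```

With these assumed values, the nightly backup misses both targets while the replicated scheme meets them, which is the gap the rest of this post is about.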


To remove the hard dependency of business applications on hardware, IT departments are using virtualization technologies. Hypervisors allow applications running on virtual servers to be less dependent on a specific piece of hardware and, in some sense, to be portable within a single IT site. But the virtual machine still lives within that site, with its virtual drive containing several gigabytes or even several hundred gigabytes of data sitting on a storage device. Storage devices these days are pretty fault-tolerant appliances and can still serve data in most failure cases. But even they cannot guarantee with 100% certainty that business-critical applications remain up all the time, since the whole site can go down.


Many storage companies have implemented technology to replicate data to remote sites for disaster recovery purposes. NetApp’s SnapMirror functionality has been on the market for quite a while and has proven that it is robust and can provide enterprise-grade, granular methods of protecting and bringing back critical application data. But data is just one part of a business application; the other part is the virtual machine, with its OS and the application itself, which needs to be protected along with the data. Different storage vendors have offered technology to protect virtual machines. NetApp has its own suite of SnapManager products, which use not only Snapshot technology but also SnapMirror and FlexClone to get protected data not only replicated to remote sites but brought online in a matter of seconds.


Microsoft has recognized the need for fast recovery of business applications through the ability to replicate virtual machines to another site. Hyper-V Replica is a baseline technology, but until now it relied purely on SCVMM and Hyper-V to track changes in virtual machine virtual disks and replicate those changes to remote sites. It brings the interesting notion of virtual machine presence at multiple locations, or VM mobility. It’s a beautiful approach to disaster recovery from the virtual machine's standpoint: “this site is down, I’m going to run at another site”. This all sounds good, but it’s a pretty heavy task for SCVMM and Hyper-V to handle when we are talking about hundreds or thousands of virtual machines. What if the intelligent storage hosting the virtual machines' OS and data virtual disks could offload this replication workload? SCVMM and ASR are still used to control and protect the VMs, but they are relegated to an orchestration role, directing the intelligent array to do the real work.


For some time now, storage vendors have been working together on a unified management interface. It gives a compute node the ability to manage storage the same way, regardless of vendor. This initiative is called SMI-S (Storage Management Initiative Specification), and many storage vendors have built SMI-S agents. Microsoft has recognized the importance of this approach and brought storage management functionality not only to individual Windows servers but into SCVMM. The SMI-S specification has been continually improved over the years by these same vendors. The most recent iteration of the SMI-S specification (1.6.1) supports disaster recovery (DR) and business continuity (BC) functions, which cover the use models for virtual machine OS and data virtual disks.


That’s where the deep integration of technologies from Microsoft and NetApp starts. Microsoft says: “We can replicate virtual machines by replicating their virtual disks, but it consumes far too many resources”. NetApp says: “I already host the data and can easily replicate it for you as well, and you can access and control these features through my SMI-S agent”. Microsoft wraps all of this into Azure Site Recovery with SAN Replication and packages it as added value for customers who run their own on-premises private cloud infrastructure and need Hyper-V orchestrated disaster recovery. Customers can now deploy virtual machines and protect them by making them mobile between sites and between clouds.


Want to dig deeper? Check out Vinith Menon’s technical blog on ASR SAN Replication.