2012-02-08 09:22 AM
Running ESX4 U3 and using NetApp storage. Just wondering what people are doing regarding backup and DR. I currently use SMVI to quiesce my VMs, then SnapMirror them off to my DR site. This works fine, but if I have to go to DR I have to
Is there a cleaner way to do this, i.e. have a datastore online in DR where the virtual machines are synced, so that all I have to do is import them and power them on?
2012-02-13 12:09 PM
Hi, we use SMVI for virtual machine backup and VMware Site Recovery Manager (SRM) for disaster recovery management. It's a really great product and it interacts very well with NetApp storage and SMVI.
SRM uses your current SnapMirror relationships to create "phantom" VMs at the DR site, already configured. You can map every object in the primary site to one in the disaster recovery site (resource pools, folders, port groups), so that when you restart the VMs at the DR site they get the correct resource allocation and network setup. With some work you can also avoid copying vmswap files (put them in a separate datastore at each site instead of in the same datastore as the virtual machine). If you would like to copy even less data between sites, you can configure each VM with a separate disk for the Windows page file or Linux swap and place that disk in a separate datastore. That way you do not need to copy the Windows page file or Linux swap to DR every day. The DR site only needs an initial copy of those disks, and when you configure DR for a VM the first time you can "attach" the empty swap/pagefile disk to it.
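To make the mapping idea concrete, here is a rough Python sketch of what an inventory mapping amounts to. It only models the concept and does not call any SRM API; the class, the function, and every pool, folder, and port group name are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Where a VM lives in the inventory (hypothetical model, not an SRM type)."""
    resource_pool: str
    folder: str
    portgroup: str


# Primary-site object -> DR-site object, one table per object type.
# All names here are invented examples.
INVENTORY_MAP = {
    "resource_pool": {"Prod-Pool": "DR-Pool"},
    "folder": {"Prod/Exchange": "DR/Exchange"},
    "portgroup": {"VLAN10-Prod": "VLAN10-DR"},
}


def map_placement(prod: Placement, mapping=INVENTORY_MAP) -> Placement:
    """Translate a VM's production placement into its DR placement.

    Raises KeyError when an object has no mapping, because such a VM
    could not be placed at the DR site.
    """
    return Placement(
        resource_pool=mapping["resource_pool"][prod.resource_pool],
        folder=mapping["folder"][prod.folder],
        portgroup=mapping["portgroup"][prod.portgroup],
    )
```

The point of failing on a missing entry is that an unmapped object is something you want to find when you build the plan, not during a recovery.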
In our environment we use the same IP addresses for production and DR, but SRM allows address/network configuration changes in a fairly simple way (some configuration files and scripting, if I remember correctly).
With SRM you can even test your DR in a separate test network. This can be a real network, if you have a network card on each VMware server connected to a separate switch, or a virtual one, using a port group created automatically when you run the DR test (but in that case VMs on different hosts will not be able to communicate). The DR test is done via FlexClone, so you test on a real "last backup copy" of your VMs.
When you need to run the real DR, SRM will also switch off the VMs running at your DR site and start your production VMs in the order you choose (you can set priorities for how the VMs are restarted). You can also include "watch points" in the DR plan where SRM stops and asks for confirmation, and if needed you can run scripts and commands during DR.
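The ordering and "watch point" behaviour described above can be sketched in a few lines of Python. This is only an illustration of the logic, not how SRM is implemented or scripted; `PlanStep`, `run_plan`, and the callbacks are hypothetical, and "lower number starts first" is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class PlanStep:
    name: str                      # VM (or group) to start
    priority: int                  # assumption: lower number starts first
    confirm_before: bool = False   # "watch point": ask before this step


def run_plan(
    steps: Iterable[PlanStep],
    confirm: Callable[[str], bool],
    start: Callable[[str], None],
) -> List[str]:
    """Start VMs in priority order, pausing at watch points.

    Returns the names actually started. If the operator declines at a
    watch point, the plan stops there and later steps are not run.
    """
    started = []
    # sorted() is stable, so steps with equal priority keep their listed order.
    for step in sorted(steps, key=lambda s: s.priority):
        if step.confirm_before and not confirm(step.name):
            break
        start(step.name)
        started.append(step.name)
    return started
```

For example, a plan with a domain controller at priority 1, a database at priority 2 behind a watch point, and a web server at priority 3 would start only the domain controller if the operator answers "no" at the database step.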
At the end of a test or recovery you also get a complete report of what was done. At the end of a test SRM destroys the volume clones and frees the resources, while after a recovery you can set up the failback automatically.
If you are not facing a real disaster but would like to do a planned migration because your primary datacenter needs to be switched off, SRM will also switch off your production VMs, update the volumes on the SnapMirror destination, and restart the VMs at the DR site (you can do this for only some VMs if you need to move just a portion of your datacenter to the DR site).
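As a rough picture of that planned-migration sequence, the sketch below just builds the ordered list of actions for a chosen subset of VMs. The function name and the VM and volume names are hypothetical, and nothing here talks to SRM or the storage; it only shows why the order matters (the final update has to come after the VMs are powered off).

```python
from typing import List, Sequence


def planned_migration_steps(vms: Sequence[str], volumes: Sequence[str]) -> List[str]:
    """Return the ordered actions for a planned migration of the given VMs.

    Order: power off at primary, one last replication update so the
    destination has the final state, then power on at the DR site.
    """
    steps = [f"power off {vm} at primary" for vm in vms]
    steps += [f"final SnapMirror update of {vol}" for vol in volumes]
    steps += [f"power on {vm} at DR" for vm in vms]
    return steps


# Example with invented names: move only the Exchange VM and its volume.
for step in planned_migration_steps(["exch01"], ["vol_exch"]):
    print(step)
```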
You can also have reciprocal protection, so that two datacenters each act as the DR site for the other.
We have used SRM for 3 years and we are really satisfied with it. We used it for real during a datacenter core switch failure at one of our sites, and we used it to move our Exchange environment (NFS VMs with iSCSI RDM LUNs for the Exchange DBs) from the primary to the secondary datacenter and then back to the primary after some weeks (we had to configure the failback manually because we did it with version 4, which did not have failback capabilities).
Here you can find more information:
Storage information (here you find how to separate swap file)
SRM 4.1 (not updated with SRM 5 but a good start)