Hoping someone can shed some light on what options I have with our NetApp SnapVault/SnapMirror setup, and on the best way to restore a SnapVault snapshot back to our source filer.
To summarise:
We have 2 sites (Prod and DR) with a NetApp FAS on each site, running Data ONTAP 8.2.3p3 in 7-Mode.
We use SnapMirror to replicate all the production datastores presented to our VMware cluster over to our destination filer. We then have a SnapVault schedule which keeps the snapshots for longer on a separate destination volume on our DR filer. Currently this works great.
We only keep a small number of snapshots under the SnapMirror schedule, and we recently had to restore a server to a point older than that retention period, so we had to rely on the SnapVault schedule, which has longer retention. The issue was that the snapshot was on the DR filer, so I had to find a way of moving the files back over to the production filer. When we bring up systems from a snapshot we tend to use FlexClone, as we are most familiar with this technology; this applies to both SnapMirror and SnapVault volumes.
The method I used to bring the server back into our production environment was as follows:
FlexClone the volume on the DR filer, present it to the DR VMware environment and check that all is OK.
Create a new volume on the DR filer, sized for the server disks that need to be restored, and perform a Storage vMotion from the FlexClone volume to the new volume.
Once complete, create a one-time SnapMirror job from the DR filer to the Live filer. This copies the restored server back to the Live filer, ready to be deployed.
All those steps take some time, and we found that the WAN would get congested, which delayed the regular SnapMirror jobs going from the Live filer to the DR filer.
Is there an easier, more efficient way to restore a SnapVault snapshot back to our Live filer without going through the complicated procedure above?
There may be more efficient ways to restore snapshots back to production, but I am not sure you really want them. It depends on how the data is distributed over volumes. You can only restore full snapshots, which means everything on a volume (or qtree). If it is a datastore shared by multiple VMs, restoring it would also revert the state of everything else on that datastore.
... what may work is
perform an incremental restore (snapvault restore -r) from the vault
copy the needed data to the production system
This should transfer only the incremental changes over the WAN, but it leaves your mirror unprotected for the duration of the restore. I would recommend testing it in a lab environment first, though.
I am not sure that would be sufficient in this scenario, as it is very unlikely (unless we had a disaster) that we would want to restore all the VMs on a particular datastore back to a point in time.
The issue I see with breaking the SnapMirror relationship is that we would not be able to deliver our RTO, as it would affect a live SnapMirror volume from Live to DR while the task was being carried out. I need to be able to contain a restore without affecting any live production schedules.
From the CLI, could I use snapvault restore to pick out a particular snapshot from a SnapVault volume and restore it back to the source volume, or even to a separate volume that I create manually, over the existing WAN link?
Yes, you should be able to do a full baseline restore to a different volume (actually a different qtree, so it may be on the same volume), but this will transfer the same amount of data over the WAN, and your main issue was available WAN bandwidth. My idea was to perform an incremental restore, which may transfer less data.
Actually, your method may be faster because it moves just one VM instead of a full volume (i.e. the full datastore).
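To put rough numbers on that comparison, here is a back-of-envelope sketch in Python. Everything in it is an assumption for illustration: the sizes, the 100 Mbit/s link, the 70% usable fraction and the function names are all made up, and it ignores compression, deduplication and any competing replication traffic. Plug in your own figures.

```python
def transfer_hours(size_gib: float, link_mbps: float, usable_fraction: float = 0.7) -> float:
    """Estimate hours needed to push size_gib across a WAN link.

    usable_fraction is the (assumed) share of the link the transfer
    actually gets once protocol overhead and other traffic are allowed for.
    """
    if size_gib < 0 or link_mbps <= 0 or not 0 < usable_fraction <= 1:
        raise ValueError("size must be >= 0, link speed > 0, usable fraction in (0, 1]")
    bits_to_send = size_gib * 1024**3 * 8
    usable_bits_per_second = link_mbps * 1_000_000 * usable_fraction
    return bits_to_send / usable_bits_per_second / 3600


def compare(scenarios: dict, link_mbps: float) -> dict:
    """Return estimated hours per scenario, rounded to one decimal place."""
    return {name: round(transfer_hours(size, link_mbps), 1)
            for name, size in scenarios.items()}


# Hypothetical sizes in GiB; replace with figures from your own environment.
SCENARIOS = {
    "full baseline restore of the whole datastore": 2048,
    "one-time SnapMirror of a single restored VM": 100,
    "incremental restore (changed blocks only)": 40,
}

if __name__ == "__main__":
    for name, hours in compare(SCENARIOS, link_mbps=100).items():
        print(f"{name}: about {hours} h")
```

The point is only the relative order of magnitude: whichever option moves the least data over the WAN wins, and with a large shared datastore that can well be the single-VM copy you are already doing.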