We use NetApp snapshots/SnapVault as a way to take backups of VMs.
This is done using a script that does something like this:
1. look for all VMs on a datastore
2. take a VMware snapshot of all VMs on that datastore to quiesce the .vmdk files
3. take a NetApp snapshot of the datastore
4. delete all VMware snapshots taken in step 2.
All of this is done to ensure that the .vmdk files are not being written to while the NetApp snapshot is taken.
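For anyone who wants to see the four steps as code, here is a minimal sketch in Python. It is not our actual script: the four callables (`list_vms`, `create_vm_snapshot`, `delete_vm_snapshot`, `take_storage_snapshot`) are hypothetical placeholders for whatever ESX and filer tooling you use. The only point it illustrates is the ordering, and that step 4 should run even if step 2 or 3 fails.

```python
from typing import Callable, Iterable, List


def backup_datastore(
    datastore: str,
    list_vms: Callable[[str], Iterable[str]],        # hypothetical: returns VM names
    create_vm_snapshot: Callable[[str], None],       # hypothetical: VMware snapshot
    delete_vm_snapshot: Callable[[str], None],       # hypothetical: removes it again
    take_storage_snapshot: Callable[[str], None],    # hypothetical: filer snapshot
) -> List[str]:
    """Run steps 1-4 and return the VMs that were snapshotted."""
    snapshotted: List[str] = []
    try:
        for vm in list_vms(datastore):       # step 1: find the VMs
            create_vm_snapshot(vm)           # step 2: VMware snapshot per VM
            snapshotted.append(vm)
        take_storage_snapshot(datastore)     # step 3: storage-level snapshot
    finally:
        # Step 4: remove the VMware snapshots we created, even when an
        # earlier step raised, so no snapshot is left behind by accident.
        for vm in snapshotted:
            delete_vm_snapshot(vm)
    return snapshotted
```

Passing the operations in as functions keeps the sketch independent of any particular API and makes the ordering easy to check with dummy functions.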
Now, I'm wondering: as far as I know, the NetApp snapshot is an atomic action for an entire volume. So does it actually make a difference to quiesce the VMs beforehand, given that all .vmdk files of a VM are on the same volume?
We've been successful in restoring non-quiesced .vmdks so I'm wondering if anybody has an opinion on this.
I still believe there are scenarios where you can end up with a corrupt system, but my experience is that the chance is very low.
Now, in this case you are only backing up your system disk? If you are running a database, do you back up your data in another way?
If that is the case, then I think the risk is almost zero, and if you take a snapshot every hour (at NetApp level) without a VMware snapshot, you will always have a copy that will boot (from the last 24 hours).
So my conclusion: for the system partition, there is no risk in doing this. For your data, use another way. We use SnapDrive (with the Microsoft initiator) inside the VM to back up our data in our VMware environment.
I also have a question: deleting a VMware snapshot works great with iSCSI (at ESX level), but our test with NFS is not so good. There we see the VM time out for more than 15 seconds while ESX is deleting a VMware snapshot. Do you see the same?
"the delete of a VMware snapshot works great with iSCSI (on ESX level) but our test with nfs is not so good. There we see a timeout of the VM for more than 15 seconds when the ESX is deleting a VMware snapshot. Do you see the same?"
You need to change a setting on the ESX server: in the advanced settings, under NFS, set NFS.LockDisable = 1.
But even then, ESX snapshots occasionally go haywire. That's why I'm considering using only NetApp snapshots, without quiescing the .vmdk files.
(important application data is never in the .vmdk files)
We have made some backups of VMs on datastores located on NFS.
NetApp has a tool called VIBE (a Perl script), available for XP, Windows 2003 or Linux. It works fine.
- specify a specific datastore or a list of datastores
- specify a specific VM, a list of VMs, or ALL
- request a crash-consistent snapshot on ESX --> VMware snapshot taken before the NetApp snapshot
- can also be included in your SnapMirror scenarios --> Replication/DR
- can also be included in your SnapVault scenarios --> Backup to disk (NearStore)
The only thing that seems 'strange' is that when a crash-consistent VMware snapshot is included in your NetApp snapshot, the VMware snapshot comes back when you do the restore, but it is not listed in the ESX Snapshot Manager, so you can't delete it from your VC. Maybe that needs to be done via a script.
"the only thing that seems 'strange' is that when you have a crash-consistent snapshot included in your NetApp snapshot, when you do the restore the VMware snapshot is back but the snapshot is not listed in the ESX Snapshot Manager so you can't delete it from your VC. Maybe it needs to be done via script."
We did a recovery of a machine yesterday and had the same experience. Things are actually even worse, because trying to create additional snapshots on the recovered machine doesn't work any more. The poor thing is totally confused.
This strengthens my idea that it's actually better not to perform a VMware snapshot before taking a NetApp snapshot. Recovering machines that have associated snapshots seems to be trouble.
Does anybody have the same experience? Comments?
About the restored VMware snapshot, I think I found a "solution" ...
You can't delete it from the VC because it is not listed in the VM's Snapshot Manager.
- create a fake snapshot at VM level --> can be done from the VC or the console
- all 'old' snapshots (the snapshot created during VIBE as well as the fake one) are now listed
- then you can delete all (unnecessary) snapshots
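A rough sketch of that workaround from the console, written in Python. It assumes the ESX 3.x service console `vmware-cmd` tool with its `createsnapshot <name> <description> <quiesce> <memory>` and `removesnapshots` operations; check the syntax on your ESX version before relying on it. The .vmx path and the snapshot name are hypothetical examples. By default the script only prints the commands instead of running them.

```python
import shlex
import subprocess
from typing import List


def cleanup_commands(vmx_path: str, name: str = "cleanup-dummy") -> List[List[str]]:
    """Build the two commands: create a throwaway snapshot, then remove all.

    Assumes ESX 3.x `vmware-cmd` syntax; the trailing "0", "0" ask for
    no quiesce and no memory dump.
    """
    return [
        ["vmware-cmd", vmx_path, "createsnapshot", name, "temporary snapshot", "0", "0"],
        # Note: this removes ALL snapshots of the VM, not only the dummy one.
        ["vmware-cmd", vmx_path, "removesnapshots"],
    ]


def run_cleanup(vmx_path: str, dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them on the service console."""
    for argv in cleanup_commands(vmx_path):
        if dry_run:
            print(shlex.join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical path; replace with the .vmx file of the restored VM.
    run_cleanup("/vmfs/volumes/datastore1/myvm/myvm.vmx")
```

Since `removesnapshots` commits every snapshot of the VM, only use this when you really want all of them gone.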
If you're using VIBE with VMware snapshots, you should use just those exclusively. Taking VMware snapshots with VIBE as well as VMware snapshots for other purposes can lead to VMware not knowing which snapshot to revert to. Ideally, if you need the snapshot of a particular VM after the VIBE backup, simply connect to the datastore and get it from the VM directly.
It is an interesting use case question, though -- why might you need VMware snapshots with VIBE and other tools?
Disabling NFS file locking can allow two hosts to run the same virtual machine at the same time. This is bad, and the VM will usually be corrupted very quickly. Disabling locking is no longer recommended by NetApp or VMware. The performance issues and snapshot problems around this appear to be addressed in the latest set of patches.
Can you give me the best way to implement VMware on NetApp? What would be the best practice for VMware on NetApp using NFS? Best practices for backing up VMware, etc.
This would help me with our virtualization.
Also, if you have some best practices for Citrix/Xen virtualization, please share them, because we use both VMware and Citrix for our virtualization since we are an R&D group.
The place to start is TR3428.
For Citrix, check this out:
Share and enjoy!
I have had nothing but trouble with VMware snapshots. After a failed quiesce I had to manually fix the header information linking the snapshots together in each snap vm file.
I am now relying solely on NetApp snapshots, which I have restored from successfully multiple times. That being said, I would still recommend occasionally backing up the VMs in a powered-off state, just to make sure you have a guaranteed backup of a non-running VM.
eCloudManager 2.0 now integrates the Open Source VMFS driver which allows instant access to VMFS LUN snapshots via a simple point and click interface: http://tinyurl.com/kjw3pg
This means that using filer snapshots instead of VMware snapshots is now also possible for non-NFS setups.
I'm not sure about NFS, but isn't SnapManager for VI meant solely for the purpose of making consistent NetApp snapshots by utilizing VMware snapshot technology?
Not that SMVI handles those VMware snapshots very well in my experience...
My experience is very similar but slightly different: I find that VMware does not handle VMware snaps very well. Well, that's according to our VMware team anyway. Not that I would know.
I'm close to hating VMware snapshots...would almost like them gone from ESX. Yes, that's kind of harsh but they can cause lots of problems when people don't know what they're doing (or assume they work the same way as in VMware workstation).
In short, do NOT keep them around very long -- use something like RVTools to keep them cleaned up -- or you can experience a lot of pain (VMs offline/corrupted/etc. due to huge delta vmdk disk files).
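If you can reach the datastore as a filesystem (an NFS mount, for example), a quick way to spot forgotten snapshots is to look for large delta files. Here is a minimal Python sketch, assuming the usual ESX naming where snapshot data lives in files ending in `-delta.vmdk`. The size threshold and the mount path are made-up values to tune for your environment.

```python
import os
from typing import List, Tuple


def find_large_deltas(root: str, min_size_bytes: int = 10 * 1024**3) -> List[Tuple[str, int]]:
    """Return (path, size) for every *-delta.vmdk under root that is at
    least min_size_bytes, largest first."""
    hits: List[Tuple[str, int]] = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith("-delta.vmdk"):
                continue  # base disks and descriptors are not snapshot data
            path = os.path.join(dirpath, name)
            size = os.stat(path).st_size
            if size >= min_size_bytes:
                hits.append((path, size))
    hits.sort(key=lambda item: item[1], reverse=True)
    return hits


if __name__ == "__main__":
    # Hypothetical mount point of the NFS datastore.
    for path, size in find_large_deltas("/mnt/datastore1"):
        print(f"{size / 1024**3:8.1f} GiB  {path}")
```

This only reports on what it finds. It does not replace a tool like RVTools, and the delta files themselves should always be removed through VMware, never by hand.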
"Yes, that's kind of harsh but they can cause lots of problems when people don't know what they're doing (or assume they work the same way as in VMware workstation)."
I agree with the majority of the post, but I was under the impression that VMware Workstation and ESX snapshots use the same key principles: 'nothing' happens when the snap is taken, but from that point on all changes to a VM are recorded in a redo log file instead of being written to the main vmdk file. The trouble starts when someone tries to get rid of a snapshot, which can put a lot of pressure on the hardware because the vmdk and redo files have to be consolidated...
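To make the redo log idea concrete, here is a toy model in Python. It is only an illustration of the principle described above, not how ESX or Workstation actually implement it: a disk is a dict of blocks, taking a snapshot just starts an empty redo log, and deleting the snapshot has to copy every changed block back into the base disk.

```python
from typing import Dict, Optional


class SnapshottedDisk:
    """Toy model of a disk with at most one snapshot (one redo log)."""

    def __init__(self, blocks: Dict[int, str]) -> None:
        self.base: Dict[int, str] = dict(blocks)
        self.redo: Optional[Dict[int, str]] = None  # None = no snapshot active

    def take_snapshot(self) -> None:
        # Cheap: nothing is copied, we only start an empty redo log.
        self.redo = {}

    def write(self, block: int, data: str) -> None:
        if self.redo is not None:
            self.redo[block] = data   # base disk stays frozen
        else:
            self.base[block] = data

    def read(self, block: int) -> str:
        # Newest data wins: check the redo log first, then fall back to base.
        if self.redo is not None and block in self.redo:
            return self.redo[block]
        return self.base[block]

    def delete_snapshot(self) -> int:
        """Consolidate the redo log into the base disk.

        Returns the number of blocks that had to be copied, which is
        the work that grows the longer a snapshot is kept around.
        """
        if self.redo is None:
            return 0
        copied = len(self.redo)
        self.base.update(self.redo)
        self.redo = None
        return copied
```

In this model taking the snapshot costs nothing, while `delete_snapshot` costs one copy per changed block, which is why an old snapshot with a huge redo log is painful to remove.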
Ok...I think you caught me here.
They do go to a redo log....and a VMware-level snapshot in ESX does immediately impose a performance hit on the VM. I didn't think that there was a performance hit in Workstation but I'm not finding anything to back that up.
So....I'm happy to revise the specifics for my dislike..but the dislike still remains.
I would like to ask you one thing. I am in a similar situation to the one you described (a snapshot of an Exchange server taken months ago and then forgotten). Last week, while trying to remove this snapshot (probably ~200GB or so), I don't know whether it timed out or what, but the VM stopped responding, then it became "invalid", and it needed to be re-initialized in VC. In the end we got that solved, but the snapshot still exists. And now we must remove it, to prevent such problems in the future. So my question is: if I shut down this VM (Exchange) and then remove all snapshots while it is powered off, it will probably take ~6 hours to complete, but in the end it should remove all the snapshot crap, right?
I noticed a nice command for watching the current status while a snapshot removal is running (run it on the ESX host in the folder where the VMDKs of the snapshotted VM are located):
watch "ls -oghut --full-time *.vmdk"
Thanks for any tips
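For reference, a rough Python equivalent of that watch command, in case you want to log the progress instead of staring at a terminal. This is a sketch under assumptions: it lists by modification time rather than access time, and the 2 second interval simply mirrors the default of `watch`.

```python
import glob
import os
import time
from typing import List, Tuple


def vmdk_status(directory: str) -> List[Tuple[str, int, float]]:
    """Return (name, size_bytes, mtime) for each .vmdk file in directory,
    most recently modified first."""
    rows = []
    for path in glob.glob(os.path.join(directory, "*.vmdk")):
        st = os.stat(path)
        rows.append((os.path.basename(path), st.st_size, st.st_mtime))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


def watch_vmdks(directory: str, interval_seconds: float = 2.0) -> None:
    """Print the status table repeatedly until interrupted with Ctrl-C."""
    while True:
        print(time.strftime("%Y-%m-%d %H:%M:%S"))
        for name, size, mtime in vmdk_status(directory):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
            print(f"{size:>15}  {stamp}  {name}")
        print()
        time.sleep(interval_seconds)


if __name__ == "__main__":
    watch_vmdks(".")  # run from the VM's folder on the datastore
```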
That would probably work....but to be honest if it's a critical VM, I'd engage VMware support (as this is more a VMware question than NetApp question).
Side-note: sometimes, if you see the files for a VMware snapshot in the file browser but don't see a snapshot to remove, you can add a new VMware snapshot and then remove it (this will often clean up the orphaned snapshot files).