2010-02-10 06:12 AM - edited 2015-12-18 01:36 AM
Is it possible to use SMVI v2 to perform VM consistent snapshots of VMs that have iSCSI LUNs mounted?
I am using vSphere (latest v) and SMVI v2. All my VMs are mounted on an NFS volume on my FAS2020.
All my Exchange, SQL and Sharepoint servers are using SnapManager for backups, and therefore have their data mounted on iSCSI initiator LUNs within the VMs.
Until recently I was creating hourly "non VM consistent" snaps and daily "VM consistent" snaps (according to: http://blogs.netapp.com/virtualization/2009/07/scheduling-smvi.html).
I encountered 2 main problems with this:
1) The daily VM consistent snapshots would always fail on 4-5 of my servers, randomly it seemed. The rest would snapshot OK.
2) I noticed that many of the volumes containing my iSCSI LUNs were rapidly filling up.
It seems that the random server snapshot failures all had one thing in common: the affected servers all had iSCSI LUNs (though why the snapshots sometimes worked and sometimes did not, I have no idea!).
It also seems that performing SMVI "VM consistent" snapshots of these servers causes a conflict with SnapManager, which also results in the iSCSI LUN (mainly the SnapInfo one in the case of SQL!) being snapped. This happens outside of the control of SnapManager. So in my case, after 1 week my SnapInfo vol reported full. When I checked the snapshots on this vol I could see 7 days' worth of SQL snaps (normal), but also 7 days of additional SMVI snapshots!! (very bad).
This happened on normal VM servers with iSCSI LUNs mapped and my SQL servers.
Since then I read somewhere that VMware do not support VM snapshots of VMs with Microsoft initiated iSCSI LUNs (which rules out half my VMs!) and therefore I've removed these servers from my backup.
Is there any way around this problem?
Longer term my NetApp partner is currently selling me a backup project whereby we will snapshot all VMs and then snapmirror them to a remote site for DR. The problem is that my most important VMs have iSCSI LUNs. My understanding according to the above article is that VM consistent snapshots are important as the VM is quiesced and the snapshots are clean. Therefore my backup idea is not good as basically I'd be mirroring "dirty snapshots" to my remote DR site. Ideally these snapshots should be clean, VM consistent ones (or am I over valuing the importance of this?).
2010-02-10 11:44 AM
According to VMware KB article #1009073, VMware Tools are unable to create quiesced
snapshots of virtual machines that have NPIV RDM LUNs or Microsoft iSCSI Software Initiator
LUNs mapped to them (this often results in timeout errors during snapshot creation). Therefore,
customers using the Microsoft iSCSI Software Initiator in the guest and running SMVI with VMware
snapshots turned on, which is not recommended, are at high risk of experiencing SMVI backup
failures due to snapshot timeouts caused by the presence of Microsoft iSCSI Software Initiator
LUNs mapped to the virtual machines.
VMware’s general recommendation is to disable both VSS components and the sync driver in VMware Tools
(which translates to turning off VMware snapshots for any SMVI backup jobs that include virtual machines
mapped with Microsoft iSCSI Software Initiator LUNs) in environments that include both Microsoft iSCSI
Software Initiator LUNs in the VM and SMVI, thereby reducing the consistency level of a virtual machine
backup to point-in-time consistency. However, by using SDW/SnapManager to back up the application data on the Microsoft iSCSI Software Initiator LUNs mapped to the virtual machine, the reduction in the data consistency level of the SMVI backup has no effect on the application data.
Another recommendation for these environments is to use physical mode RDM LUNs, instead of Microsoft
iSCSI Software Initiator LUNs, when provisioning storage in order to get the maximum protection level from
the combined SMVI and SDW/SnapManager solution: guest file system consistency for OS images using VSS-assisted SMVI backups, and application-consistent backups and fine-grained recovery for application data using the SnapManager applications.
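To make the recommendation above concrete, here is a minimal Python sketch of how one might sort VMs into two SMVI backup jobs: one with VMware snapshots turned on, and one with them turned off for VMs that have Microsoft iSCSI Software Initiator LUNs or NPIV RDM LUNs mapped. The `VM` class, its field names, and the job labels are hypothetical and only illustrate the rule; they are not part of SMVI or any NetApp or VMware API.

```python
from dataclasses import dataclass


@dataclass
class VM:
    # Hypothetical inventory record; fill it in from your own documentation.
    name: str
    has_guest_iscsi_luns: bool       # Microsoft iSCSI Software Initiator LUNs in the guest
    has_npiv_rdm_luns: bool = False  # NPIV RDM LUNs mapped to the VM


def plan_smvi_jobs(vms):
    """Group VM names by whether the SMVI job should use VMware snapshots.

    VMs with in-guest Microsoft iSCSI LUNs or NPIV RDM LUNs go into the
    job without VMware snapshots (point-in-time consistency only); their
    application data is assumed to be protected by SDW/SnapManager.
    """
    plan = {"vmware_snapshot_on": [], "vmware_snapshot_off": []}
    for vm in vms:
        if vm.has_guest_iscsi_luns or vm.has_npiv_rdm_luns:
            plan["vmware_snapshot_off"].append(vm.name)
        else:
            plan["vmware_snapshot_on"].append(vm.name)
    return plan


if __name__ == "__main__":
    inventory = [
        VM("web01", has_guest_iscsi_luns=False),
        VM("sql01", has_guest_iscsi_luns=True),
        VM("exch01", has_guest_iscsi_luns=True),
    ]
    print(plan_smvi_jobs(inventory))
```

The two lists then correspond to two separate SMVI backup jobs, so the VMs that cannot be quiesced no longer cause timeouts in the job that can.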
2010-02-11 03:06 AM
I must admit I missed this earlier:
VMware Tools are unable to create quiesced snapshots of virtual machines that have NPIV RDM LUNs or Microsoft iSCSI Software Initiator
LUNs mapped to them
Another recommendation for these environments is to use physical mode RDM LUNs,
In other words, if you run Windows VMs over iSCSI and use SMVI and SnapManager products, you really should have SnapDrive 6.2 in your environment, as earlier versions can't use anything other than the MS software iSCSI initiator, correct?
2010-02-11 03:18 AM
I'm not sure I understood your question. Even previous versions of SDW support Microsoft iSCSI software initiator LUNs. With 6.2 we get pass-through disk support.
2010-02-11 03:27 AM
OK, to make my question complete:
If you use SMVI and would like to have OS-consistent backups of VMs, and these VMs use some iSCSI RDMs (requiring SnapManager protection), then you need SnapDrive 6.2, as earlier SDW versions require the software iSCSI initiator, hence VMware snapshots cannot be taken, hence SMVI backups won't be OS-consistent.
Does it make sense now?
2010-02-11 06:44 AM
We have tried to explain app consistency in section 9 of the BPG
Could you please take a look?
2010-02-11 06:59 AM
I am not talking about application consistency here. Well, not mainly about this.
Again, there are two choices when using SMVI for virtual OS snapshots:
1) Without VMware snapshots, so OS images are / can be crash-consistent
2) With VMware snapshots, so OS images are in consistent state when NetApp snapshot is taken.
My point is:
If some of your VMs require at the same time SnapManager protection (say for Exchange), then you simply have to use MS iSCSI software initiator, unless you are on SnapDrive 6.2. And MS iSCSI software initiator (as per quoted VMware KB article) is a showstopper for VMware snaps so option 2) is not available.
Of course one may argue: "Why bother with OS consistency? It is like suddenly pulling the plug on a server, so a checkdisk is the worst that may happen 9 times out of 10." But this is an entire topic for another discussion.
2010-02-11 07:51 AM
SDW 6.1 supported the FCP HBA in the ESX server and the MS iSCSI initiator in the guest OS for its operations. Now with SDW 6.2 we support ESX iSCSI and iSCSI HBA initiators in SnapDrive. SnapDrive will perform LUN provisioning and snapshot management operations using ESX iSCSI and iSCSI HBA initiators. The KB from VMware does point out an issue with the configuration you described, but there are workarounds in place which hopefully will resolve any issues.
2010-02-18 07:08 AM
Some really interesting points made in this thread, but you've also raised a couple more questions regarding my setup and future use:
1) How do I establish ESX iSCSI LUNs within my VMs? I thought I had to use Microsoft iSCSI initiator LUNs for SnapManager, which may have been true before, but apparently now is not necessary? And until now I have only used MS initiator LUNs.
2) Should I consider converting all my VMs with MS iSCSI LUNs to ESX initiator LUNs, and is there any downside to this? (I have a bunch of Win 2003/8 servers running Exchange/SharePoint/SQL etc., all using SnapManager with MS iSCSI LUNs and all currently excluded from SMVI "VM snaps" because it's not supported; see original post.)
3) I am still experiencing problems with SMVI creating "VM snaps" of VMs which are running any form of SQL server. Even if the VMs don't have iSCSI initiator LUNs. The event logs of the servers give a VolSnap error:
"The flush and hold writes operation on volume C: timed out while waiting for a release writes command."
I have noticed this is happening only on servers which have an MS SQL or MS SQL Express engine running (unfortunately, quite a lot of my servers run apps that require a local MS SQL Express engine, e.g. BackupExec). SnapManager is not installed on these servers.
It was mentioned that VolSnap and VSS conflict and that one or the other should be disabled? Should I do this for these servers and if so how??
Lots of useful info here guys, but I am still a little confused as to how I implement a solution to perform clean VM snaps of all my servers.
2010-02-19 05:44 AM
1) Steps to configure the ESX software iSCSI initiator using the VI Client:
* Enable ESX iSCSI SW initiator in ESX server.
* Check License under Licensed Features of Configuration Tab.
* Add the VMKernel Port under the Networking configuration option. Enable VMotion.
* Under Security Profile select the port for the Software iSCSI client.
* Select the ESX SW iSCSI initiator and click the "Enabled" checkbox.
Then add the ESX software iSCSI targets to the discovery list using the VI Client:
* Navigate to the Properties of the iSCSI adapter and click on the Dynamic Discovery tab.
* Choose to Add new Target.
* Enter the IP address & port of the target server and click OK.
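Before typing the target into the Dynamic Discovery dialog, it can help to sanity-check the address. The following is a minimal Python sketch, not anything the VI Client or SnapDrive provides: `parse_target` and `DEFAULT_ISCSI_PORT` are hypothetical names, the sketch handles IPv4 only, and it assumes the target listens on the standard iSCSI port 3260 unless you give another one.

```python
import ipaddress

# 3260 is the IANA-registered iSCSI target port; assumed here as the default.
DEFAULT_ISCSI_PORT = 3260


def parse_target(text):
    """Parse 'IP' or 'IP:port' (IPv4 only) into an (ip, port) tuple.

    Raises ValueError if the address or the port is malformed.
    """
    host, sep, port_text = text.strip().partition(":")
    ip = ipaddress.IPv4Address(host)  # rejects malformed IPv4 addresses
    port = int(port_text) if sep else DEFAULT_ISCSI_PORT
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return str(ip), port


if __name__ == "__main__":
    # Example addresses are placeholders, not values from this thread.
    for candidate in ("192.168.1.10", "192.168.1.10:3261"):
        print(candidate, "->", parse_target(candidate))
```

A typo in the address is then caught before it ends up in the discovery list.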
2) If you're not having any specific problems with using MS iSCSI, then I personally don't see any reason to change it. If it's not broken...
Also refer to the following docs as they may have more on performance or general best practices.
The performance best practices :
NetApp and VMware VI3 best practice Guide :