Data Protection
Hello everyone!
I am working in an all-NetApp environment where SnapProtect v10 R2 SP9 is used for backups. Two NetApp systems serve as primary storage for a VMware environment. SnapProtect backs up entire datastores, and an auxiliary copy then creates SnapVault relationships to a third NetApp system acting as the backup target.
It works fine; however, new SnapVault relationships are sometimes created unexpectedly. For example, I have the following SnapVault relationships:
Source                                      Destination
primary01.company.com:/vol/DATASTORE01/-    backuptarget01:/vol/SP_DATASTORE01/SP_primary01_DATASTORE01
primary02.company.com:/vol/DATASTORE01/-    backuptarget01:/vol/SP_DATASTORE01/SP_primary02_DATASTORE01
So far so good. But at times either SnapProtect or the NetApp systems seem to create new SnapVault relationships that are exact duplicates of the existing ones, except that "_1" or "_2" is appended to the destination path. Example:
Source                                      Destination
primary01.company.com:/vol/DATASTORE01/-    backuptarget01:/vol/SP_DATASTORE01/SP_primary01_DATASTORE01
primary02.company.com:/vol/DATASTORE01/-    backuptarget01:/vol/SP_DATASTORE01/SP_primary02_DATASTORE01
10.10.10.10:/vol/DATASTORE01/-              backuptarget01:/vol/SP_DATASTORE01_1/SP_primary01_DATASTORE01
10.10.10.11:/vol/DATASTORE01/-              backuptarget01:/vol/SP_DATASTORE01_1/SP_primary02_DATASTORE01
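A rough way to spot such suffixed duplicates in a saved relationship listing is sketched below. It assumes the listing is plain text with one source and one destination per line, as in the example above. The function names are made up for illustration, and nothing here calls a NetApp or SnapProtect API.

```python
import re

# Matches destination volume names that end in "_<number>", e.g. "SP_DATASTORE01_1".
SUFFIX = re.compile(r"^(?P<base>.+)_(?P<n>\d+)$")


def parse(line):
    """Split 'source  host:/vol/<volume>/<qtree>' into (source, volume, qtree)."""
    source, destination = line.split()
    _host, path = destination.split(":", 1)
    _empty, _vol, volume, qtree = path.split("/", 3)
    return source, volume, qtree


def find_suffixed_duplicates(lines):
    """Return (source, suffixed_volume, base_volume) for each destination whose
    volume name is an existing destination volume plus a numeric suffix."""
    rows = [parse(line) for line in lines if line.strip()]
    existing = {(volume, qtree) for _, volume, qtree in rows}
    duplicates = []
    for source, volume, qtree in rows:
        match = SUFFIX.match(volume)
        if match and (match.group("base"), qtree) in existing:
            duplicates.append((source, volume, match.group("base")))
    return duplicates


listing = """
primary01.company.com:/vol/DATASTORE01/-    backuptarget01:/vol/SP_DATASTORE01/SP_primary01_DATASTORE01
primary02.company.com:/vol/DATASTORE01/-    backuptarget01:/vol/SP_DATASTORE01/SP_primary02_DATASTORE01
10.10.10.10:/vol/DATASTORE01/-              backuptarget01:/vol/SP_DATASTORE01_1/SP_primary01_DATASTORE01
10.10.10.11:/vol/DATASTORE01/-              backuptarget01:/vol/SP_DATASTORE01_1/SP_primary02_DATASTORE01
""".splitlines()

for source, volume, base in find_suffixed_duplicates(listing):
    print(f"{source} -> {volume} (looks like a duplicate of {base})")
```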
Has anyone experienced the same thing, or does anyone have a lead on the cause?
Cheers,
Erik
Do you have one or more VMs with VMDKs residing in both datastores? I.e., virtualdisk1 in datastore1 and virtualdisk2 in datastore2 of the same VM?
Hi Georgevj!
Thanks for the quick reply and the good suggestion. However, in this case I doubt that a VM has disks on all 8 datastores that are being protected and SnapVaulted.
I only gave a couple of examples in my first post, so I don't think that's the problem in this case. The chance of VMs having data disks on all 8 datastores is nonexistent.
Could this be an OnCommand Unified Manager issue?
Cheers,
Erik
The other factor that could cause this with use of SnapVault is if you have more than 1 datastore per NetApp volume.
Best practice for datastore or LUN to volume is 1-to-1.
To provide file-level replication of the LUNs (or, in this case, datastores) within a source volume and keep them separate, so that you can recover an entire datastore (or LUN) from that source volume, SnapVault datasets within OnCommand Unified Manager will appear duplicated, with _x appended to the name of the newly provisioned destination volume. That way, each datastore or LUN residing in the single source volume can be backed up separately to its own destination volume.
The previous reply in this discussion is also correct: backing up VMs whose VMDKs span multiple datastores will also cause OCUM to provision another destination volume in order to capture the additional datastore(s) for those VMDKs.
In such cases the "duplicate volumes" are necessary to provide the level of recovery expected. If you do not want this behavior to occur, please check the VMs being backed up and ensure that the VMDKs for the VMs in question all reside on the same datastore.
Also make sure that the datastore (or LUN) to volume mapping is a 1-to-1 ratio.
Hope this helps.
Hi!
To confirm: yes, there is only one datastore/LUN per volume. It could be that I have to open a support case for this. Also, this phenomenon seems to happen randomly rather than regularly.
/Erik
I believe the note above was referring to the disks belonging to the VMs themselves. If a VM spans more than one datastore, it could cause some issues with SP. Even with each volume holding one datastore, if a VM has a disk on two datastores, the snapshots would need to be coordinated between the two volumes instead of just one.
In response to the initial inquiry: I have seen duplicated volumes / SnapVault relationships being created within our SnapProtect environment as well. For me it has happened because the dataset is out of "conformance" for one reason or another. That is, the SnapVault relationship went stale because it was not being updated for some reason and the source removed the common snapshot. When the dataset goes out of conformance, SP will recreate the relationship using a new volume.
I would have a look at the health of the SnapVault relationships and check for any errors on existing datasets.
Hi Erik,
I'm not sure if you have a bug here or not but this issue can occur if any volume involved in the backups appears in multiple subclients or in multiple storage policies. As rwleshman explained above, this can happen if a VM has disks on multiple datastores, or if a VM somehow gets selected by multiple subclients.
It's worth understanding how the SnapProtect VM backup works in a bit of detail:
1. SnapProtect starts a job for a subclient. Depending on the "content" configuration, the target could be one or more VMs, Hosts, Clusters, Datacentres, Datastores or Datastore Clusters. (It is highly recommended that the subclient content be set to one datastore only, and it is best not to select the other options)
2. SnapProtect connects to the vCenter and enumerates all of the VMs in the selected object (hopefully, a datastore). It gets a list of the disks for those VMs and all of the datastores on which those disks reside (not just the datastore you selected in the subclient contents).
3. SnapProtect requests the vCenter make ESX snapshots of the VMs to put the VMDK files into a quiescent state.
4. SnapProtect finds all of the volumes that are associated with the VMs on the NetApp filer and creates a snapshot on each of the volumes. Later, these primary volume snaps will be the source of your SnapVault copies that are configured in your Storage Policy.
5. SnapProtect requests vCenter to release the ESX VM snapshots and resume writes to the main VMDK files.
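The discovery logic in steps 2 and 4 can be sketched roughly as follows. This is only a toy model of the behaviour described above, not SnapProtect's actual implementation. The function name, the VM and datastore names, and the rule that a VM is "in" a datastore when at least one of its disks lives there are all assumptions made for illustration.

```python
def volumes_to_snapshot(subclient_datastores, vm_disks, datastore_to_volume):
    """Return the sorted list of volumes a job would snapshot for one subclient.

    subclient_datastores: datastores selected as subclient content
    vm_disks:             VM name -> list of datastores holding its disks
    datastore_to_volume:  datastore name -> NetApp volume name
    """
    selected = set(subclient_datastores)
    # Step 2: VMs found in the selected datastores (assumed: any disk there counts).
    in_scope = [vm for vm, stores in vm_disks.items() if selected & set(stores)]
    # Step 2, continued: every datastore those VMs use, selected or not.
    touched = {ds for vm in in_scope for ds in vm_disks[vm]}
    # Step 4: each touched datastore maps to a volume that gets a snapshot.
    return sorted({datastore_to_volume[ds] for ds in touched})


# Hypothetical inventory: vm1 has disks on two datastores.
vm_disks = {
    "vm1": ["ds_a", "ds_b"],
    "vm2": ["ds_b"],
    "vm3": ["ds_c"],
}
datastore_to_volume = {"ds_a": "vol_a", "ds_b": "vol_b", "ds_c": "vol_c"}

# Only ds_a is selected, but vm1 also has a disk on ds_b, so vol_b is included too.
print(volumes_to_snapshot(["ds_a"], vm_disks, datastore_to_volume))
# prints ['vol_a', 'vol_b']
```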
As an example, let's say you have three different datastores on three different volumes, like this:
You have two VMs with two disks each:
VM #1: oraclehost01:
VM #2: exchangeserver01:
You create two VM backup subclients in SnapProtect:
When the backup for Subclient 1 runs, it will find a VM that has one disk on "vol2_sas_oracledata" and another disk on "vol1_sata_osdisks". On the NetApp storage, it will create a snapshot on both of these volumes - even though your subclient contents contains only "vol2_sas_oracledata". These two snaps will then be the source of your SnapVault relationships for this subclient.
When the backup for Subclient 2 runs, it will find a VM that has one disk on "vol3_sas_exchange" and another disk on "vol1_sata_osdisks". SnapProtect will create a snapshot on "vol3_sas_exchange", and another snapshot on "vol1_sata_osdisks". When the SnapVault copy runs, these snapshots will become the source of two additional SnapVault relationships. The SnapVault relationship for Subclient 1 won't be re-used by Subclient 2. Hence, you'll get a second SnapVault destination volume provisioned by Unified Manager for vol1_sata_osdisks.
This would also occur if you'd selected only individual VMs (not datastores) in the subclient contents - you'll get a separate set of SnapVault secondary volumes for each new subclient you create. Subclients referencing the same volume(s) will not share primary volume snapshots, and they will not share SnapVault/SnapMirror destination volumes. And the VM protection jobs will find all of the volumes that each VM is using, not just the individual datastore(s) you select.
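To make the effect concrete, here is a small Python model of the example above. It is a sketch, not anything SnapProtect runs: it assumes one SnapVault relationship per (subclient, source volume) pair, as described, and the assignment of oraclehost01 to Subclient 1 and exchangeserver01 to Subclient 2 is inferred from the volume names. The function names are hypothetical.

```python
from collections import Counter

# Volumes holding each VM's disks, taken from the example above.
vm_volumes = {
    "oraclehost01": ["vol1_sata_osdisks", "vol2_sas_oracledata"],
    "exchangeserver01": ["vol1_sata_osdisks", "vol3_sas_exchange"],
}

# VMs each subclient ends up protecting (assumed mapping).
subclients = {
    "Subclient 1": ["oraclehost01"],
    "Subclient 2": ["exchangeserver01"],
}


def snapvault_relationships(subclients, vm_volumes):
    """One relationship per (subclient, source volume); nothing is shared."""
    relationships = set()
    for subclient, vms in subclients.items():
        for vm in vms:
            for volume in vm_volumes[vm]:
                relationships.add((subclient, volume))
    return relationships


def volumes_with_multiple_destinations(relationships):
    """Source volumes that end up with more than one SnapVault destination."""
    counts = Counter(volume for _, volume in relationships)
    return {volume: n for volume, n in counts.items() if n > 1}


relationships = snapvault_relationships(subclients, vm_volumes)
print(len(relationships))
# prints 4
print(volumes_with_multiple_destinations(relationships))
# prints {'vol1_sata_osdisks': 2}
```

Running the same check against your own subclient-to-volume mapping would show which source volumes are likely to get a second destination volume.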
For these reasons it's strongly suggested that you:
If you're already doing all of these things and you're still getting random new SnapVault secondaries, then you might indeed have a bug, so open a case and let us know the outcome (I'd be curious to know!)
One other thing - you might be tempted to use the datastore filter function to filter out certain datastores. Be aware that there's a bug with this if you're using the backup/classic copies (i.e. tape copies): A VM that has any disk filtered in the snapbackup phase will not get copied in the backup/classic copy - SnapProtect erroneously reports "all disks were filtered" during the backup copy job and then skips that VM. Support haven't given me an ETA on a fix, only that it's a known issue.
