Data ONTAP Discussions

Best Practice of Deduplication with Snapvault


I have a customer (IHAC) that would like to implement deduplication with SnapVault on both the source and destination volumes.

They will take 24 hourly snapshots on the source volume for immediate restore, and run SnapVault once nightly to the destination filer. In order to reduce the volume size on both the source and target filers, they will enable deduplication and start it on Saturday. Here are the questions:

1. What is the best practice to do the dedupe with snapvault?

2. As dedupe with snapshots will decrease the efficiency, they would like to delete all the snapshots on the source filer except the SnapVault baseline before doing the dedupe. Any issue with this?

3. Any other suggestions?




Re: Best Practice of Deduplication with Snapvault

SnapVault (and Qtree SnapMirror) will rehydrate the data on transfer, so dedup on the source will not change the size of the data transferred.  If the snap delta between updates is 10GB that dedups to 5GB, then 10GB will go over the wire with SnapVault/QSM (5GB would transfer with Volume SnapMirror).  It is not a thin copy like Volume SnapMirror.  So, dedup needs to run on both the source and the target to get dedup on both sides.  The good news is that with ONTAP 7.3+, ONTAP handles dedup after the vault update by recreating the vault snapshot after the update and dedup, so the space savings aren't held by the snapshot.
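
To make the 10GB vs 5GB point concrete, here is a rough sketch of the arithmetic in Python. The function name and the "savings fraction" parameter are made up for illustration; it only models the rule described above (logical transfers are rehydrated, Volume SnapMirror keeps the savings) and is not a sizing tool.

```python
def wire_transfer_gb(logical_delta_gb, dedup_savings_fraction, method):
    """Estimate GB sent over the wire for one update (illustrative only).

    method: "snapvault" or "qsm" -> logical transfer, data is rehydrated
            "vsm"                -> block-level transfer, dedup savings kept
    dedup_savings_fraction: assumed share of the delta removed by dedup (0.5 = 50%)
    """
    if not 0.0 <= dedup_savings_fraction < 1.0:
        raise ValueError("dedup_savings_fraction must be in [0, 1)")
    if method in ("snapvault", "qsm"):
        # Rehydrated on transfer: the full logical delta crosses the wire.
        return float(logical_delta_gb)
    if method == "vsm":
        # Volume SnapMirror ships deduplicated blocks.
        return logical_delta_gb * (1.0 - dedup_savings_fraction)
    raise ValueError("unknown method: %r" % method)


for m in ("snapvault", "qsm", "vsm"):
    print(m, wire_transfer_gb(10, 0.5, m))
```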

Re: Best Practice of Deduplication with Snapvault

Hi Scott,

Thanks for the info. The customer has no concern about the data transfer as they have a big trunk. However, they would like to know if they have to manually delete all the snapshots every Saturday and then start the dedupe; that would require a lot of human attention. So, is there a better way or a best-practice procedure to satisfy this requirement, i.e.:

1. Keep 24 hourly snapshots for the source volume Mon-Fri, and SnapVault to the destination filer once every night.

2. Perform dedupe every Saturday to reduce the source filer's size.



Re: Best Practice of Deduplication with Snapvault

They could delete snapshots prior to the first dedup, but most customers let the snapshots roll off on their own; the space savings show up once the snapshot schedule has rolled off the older snapshots.  After the initial dedup (sis start -s when there is existing data in the volume), the subsequent A-SIS runs are usually fast, so many customers leave the default nightly dedup schedule, but switching to weekly makes sense too.  The target vault won't be affected and will dedup separately, which can be a good benchmark to compare against the source on its own dedup schedule.
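
For what it's worth, the weekly setup could be scripted along these lines. This is only a sketch in Python that prints the 7-Mode sis commands for one volume; the volume name vol1, the sat@0 schedule (Saturday at midnight) and the helper name are assumptions for the example, so check the syntax against the sis man page for your ONTAP release before running anything on a filer.

```python
def weekly_dedup_commands(volume, schedule="sat@0", scan_existing=True):
    """Return the 7-Mode 'sis' command lines for a weekly dedup setup.

    volume:        volume name (hypothetical, e.g. "vol1")
    schedule:      sis schedule string, day_list[@hour_list]; "sat@0" is an
                   assumed choice meaning Saturday at 00:00
    scan_existing: include the one-time 'sis start -s' scan of existing data
    """
    path = "/vol/" + volume
    commands = [
        "sis on " + path,                             # enable dedup on the volume
        "sis config -s %s %s" % (schedule, path),     # replace the default nightly schedule
    ]
    if scan_existing:
        # Initial run only: scans data already in the volume.
        commands.append("sis start -s " + path)
    return commands


for line in weekly_dedup_commands("vol1"):
    print(line)
```

No snapshot deletion step is included, in line with letting the snapshots roll off on their own.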

Re: Best Practice of Deduplication with Snapvault

Happy New Year

We run a similar setup here.  We keep snapshots and SnapMirrors that we would use to recover production data: between 2 and 3 weeks' worth, depending on the application.  The biggest problem with SIS is that our SnapVault destination is a FAS3140, which has a SIS limit of 4TB.  Most weekly snapvaults have hit this limit and we have been unable to continue deduping them.  Newer versions of Data ONTAP (DoT) do not require the data to be rehydrated once this limit is reached, but it is still an issue.  A little bird tells me that "next" releases of DoT will remove the SIS limit (size of aggregate), which will be a big step forward.

We use SME, SMVI and SMSQL to manage the snapshots and snapvault via a script which is called at the end of the snapmanager jobs.

For CIFS shares we have set the snapvault snap sched and use this to manage the creation of snapshots and their aging / removing.

Be sure to use an up-to-date version of DoT when using SIS, as there were many bugs in the early versions.

Hope this helps



Re: Best Practice of Deduplication with Snapvault

Cool. With 8.0.1 7-Mode, the A-SIS limit for all platforms (that support 8.x) is 16TB.
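
A quick way to sanity-check vault volumes against those limits, as a small Python sketch. The two figures are the ones quoted in this thread (4TB for the FAS3140 on the older release, 16TB with 8.0.1 7-Mode); the dictionary keys, the helper name and the example sizes are made up, and the real limits depend on platform and release.

```python
# Dedup volume size limits in TB, as quoted in this thread (not an official table).
DEDUP_LIMIT_TB = {
    "fas3140_older_release": 4,
    "8.0.1_7mode": 16,
}


def within_dedup_limit(volume_size_tb, limit_key):
    """True if a volume of this size is at or under the quoted dedup limit."""
    return volume_size_tb <= DEDUP_LIMIT_TB[limit_key]


# Hypothetical vault volume sizes in TB.
for size_tb in (3, 5, 16):
    print(size_tb,
          within_dedup_limit(size_tb, "fas3140_older_release"),
          within_dedup_limit(size_tb, "8.0.1_7mode"))
```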


Re: Best Practice of Deduplication with Snapvault


Thanks for the useful info.