SnapVault deduplication

With Data ONTAP 7.3 and above, does the data remain deduplicated when replicating to the SV secondary device?  I had thought that prior to 7.3 the data would be expanded across the wire and on the secondary device.

Re: SnapVault deduplication

Hi there,

7.3 does not *keep* the data deduplicated with SnapVault.   SnapVault remains a "logical" replication, so if you had primary data that had been dedupe'd, then it would get "rehydrated" in the transfer across the wire to the secondary.

Prior to 7.3, you could only run dedupe on the secondary once, after the first transfer.

With 7.3, the new piece is that you are allowed to leave single instancing running for the secondary volume.  Data sent is still "rehydrated" in the SnapVault process of identifying changed/new blocks.   After the first transfer, the dedupe process will be run on that secondary data.   Then for every subsequent SnapVault transfer, the dedupe process will run again on the secondary, crunching that data down to return you more space :-)
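As a rough illustration of "rehydrate on the wire, dedupe again on the secondary", here is a toy Python model. It is only a sketch: the block contents, the 4 KiB block size, and the helper names `dedupe` and `logical_transfer` are made up for the example and are not how Data ONTAP implements any of this.

```python
import hashlib

BLOCK = 4096  # assumed block size for this toy model


def dedupe(blocks):
    """Keep one physical copy per unique block, keyed by a content fingerprint."""
    return {hashlib.sha256(b).hexdigest(): b for b in blocks}


def logical_transfer(logical_blocks):
    """A logical replication sends every logical block, duplicates included."""
    return list(logical_blocks)


# Hypothetical primary data: 6 logical blocks, only 2 of them distinct.
primary_logical = [b"A" * BLOCK, b"B" * BLOCK, b"A" * BLOCK,
                   b"A" * BLOCK, b"B" * BLOCK, b"A" * BLOCK]

primary_physical = dedupe(primary_logical)      # deduplicated on the primary
sent = logical_transfer(primary_logical)        # "rehydrated" across the wire
secondary_physical = dedupe(sent)               # dedupe runs again on the secondary

primary_stored = sum(len(b) for b in primary_physical.values())
wire_bytes = sum(len(b) for b in sent)
secondary_stored = sum(len(b) for b in secondary_physical.values())

print("stored on primary:  ", primary_stored)
print("sent over the wire: ", wire_bytes)
print("stored on secondary:", secondary_stored)
```

In this model the wire carries all six blocks even though the primary only stores two, and the secondary gets back down to two once its own dedupe pass has run.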

The online backup docs describe it: see http://now.netapp.com/NOW/knowledge/docs/ontap/rel731/html/ontap/onlinebk/protecting/concept/c_oc_prot_dedupe-with-snapvault.html for example

Cheers

Mike

Re: SnapVault deduplication

Just wanted to say thanks for the very clear explanation -- it is the clearest and most detailed explanation I've found so far (and, ironically, it was easier to find on the Communities than on NOW).

Re: SnapVault deduplication

mwalters wrote:

After the first transfer, the dedupe process will be run on that secondary data.   Then for every subsequent SnapVault transfer, the dedupe process will run again on the secondary, crunching that data down to return you more space :-)

The online backup docs describe it: see http://now.netapp.com/NOW/knowledge/docs/ontap/rel731/html/ontap/onlinebk/protecting/concept/c_oc_prot_dedupe-with-snapvault.html for example

Cheers

Mike

Just to clarify: what you describe already happens on 7.2. The only added functionality in 7.3 compared to 7.2 is that you get your dedupe savings faster, because of the following new step:

"A new Snapshot copy replaces the archival Snapshot copy after deduplication has finished running on the destination. (The name of this new Snapshot copy is the same as that of the archival copy, but the creation time of this copy is changed)."

In 7.2 your space saving was locked in the archival snapshot. Now a new archival snapshot is created after the dedupe process and space savings are freed up.
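To make the "locked in the archival snapshot" point concrete, here is a small Python sketch. It is a toy model under my own assumptions: blocks are just integer IDs, the snapshot name `sv_nightly.0` is hypothetical, and `used_blocks` / `dedupe_active` are illustrative helpers, not anything from Data ONTAP.

```python
def used_blocks(active, snapshots):
    """Count physical blocks still referenced by the active file system or any snapshot."""
    used = set(active)
    for snap in snapshots.values():
        used |= snap
    return len(used)


def dedupe_active(active, duplicates_of):
    """Repoint duplicate blocks in the active file system to their canonical copy."""
    return {duplicates_of.get(block, block) for block in active}


# After a transfer the secondary holds 6 blocks; 3, 4 and 6 duplicate block 1,
# and 5 duplicates block 2 (hypothetical layout).
active = {1, 2, 3, 4, 5, 6}
duplicates_of = {3: 1, 4: 1, 5: 2, 6: 1}

# The archival snapshot is taken when the transfer completes, before dedupe runs.
archival = frozenset(active)
active = dedupe_active(active, duplicates_of)

# 7.2-style: the pre-dedupe archival snapshot stays, so it still pins all 6 blocks.
used_72 = used_blocks(active, {"sv_nightly.0": archival})

# 7.3-style: a new snapshot with the same name replaces it after dedupe finishes.
used_73 = used_blocks(active, {"sv_nightly.0": frozenset(active)})

print("blocks in use, old archival snapshot kept:", used_72)
print("blocks in use, archival snapshot replaced:", used_73)
```

The duplicates are gone from the active file system in both cases; the difference is only whether a snapshot taken before dedupe is still holding on to them.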

Re: SnapVault deduplication

(I'm slow catching up on the forums here!)

Agreed. However, in 7.2 the dedupe metadata was also kept inside the volume, so if you did integrate dedupe with SnapVault on the secondary, you might find the savings gained on subsequent transfers outweighed by the metadata. Remember that SnapVault is already incredibly efficient!

For a 7.2 deployment, I would personally recommend only deduping after the first transfer: that is where you (should) get your biggest percentage savings. 

From 7.3, that metadata lives outside the volume and hence will not consume an increasing amount of space :-)
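The trade-off is simple arithmetic, sketched below in Python. All the figures are invented purely for illustration (they are not measurements or documented overheads), and `net_savings` is a hypothetical helper.

```python
def net_savings(dedupe_saved_gb, metadata_gb, metadata_in_volume):
    """Space actually returned to the volume after accounting for dedupe metadata."""
    if metadata_in_volume:
        return dedupe_saved_gb - metadata_gb
    return dedupe_saved_gb


# Hypothetical numbers: a big win on the baseline, a small win on an incremental.
baseline_saved, incremental_saved, metadata = 200, 5, 8

# 7.2-style (metadata inside the volume)
baseline_72 = net_savings(baseline_saved, metadata, metadata_in_volume=True)
incremental_72 = net_savings(incremental_saved, metadata, metadata_in_volume=True)

# 7.3-style (metadata outside the volume)
incremental_73 = net_savings(incremental_saved, metadata, metadata_in_volume=False)

print(baseline_72, incremental_72, incremental_73)
```

With numbers like these, the baseline dedupe is clearly worth it either way, while an incremental pass can come out negative when the metadata shares the volume.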

Cheers

Mike