SnapMirror Replication Size with Dedupe

Hello everyone,

I have a customer that is SnapMirroring between site A and site B over a T1, and we are trying to reconcile the amount of data being replicated. They are running 7.3.2 at both sites, with a 3140 at the primary and a 2040 at the secondary. Dedupe is running at the primary.

If we add up the snapshots taken between replication cycles, the total is about 25% (sometimes even less) of what SnapMirror is reporting. I looked at Kb vs. KB, etc., and can't work out where the difference is coming from. The one thing I found is that as of 7.3, the metadata associated with dedupe is kept at the aggregate level, not within the volume itself. The online backup guide also says that you should dedupe the secondary as well as the primary. So does this mean that rehydration is happening during SnapMirror replication, and is that why the transfer shows as larger than what's inside the snapshots?
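To make the comparison concrete, here is a minimal Python sketch of the arithmetic. The figures are made-up placeholders rather than the customer's real numbers, and `reconcile` is just an illustrative helper name. It sums the snapshot sizes for one replication cycle, divides by the transfer size SnapMirror reports, and checks whether the gap is close to a factor of 8, which is what a kilobit vs. kilobyte mix-up would look like.

```python
def reconcile(snapshot_kb, reported_transfer_kb, tolerance=0.02):
    """Compare summed snapshot sizes with the transfer size SnapMirror reports.

    Both inputs are assumed to be in the same unit (KB here).
    Returns (snapshot_total_kb, ratio, units_explain_gap).
    """
    snapshot_total_kb = sum(snapshot_kb)
    ratio = snapshot_total_kb / reported_transfer_kb
    # A pure kilobit vs. kilobyte mix-up would put the ratio near 1/8 or 8.
    units_explain_gap = any(abs(ratio - r) < tolerance for r in (0.125, 8.0))
    return snapshot_total_kb, ratio, units_explain_gap


# Hypothetical values for illustration only.
snapshots_kb = [120_000, 80_000, 50_000]   # snapshots between two transfers
reported_kb = 1_000_000                    # size reported for the transfer

total, ratio, units = reconcile(snapshots_kb, reported_kb)
print(f"snapshots: {total} KB, ratio to transfer: {ratio:.2f}, "
      f"explained by Kb vs KB: {units}")
```

With a ratio around 0.25, as in our case, a simple unit mix-up does not account for the difference.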

Any assistance would be appreciated, thanks everyone!

Chris

Re: SnapMirror Replication Size with Dedupe

Chris,

If you are using VOLUME SnapMirror, your volume will be transferred block by block in its deduped state (thus saving bandwidth), and you do not need to dedupe it at the secondary.

If you are using QTREE SnapMirror and/or SnapVault, the transfer is file based, so the complete undeduped files will be transferred and you need to dedupe again at the secondary.

Kind regards

Thomas

Re: SnapMirror Replication Size with Dedupe

Sorry, I should have mentioned that I'm using volume SnapMirror. I agree that it should replicate deduped data, but according to the online backup manual, as of 7.3 dedupe savings are not included in replication. Excerpt from page 111:

Starting with Data ONTAP 7.3 and onward, the deduplication metadata for a volume is placed outside
the volume, at the aggregate level. This can improve the space savings achieved through the use of
deduplication.
When replicating data using volume SnapMirror, the deduplication metadata for the volume is not
replicated along with the volume. The data in the volume is usable both on the source and the destination.
To achieve maximum space savings on the destination volume, scan the entire file system to recreate
the deduplication metadata for the destination volume. Use the sis start -s command to do so.

So with this in mind, is this why there's a difference in the size of my replications versus what's in snapshots?

Thanks,

Chris