I ran dedup on a 1TB volume that was 94% full. After the process completed, it showed 700GB of space saved ("volume efficiency show -vserver xx -volume xx -fields dedupe-space-saved"). However, when I run "vol show" on the volume, it still shows 94% full.
What steps do I need to go through in order to get the space back?
1. Is the volume thin provisioned (space guarantee = none)? If not, you don't get the space back.
2. If the volume contains a LUN, is the LUN also thin provisioned, with fractional reserve set to 0%?
3. Are there any snapshots on the volume? Snapshots that existed before a dedup cycle continue to hold all the space that existed before the dedup, because that is what snapshots do. As those snapshots are deleted, they release that space.
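If it helps, the three checks above can be run from the clustershell; "vs1" and "vol1" are placeholder names for your own vserver and volume:

```
::> volume show -vserver vs1 -volume vol1 -fields space-guarantee, percent-used, fractional-reserve
::> lun show -vserver vs1 -volume vol1 -fields space-reserve
::> volume snapshot show -vserver vs1 -volume vol1
```

The first command covers points 1 and 2 at the volume level, the second shows whether any LUNs are space-reserved, and the third lists the snapshots that may be pinning pre-dedup blocks.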
1. Yes, it is thin provisioned, Space Guarantee in Effect: true
2. It is NFS volume, no LUN
3. The only snapshot is the one created by SnapMirror. Does this snapshot count as well? And is that why I could not get the space back before this snapshot is removed, which will not happen until the SnapMirror relationship is deleted. Correct?
In that case, if we schedule the dedup process to run before the snapshot is taken, would that get the deduped space back?
Yup - the snapshot used for the SnapMirror relationship counts; any snapshot counts. Snapshots by design preserve blocks as they were, even if those blocks are later freed by anything (deleting files, dedup, compression, etc.).
Now, you will get the space back during the next SnapMirror update. Each update creates a new snapshot of the volume as it is now (the active, deduplicated blocks); the mirror process analyzes the difference between the old and new snapshots and informs the destination that the old blocks are no longer needed. The old snapshot is then deleted, and the space is recovered on both ends of the mirror.
There is no need to delete the mirror relationship to get the space back.
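As a sketch (the destination path and names are placeholders), a manual update that rolls the SnapMirror snapshot forward and lets you watch the space come back:

```
::> snapmirror update -destination-path vs2:vol1_dm
::> volume snapshot show -vserver vs1 -volume vol1
::> df -S -vserver vs1 -volume vol1
```

After the update completes, the old base snapshot should be gone from the snapshot list and the used space on the source should drop accordingly.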
Now, why did I not get any space back when I enabled postprocess compression on the deduplicated volume? For what type of data does compression work better even after deduplication? Or is compression usually not needed if dedup worked well?
One thing to remember when you add dedup and compression after volume creation: they have to be explicitly told to run against all the existing data in the volume. Otherwise each subsequent space efficiency run only processes data written since the last run.
Once you ran dedup (perhaps using the "-scan-old-data true" option), it processed all active data. If you later enable compression and just run another space efficiency pass, the already-processed data does not get processed again; you have to use "-scan-old-data true" again to reprocess the deduplicated data with compression.
Personally I use a volume creation process that automatically sets and runs the initial space efficiency scan at volume creation time on any volume likely to use dedup or compression, so that from then on all runs can be normal incremental updates. Going back to reprocess old data by repeatedly starting full scans can be incredibly time consuming. If you are considering adding both dedup and compression to existing volumes, consider strongly whether you can add both at the same time so you need only do one full scan up front. Of course, you need to understand your data well enough to know which option is the most likely to be of benefit.
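A minimal sketch of enabling both features and doing the one-time full scan in a single pass (vserver and volume names are placeholders):

```
::> volume efficiency on -vserver vs1 -volume vol1
::> volume efficiency modify -vserver vs1 -volume vol1 -compression true
::> volume efficiency start -vserver vs1 -volume vol1 -scan-old-data true
::> volume efficiency show -vserver vs1 -volume vol1 -fields state, progress
```

The last command lets you monitor the scan; once it finishes, the scheduled runs only need to handle new writes incrementally.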
In the case of the volume you cited in the original post, dedup savings of 70% indicate highly redundant data. Compression may also help, especially if the duplicate data is in large chunks, since compression works on 32K compression groups rather than at the 4K block level. Of course, it's also possible that the data is highly redundant but just not very compressible - I've run into that plenty of times with archived image and sound data, where the individual files are already compressed by their format (JPG/PNG/etc.) but multiple copies of the same files exist in the structure.
The "volume efficiency start" command you ran in this example will do both a compression pass and a dedup pass. When you turn space efficiency on for a volume, dedup is automatic. Compression is optional, and enabling it necessarily includes dedup.
Yup - if the display shows no space saved, there wasn't any compressible data in the volume you are working with.
As a convenience, my preference for displaying the data remains the old 7-Mode-style "df" command: "df -S -g -vserver vs1 -volume vol1", since it gives you all the space-savings details in one shot. Every once in a while there's a better hold-over from 7-Mode.
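A similar savings breakdown is also available through "volume show"; a sketch, assuming these field names are present in your ONTAP release:

```
::> volume show -vserver vs1 -volume vol1 -fields dedupe-space-saved, compression-space-saved, sis-space-saved
```

This keeps everything in the clustered command set if you prefer to avoid the 7-Mode hold-overs.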
There was an option, "-snapshot-blocks true", which as I understood it should compress blocks locked in snapshots. However, the option did not work; the error message was "invalid argument". I used the "-b" option instead, and it was accepted. According to TR-42369, it should do the same as "-snapshot-blocks true". Is that true?
The following command would not work either, even though according to TR-3966 it should.