I ran dedup on a 1TB volume that was 94% full. After the process completed, it showed 700GB of space saved ("volume efficiency show -vserver xx -volume xx -fields dedupe-space-saved"). However, when I run "vol show" on the volume, it still shows 94% full.
What steps do I need to go through in order to get the space back?
1. Is the volume thin provisioned (space guarantee = none)? If not, you don't get the space back.
2. If the volume contains a LUN, is the LUN also thin provisioned, with fractional reserve set to 0%?
3. Are there any snapshots on the volume? Snapshots that existed before a dedup cycle continue to hold all the space that existed before the dedup, because that is what snapshots do. As those snapshots are deleted, they release that space.
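If it helps, the three checks above can be run from the clustershell; "vs1" and "vol1" are placeholder names for your own vserver and volume:

```
::> volume show -vserver vs1 -volume vol1 -fields space-guarantee, percent-used, fractional-reserve
::> lun show -vserver vs1 -volume vol1 -fields space-reserve
::> volume snapshot show -vserver vs1 -volume vol1
```

The first command covers points 1 and 2 at the volume level, the second shows whether any LUNs are space-reserved, and the third lists the snapshots that may be pinning pre-dedup blocks.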
1. Yes, it is thin provisioned, Space Guarantee in Effect: true
2. It is NFS volume, no LUN
3. The only snapshot is the one created by SnapMirror. Does this snapshot count as well? And is that why I could not get the space back before this snapshot is removed, which will not happen until the SnapMirror relationship is deleted. Correct?
In that case, if we schedule the dedup process to run before the snapshot is taken, would that get the deduped space back?
Yup - the snapshot used for the SnapMirror relationship counts; any snapshot counts. Snapshots by design preserve blocks as they were, even if those blocks are later freed by anything (deleting files, dedup, compression, etc.).
Now, you will get the space back during the next SnapMirror update. Each update creates a new snapshot of the volume as it is now (the active, deduplicated blocks); the mirror process analyzes the difference between the old and new snapshots and informs the destination that the old blocks are no longer needed. The old snapshot is then deleted, and the space is recovered on both ends of the mirror.
There is no need to delete the mirror relationship to get the space back.
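As a sketch (the destination path and names are placeholders), a manual update that rolls the SnapMirror snapshot forward and lets you watch the space come back:

```
::> snapmirror update -destination-path vs2:vol1_dm
::> volume snapshot show -vserver vs1 -volume vol1
::> df -S -vserver vs1 -volume vol1
```

After the update completes, the old base snapshot should be gone from the snapshot list and the used space on the source should drop accordingly.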
Now, why did I not get any space back when I enabled postprocess compression on the deduplicated volume? For what type of data does compression work better even after deduplication? Or is compression usually not needed if dedup worked well?
One thing to remember when you add dedup and compression after volume creation: they have to be explicitly told to run against all the existing data in the volume. Otherwise each subsequent space efficiency run only processes data written since the last run.
Once you ran dedup (perhaps using the "-scan-old-data true" option), it processed all active data. If you later enable compression and just run another space efficiency pass, the already-processed data does not get processed again; you have to use "-scan-old-data true" again to reprocess the deduplicated data with compression.
Personally I use a volume creation process that automatically sets and runs the initial space efficiency scan at volume creation time on any volume likely to use dedup or compression, so that from then on all runs can be normal incremental updates. Going back to reprocess old data by repeatedly starting full scans can be incredibly time consuming. If you are considering adding both dedup and compression to existing volumes, consider strongly whether you can add both at the same time so you need only do one full scan up front. Of course, you need to understand your data well enough to know which option is the most likely to be of benefit.
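A minimal sketch of enabling both features and doing the one-time full scan in a single pass (vserver and volume names are placeholders):

```
::> volume efficiency on -vserver vs1 -volume vol1
::> volume efficiency modify -vserver vs1 -volume vol1 -compression true
::> volume efficiency start -vserver vs1 -volume vol1 -scan-old-data true
::> volume efficiency show -vserver vs1 -volume vol1 -fields state, progress
```

The last command lets you monitor the scan; once it finishes, the scheduled runs only need to handle new writes incrementally.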
In the case of the volume you cited in the original post, dedup savings of 70% indicate highly redundant data. Compression may also help, especially if the duplicate data is in large chunks, since compression works on 32K compression groups rather than at the 4K block level. Of course, it's also possible that the data is highly redundant but just not very compressible - I've run into that plenty of times with archived image and sound data, where the individual files are already compressed by their format (JPG/PNG/etc.) but multiple copies of the same files exist in the structure.
The "volume efficiency start" command you ran in this example will do both a compression pass and a dedup pass. When you turn space efficiency on for a volume, dedup is automatic. Compression is optional, and enabling it necessarily includes dedup.
Yup - if the display shows no space saved, there wasn't any compressible data in the volume you are working with.
As a convenience, my preference for displaying the data remains the old 7-Mode-style "df" command: "df -S -g -vserver vs1 -volume vol1", since it gives you all the space-savings details in one shot. Every once in a while there's a better hold-over from 7-Mode.
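A similar savings breakdown is also available through "volume show"; a sketch, assuming these field names are present in your ONTAP release:

```
::> volume show -vserver vs1 -volume vol1 -fields dedupe-space-saved, compression-space-saved, sis-space-saved
```

This keeps everything in the clustered command set if you prefer to avoid the 7-Mode hold-overs.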
There was an option, "-snapshot-blocks true", which as I understood it should compress blocks locked in snapshots. However, the option did not work; the error message was "invalid argument". I used the "-b" option instead, and it was accepted. According to TR-42369, it should do the same as "-snapshot-blocks true". Is that true?
The following command would not work either, even though according to TR-3966 it should.