What's your experience with inline dedupe/compression (cDOT 8.3.2)?

klmi · ‎2016-06-28

Hi all,

I just ran some tests on our AFF8060A (ONTAP 8.3.2P2) with inline dedupe/compression.

Our target volume, holding about 100 VMs, is 4TB in size and is connected over NFS.

We created a new 4TB volume on our AFF8060,
enabled inline dedupe/compression,
and started vMotions of the 100 VMs to the new volume.

With inline dedupe/compression enabled (no offline dedupe runs), we get the following savings: 1.4:1

cluster1::> df -h -S vsds203
Filesystem      used    total-saved  %total-saved  deduplicated  %deduplicated  compressed  %compressed  Vserver
/vol/vsds203/   2440GB  1082GB       31%           359GB         10%            723GB       21%          vserver1

After we started a manual (offline) efficiency run (efficiency start -vserver vserver1 -volume vsds203), we get much better savings: 3:1

cluster1::> df -h -S vsds203
Filesystem      used    total-saved  %total-saved  deduplicated  %deduplicated  compressed  %compressed  Vserver
/vol/vsds203/   1167GB  2352GB       67%           1630GB        46%            722GB       21%          vserver1
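As a quick sanity check, the two ratios quoted in this post can be recomputed from the used and total-saved columns. This is only a sketch: it assumes the ratio is (used + total-saved) / used and that %total-saved is total-saved / (used + total-saved), which matches the figures shown here. The function name `savings` is made up for illustration.

```python
def savings(used_gb: float, saved_gb: float) -> tuple[float, float]:
    """Return (efficiency ratio, percent saved) from df -S style numbers."""
    logical_gb = used_gb + saved_gb        # space the data would need without any savings
    ratio = logical_gb / used_gb           # e.g. 1.44 means 1.44:1
    percent_saved = 100.0 * saved_gb / logical_gb
    return ratio, percent_saved


inline_only = savings(2440, 1082)      # inline dedupe/compression only
after_offline = savings(1167, 2352)    # after the manual efficiency run

print(f"inline only:   {inline_only[0]:.2f}:1, {inline_only[1]:.0f}% saved")
print(f"after offline: {after_offline[0]:.2f}:1, {after_offline[1]:.0f}% saved")
```

Under those assumptions the first run works out to roughly 1.44:1 (31% saved) and the second to roughly 3.02:1 (67% saved), in line with the 1.4:1 and 3:1 figures.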

The notable thing here is that we need to enable offline dedupe (manual or scheduled) to get better savings.

I just wanted to know what efficiency you get with inline dedupe/compression on VMware volumes or normal file share volumes in your environment.

It would be great if you could also share your experience with inline dedupe/compression here in the community.

Best Regards,

Klaus

asulliva · ‎2016-06-28

Hello @klmi,

The dedupe/compression ratios are going to be dependent on the data you have...it's very hard to compare your data and my data to say that what you're seeing is expected.

Inline dedupe is different than post-process dedupe. Inline will only dedupe for data that *currently resides* in memory. In other words, it won't go out to disk to check and see if a matching block exists, but if one happens to already be in cache, then it will dedupe inline. Post-process will compare all data in the volume to find duplicates.
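To illustrate the difference, here is a toy model in Python. It is not how ONTAP is implemented: the LRU cache, the SHA-256 fingerprints, the cache sizes and the workload are all assumptions chosen to show why a bounded in-memory lookup misses duplicates that a full-volume comparison finds.

```python
from collections import OrderedDict
import hashlib


def fingerprint(block: bytes) -> str:
    """Hash a block so that identical blocks get identical fingerprints."""
    return hashlib.sha256(block).hexdigest()


def inline_dedupe(blocks, cache_size):
    """Count blocks written when duplicates are only detected in a bounded LRU cache."""
    cache = OrderedDict()
    written = 0
    for block in blocks:
        fp = fingerprint(block)
        if fp in cache:
            cache.move_to_end(fp)          # cache hit: deduplicated inline
        else:
            written += 1                   # cache miss: written, even if a copy is on disk
            cache[fp] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used fingerprint
    return written


def post_process_dedupe(blocks):
    """Count unique blocks when every block in the volume is compared."""
    return len({fingerprint(b) for b in blocks})


# Hypothetical workload: 1000 distinct 4K blocks, written twice in the same order.
blocks = [i.to_bytes(4, "big") * 1024 for i in range(1000)] * 2

print(inline_dedupe(blocks, cache_size=100))    # small cache: no duplicates caught
print(inline_dedupe(blocks, cache_size=1000))   # cache holds everything: all caught
print(post_process_dedupe(blocks))              # full comparison
```

In this model the small cache has already evicted each fingerprint by the time its duplicate arrives, so every block is written, while the full comparison keeps only the unique blocks.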

What you're seeing, where inline does not dedupe as much data as post-process, is expected behavior.

Hope that helps.

Andrew


klmi · ‎2016-06-28

Hi,

@Andrew: thanks for explaining this behaviour.

I had already noticed, together with our presales contact, that inline dedupe/compression alone does not help very much.

Unfortunately, the default efficiency settings for AFF clusters have inline dedupe/compression enabled, but no post-processing (scheduled offline runs).

Maybe you should consider also enabling it by default in later ONTAP releases, so that customers get better savings without tuning special options, or give customers real inline dedupe (with no need for post-processing).

@NetApp_Community

Please also share your efficiency numbers; it would be nice to see what efficiency you reach, especially for VMware and normal CIFS share volumes.

Best Regards,

Klaus

J_curl · ‎2016-06-30

You can certainly run post-process on AFF systems, the same way as on non-AFF systems: use an efficiency policy, either scheduled or change-log based.

NetApp_SEAL · ‎2016-07-21

On an AFF platform, the box you check on a volume to run post-process efficiency operations applies to deduplication only.

I think the previous poster may have been getting at the ability to run both post-process deduplication AND compression on an AFF (which is currently not possible).

This is where things get a little interesting: in 7-Mode and on FAS/Hybrid environments, the post-process compression operations are secondary (32K) vs. adaptive (4K). You can do secondary on AFF, but only inline and as a replacement for the default of inline adaptive (not to mention having to undo all current savings to flip from one to the other, then having to re-run things).

I admit that having the same ability across both FAS/Hybrid and AFF environments would be nice, even if only as a way to run secondary background jobs to catch anything missed inline, but I suppose that's where Compaction is supposed to pick up the slack.

I've made good use of the background scan for migrated / existing data ("-scan-old-data") to get adaptive compression savings in some cases.

The issue I'm dealing with currently, however, is with Exchange and migration via mailbox moves from 7-Mode to AFF. The efficiency savings on the AFF side aren't pretty, and are actually worse than just having deduplication enabled on the 7-Mode side. I'm wondering if there's a setting someplace with the LUN that's affecting the savings (thin across the board, 0 fractional reserve, etc...).

When mailbox moves were going on from 7-Mode LUN to AFF LUN, efficiency stats on the AFF volume reported back that 0 inline compression attempts had been made on both the DB and Log volumes. Is this because of the nature of Exchange and that process (where it's just a ton of logs that move over, then write / commit to the DB)? Is that different from something like a Storage vMotion, where you're moving a ton of blocks?

Anyone have any similar Exchange-related experiences they can share?

I'm about to go run the post-process adaptive scan just for the heck of it to see if I gain anything. Figure no harm in trying...

J-L-B · ‎2017-02-09

I'm seeing the same behavior on several of our AFF8080s. All vols have inline dedupe/compression enabled with no background processing. After enabling manual (on-demand) deduplication and then running a full scan, I get back tremendous additional savings. We're running 8.3.2P4. What's the best practice for enabling scheduled background deduplication on AFFs to keep savings at their maximum without impacting host-side performance?