Deduplication and reallocation, should I even reallocate?

Please pardon my ignorance if this is already covered in the deduplication TR; I haven't yet taken the time to review it fully...

I've noticed on my fairly small 500G dedupe volume that the reallocate optimization level no longer goes down to 1 after a reallocate, even a full reallocate. Currently the lowest it reaches is 3. This makes sense to me, since deduplication packs the data in a volume more densely than it would be without dedupe.

However, I'm curious if anyone has any thoughts or technical background info to share regarding this. Also...

Is it worth continuing to run reallocates on a deduped volume, or does it actually make things worse, for example by hurting dedupe ratios or performance?

What kind of optimization level should I shoot for?

In my particular case I'm deduping VMware VMs and templates in a 500G NFS volume and am currently achieving a 4:1 ratio for a 75% savings. Since I can't bring the optimization level below 3, I've adjusted my reallocate threshold level to 4, so that the weekly scheduled reallocates aren't constantly thrashing in an attempt to get below 3.
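For anyone checking the arithmetic behind the 4:1 and 75% figures, here is a minimal Python sketch. The function name `savings_fraction` is made up for illustration, and it assumes the ratio means logical size divided by physical size after dedupe.

```python
def savings_fraction(dedupe_ratio: float) -> float:
    """Fraction of logical space saved for a given dedupe ratio (e.g. 4.0 for 4:1).

    Assumes ratio = logical size / physical size after dedupe.
    """
    if dedupe_ratio < 1:
        raise ValueError("dedupe ratio must be at least 1 (1:1 means no savings)")
    return 1.0 - 1.0 / dedupe_ratio


print(f"{savings_fraction(4.0):.0%}")  # 4:1 ratio -> prints 75%
```

A 2:1 ratio gives 50% by the same formula, so each further doubling of the ratio adds less and less to the savings percentage.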

Re: Deduplication and reallocation, should I even reallocate?

Hi Leif,

By default, reallocate runs at the logical level, and any blocks that have been deduped will be skipped. You can also run reallocate with the -p option, which will force reallocate to run at the physical (PVBN) level. In this case, WAFL will try to put the deduped blocks in optimal order, but this might be difficult to do in heavily deduped volumes. In either case, reallocate does not rehydrate deduped data, so space savings are unaffected.

We don't have an official policy on running reallocate on sis volumes, but IMO it isn't worth bothering with; I don't think it's really going to give you any substantial gain. If anyone has before-and-after performance numbers from running reallocate on a deduped volume, please feel free to share your experience.


Re: Deduplication and reallocation, should I even reallocate?

Thanks Larry. I was beginning to doubt the value of doing the reallocations. I wasn't familiar with the -p option; thanks for the tip.

Re: Deduplication and reallocation, should I even reallocate?

For what it's worth, I have been running reallocate on deduped volumes for some customers with a threshold of 4 (the default with at least) and it does seem to help somewhat.

Generally speaking, though, we only go into reallocate with customers who have specific performance issues or who request specific in-depth information (as reallocation can require a fairly deep explanation).

Re: Deduplication and reallocation, should I even reallocate?

The weekly reallocation schedule has made our GroupWise email administrator happy. His nightly consistency checks, which I'm guessing are heavy on sequential reads, took nearly 40% less time as soon as I began the first full reallocate. I haven't really measured its effects on other applications, though.

Re: Deduplication and reallocation, should I even reallocate?

If a block reallocation schedule was already set before enabling dedupe, then in my opinion it should not be changed because of dedupe.



Re: Deduplication and reallocation, should I even reallocate?

Complete agreement -- that makes a lot of sense.

My only thought/question is this: overall, a highly deduplicated file system could have the same performance characteristics as a highly fragmented file system (since the deduplicated blocks could be all over the place physically). I'm not sure how reallocate would handle that, though (what access pattern would it optimize around?).
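To make the fragmentation comparison concrete, here is a toy Python model. It is only a sketch under made-up assumptions: the block numbers are invented, the helper `count_discontinuities` is hypothetical, and nothing here reflects how WAFL or reallocate actually lays out blocks. It just counts how often a sequential logical read has to jump to a non-adjacent physical block.

```python
def count_discontinuities(physical_blocks):
    """Count steps where the next physical block is not the immediate successor."""
    return sum(
        1
        for prev, cur in zip(physical_blocks, physical_blocks[1:])
        if cur != prev + 1
    )


# Toy layout: a file of 8 logical blocks stored contiguously at physical 100..107.
contiguous = list(range(100, 108))

# Hypothetical dedupe result: logical blocks 2 and 5 now point at identical
# blocks that already exist elsewhere (physical 40 and 900, invented numbers).
deduped = contiguous.copy()
deduped[2] = 40
deduped[5] = 900

print(count_discontinuities(contiguous))  # prints 0
print(count_discontinuities(deduped))     # prints 4
```

In this toy case each shared block costs two jumps (out and back) during a sequential read, which is the same effect ordinary fragmentation would have. Since a shared block can only sit in one physical location, it cannot be contiguous for every file that references it.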