2011-11-08 03:54 AM
We are seeing a subtle but weird problem with deduplication. After some time, deduplication seems to stop taking effect, and the space savings are gradually lost due to the churn on the volume. There are no errors or indications that anything is wrong. It is as if deduplication is not running at all.
sis status -l shows that the deduplication job is running as expected (nightly, outside business hours), that it found very large amounts of new data to be processed since the last run, and that it found lots of duplicate blocks. But, without any indication why, the results of the deduplication operation do not seem to take effect. It's as if it is deciding not to apply the deduplication results at the very last step in the process.
When we first noticed this a while ago, we found that re-running the deduplication process from the top (sis start -sd), rather than relying on the incremental dedupe, cleared the problem and the blocks were correctly deduplicated. But then the deduplication process seemed to mysteriously and silently cease again.
The volumes in both cases are regular NFS volumes hosting virtual machines on VMware and Xen. There are daily snapshots being kept with VSC. We noticed these problems when the volumes were 50% full. The deduplication and snapshot schedules are set up carefully so that the deduplication is complete before the snapshots kick in.
Has anyone else seen this, and could you suggest how we could analyze the cause?
2011-11-14 05:51 AM
It's an "interaction" between snapshots and deduplication.
Remember, a snapshot is a read-only point-in-time copy of your data. If the data changes, the new data is written elsewhere. If this in turn is snapshotted, it is protected as well. Now, if the snapshot is kept for a week, it is only after that point that the blocks become available to the system again. If the data still exists and is not protected by any other snapshot, it will be deduplicated; if the data no longer exists, the blocks become free blocks.
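To make that concrete, here is a toy Python model of the block accounting. It is only a sketch of the idea, not how ONTAP is implemented: the names `used_blocks` and `dedupe` and the four-block volume are made up for illustration.

```python
def used_blocks(active, snapshots):
    """Count physical blocks referenced by the active file system or any snapshot."""
    referenced = set(active.values())
    for snap in snapshots:
        referenced |= snap
    return len(referenced)


def dedupe(active, contents):
    """Point each logical block at the first physical block holding the same content."""
    first_seen = {}
    result = {}
    for logical, physical in sorted(active.items()):
        # Reuse the first physical block seen with this content, if any.
        result[logical] = first_seen.setdefault(contents[physical], physical)
    return result


# Hypothetical volume: physical block id -> content (three copies of "A").
contents = {0: "A", 1: "A", 2: "A", 3: "B"}
active = {"f1": 0, "f2": 1, "f3": 2, "f4": 3}

# The snapshot is taken while the duplicates are still separate blocks.
snapshot = set(active.values())
before = used_blocks(active, [snapshot])

# Dedupe remaps the active file system, but the snapshot still pins blocks 1 and 2.
active = dedupe(active, contents)
with_snapshot = used_blocks(active, [snapshot])

# Only once the snapshot is gone do the duplicate blocks become free.
after_expiry = used_blocks(active, [])

print(before, with_snapshot, after_expiry)
```

In this model the used-block count stays the same after dedupe for as long as the snapshot exists, and only drops once the snapshot is removed, which is the effect described above.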
2011-11-15 12:08 PM
That's really weird. If the blocks were not duplicates when the snapshot was taken, they will be in the snapshot (and you will not be able to dedupe them out).
They should age out as the snapshot does though.
Anything in your /vol/vol0/etc/log/sis logs?
2011-11-15 02:50 PM
I did not know that we could find logs there, that's a good tip.
We ran a full dedupe scan the other day but, interestingly, no blocks were freed up. Of course, if any new blocks were freed up, it could be that they are languishing in the snapshots.
I'm not sure what the state of the snapshots is because, irritatingly, we are hitting a known ONTAP bug which causes all snapshot sizes to report as 0 on a volume where SnapRestore has been used ("snap reclaimable" on the volume simply hangs). We're looking at getting an upgrade to 7.3.6 done to get the fix for this.