ONTAP Discussions

Dedupe turned off, should undedupe?

defrogger1

Hello, I've taken over a new NetApp environment. We have two FAS3240s running Data ONTAP 8.2.3 in 7-Mode.

 

I've found a good number of volumes with dedupe scheduled on them. However, it's not really doing anything for us, because some of these volumes are set up on a one-to-one basis: we have a 1 TB volume, a 1 TB LUN created in that volume, and in VMware the entire 1 TB LUN is used to create a 1 TB datastore. There is no thin provisioning on the volumes or LUNs, and no snapshots are being used. So dedupe isn't doing anything for us on these types of volumes, as VMware will never see the extra free space.

 

So I turned dedupe off on a couple of volumes. I don't see the need to keep running dedupe jobs every night, which can cause a small performance hit.

 

On the volumes where I've turned dedupe off, my understanding is that the existing data is still deduped; it just won't dedupe any new data going forward. My first question: is that correct?

 

My next question: by keeping a volume in that state, is there a performance hit, even a small one? I'm assuming that if someone requests data from a block that has been deduped, it first has to look up where that block is located in the dedupe table, which is more overhead. Is that correct?


I'm asking because I'm wondering if I should undedupe these volumes, to give the controllers a little less work to do.

If so, is there anything I should watch out for? I'm assuming it's smart enough not to run out of space while undeduping. Or is that not the case, and my volumes might run out of space when I undedupe?

 

For the most part we have free space on the volumes, but I'm wondering if I need a certain percentage free to undedupe. For instance, I have one volume that is roughly 1 TB. The VMware datastore is only using about 750 GB of actual data, so technically there is 250 GB free from the VMware side of things. Dedupe has shrunk the volume a lot; I don't remember the number, but I think it's around 60% free. Once I undedupe, the volume will have no free space, which is fine.
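
To make that arithmetic concrete, here is a rough Python sketch of the space check I have in mind. It is only back-of-the-envelope math, not anything ONTAP itself runs: the function name is made up, the figures are the approximate ones above, and it assumes the thick 1 TB LUN takes the whole volume again once the savings are given back.

```python
def free_after_undo(volume_gb, used_gb, saved_gb):
    """Free space left in the volume if all dedupe savings are given back.

    Hypothetical helper for illustration; a negative result would mean the
    un-deduped data no longer fits in the volume.
    """
    return volume_gb - (used_gb + saved_gb)


volume_gb = 1024                    # roughly 1 TB volume
used_gb = volume_gb * 40 // 100     # about 60% free today, so about 40% used
saved_gb = volume_gb - used_gb      # assumption: the thick 1 TB LUN refills the volume

remaining = free_after_undo(volume_gb, used_gb, saved_gb)
print(remaining)  # 0, i.e. no free space left but nothing overflows
```

If the result came out negative for a volume, I would expect to have to grow that volume before undeduping it.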

 

I'm just a little nervous about undeduping. I'm going to look a little more into what it's doing in the background first.

Thanks for any input

1 ACCEPTED SOLUTION

ekashpureff

 

Defrogger -

 

You are correct: turning dedupe off leaves the existing data deduped and just won't dedupe any new data going forward.

 

There isn't a performance hit to leave the data deduped.

The dedupe fingerprint tables are used when deduplicating data, not when reading deduped data.

 

There is a performance gain from leaving the data deduped.

Deduped blocks only have one copy cached in memory or in FlashCache.

This is a big win in VM environs where there are often many copies of the same OS.

All those duplicate blocks are only cached once, instead of many copies of duplicate blocks for each VM.
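
As a toy illustration of that caching effect, here is a small Python sketch. It is not how Data ONTAP's cache is actually implemented; the function name, the block counts, and the number of VMs are all made up for the example. It only shows that a cache keyed on physical blocks holds one copy of a shared block no matter how many VMs reference it.

```python
def cached_blocks(block_maps):
    """Count the distinct physical blocks a cache must hold to serve every VM.

    block_maps: one dict per VM, mapping logical block number -> physical block id.
    """
    physical = set()
    for vm_map in block_maps:
        physical.update(vm_map.values())
    return len(physical)


OS_BLOCKS = 1000  # hypothetical number of identical OS blocks per VM
VMS = 10          # hypothetical number of VMs built from the same image

# Without dedupe: every VM has its own physical copy of each OS block.
undeduped = [{lbn: (vm, lbn) for lbn in range(OS_BLOCKS)} for vm in range(VMS)]

# With dedupe: every VM's logical block points at the same shared physical block.
deduped = [{lbn: ("shared", lbn) for lbn in range(OS_BLOCKS)} for vm in range(VMS)]

print(cached_blocks(undeduped))  # 10000
print(cached_blocks(deduped))    # 1000
```

In this model the deduped layout needs a tenth of the cache for the same set of VMs, which leaves more room for everything else.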

 

Note that 'sis undo' is an advanced priv command in 8.2, and is not documented in the man pages.

 

You may wish to look at NetApp TR-3958, "NetApp Data Compression and Deduplication Deployment and Implementation Guide".

 

From TR-3958:

"15.6 Removing Space Savings
NetApp recommends contacting NetApp Customer Success Services prior to undoing deduplication or compression on a volume to determine that removing the space savings is really necessary. In many cases system performance can be restored by finding the true source of the degradation, which often can be unrelated to compression or deduplication.
It is relatively easy to uncompress or undeduplicate a flexible volume and turn it back into a regular flexible volume. This can be done while the flexible volume is online, as described in this section.
Undo operations can take a long time to complete. The time to complete is relative to the amount of savings and the amount of available system resources. You can view the progress of undo operations by using the sis status command. During the time when undo operations are running there may be an impact on other activity on the system. Although undo operations are low-priority background processes they do consume CPU, memory and disk resources and, as such, they should be scheduled to run during low usage times. We do not limit the number of undo operations you can run in parallel; however, the more that run, the more impact will be seen. If you determine that the impact is too great with so many in parallel or during peak times you can stop the operation using the sis stop command, which will stop the current undo. You can later restart the operation and it will continue from where it stopped.
Note: Undo operations will only remove savings from the active file system, not within Snapshot copies."


I hope this response has been helpful to you.

 

At your service,

 

Eugene E. Kashpureff, Sr.
Independent NetApp Consultant http://www.linkedin.com/in/eugenekashpureff
Senior NetApp Instructor, IT Learning Solutions http://sg.itls.asia/netapp
(P.S. I appreciate 'kudos' on any helpful posts.)

 

 


2 REPLIES

defrogger1

Wow, I didn't realize that dedupe could help with performance. If leaving it deduped won't hurt performance, then I definitely won't undedupe things unless there is an issue.

 

Thanks

 
