2009-01-20 07:28 AM
We have a FAS2020 which we run our Exchange stores on via iSCSI & SnapManager for Exchange.
We also have a number of SQL servers linked up the same way with SnapManager for SQL.
Does anyone know if we can enable sis on these volumes and if we would achieve any kind of benefit from it? (or disadvantage?)
Most of our data is Exchange, SQL and CIFS (file shares). Unfortunately the CIFS volumes are over 0.5TB, which currently means SIS won't work on them (I believe ONTAP 7.3.1 has just fixed this), and we haven't yet enabled it on our SQL/Exchange volumes. Our current benefit from SIS is therefore very small.
The only place we are seeing large % space savings is on our NFS volume which is housing our VMDK files.
Would there be a benefit to turning it on for all of our Exchange stores, Exchange log stores, SQL stores and SQL log stores?
SIS was a big selling point of the NetApp, and now I am looking for ways to maximise its usage.
Your help/comments would be much appreciated.
2009-01-21 04:31 AM
You won't get much benefit from deduplication on volumes with Exchange LUNs because there's already a layer of SIS being executed by Exchange. There is a small benefit, but not enough to justify enabling the feature. SQL would depend on your data set and whether there's any "sparse" object content.
Your CIFS configuration is a separate matter. You might find it easier to split your CIFS data across smaller volumes and use deduplication on those, rather than keeping one large data set. There's only a little extra overhead and the benefits can be considerable. One way to do this (presuming you have a little extra space in your aggregate) is to take an outage, terminate CIFS, take a snapshot of the volume, 'vol copy' the data to a second volume, then modify your shares to point to data sets in each volume. Restart CIFS and validate your access on both volumes. Then, when you're comfortable, delete the data that is no longer in use from each volume (from the Windows clients) until you're within the appropriate deduplication capacity. Use 'vol size' to reduce the size of the volumes and then turn on 'sis'.
If you have SnapMirror, the window can be reduced considerably by initializing a SnapMirror relationship to a second volume on the same controller, then go through the same window steps -- outage, cifs terminate, 'snapmirror update', 'snapmirror break', change CIFS shares, then restart CIFS. You could even use 'ndmpcopy' if you wanted.
You can even do this host-side -- create a new volume, perform a host-side migration, then change the share name to match what the original share name should be.
While it's not the most elegant strategy it will get you to the place you're looking to be. How well things work clearly depends on how your data is structured and if shares can help you separate your data in a clean fashion. Even if you break off a little at a time, it can be more efficient from a capacity standpoint in the long run.
2009-01-21 04:46 AM
Great. Thanks for the advice.
We do have SnapMirror licenses and we currently have only 1 CIFS volume containing all of our shares, so I like the idea of pointing shares to separate, smaller CIFS volumes which we can then run SIS on. (My original posting was wrong: I actually believe there is a 1TB limit on CIFS volumes on the FAS2020 and not 0.5TB, but we are already at 900GB and expect to grow, which is why we never enabled SIS.)
Anyway, it's given me some ideas so thank you.
2009-01-22 12:08 AM
Just a quick note (for others reading) that ONTAP 7.3.1 is also the first release that allows you to shrink a "previously too large" volume to a size allowed for the FAS platform, in order to enable deduplication to be run against it. So Matt's advice would not work prior to 7.3.1; on earlier releases the dedupe volume must never have been bigger than the size allowed.
2009-01-22 05:25 AM
Mike is right here -- you would have to be at 7.3.1. If you are running an earlier version, you would need to use 'ndmpcopy' and use -l [0|1|2] to get the data to the new volume. That would let you do two levels of updates (similar to SnapMirror) if you still wanted to follow a similar strategy, as 'ndmpcopy' is copying files and not volume attributes.
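For anyone unsure what the levels buy you, here is a minimal sketch in Python of the usual dump-level idea: level 0 copies everything, and each higher level copies only what changed since the most recent pass at a lower level. This models the general semantics only, not how 'ndmpcopy' is implemented; the function name, file names and timestamps are all made up for illustration.

```python
def files_for_level(mtimes, level, last_copy_time):
    """Pick the files a level-N pass would copy (illustrative model only).

    mtimes:         {path: modification time}
    last_copy_time: {level: time that level last ran}
    """
    if level == 0:
        # Baseline: everything is copied.
        return sorted(mtimes)
    lower = [t for lvl, t in last_copy_time.items() if lvl < level]
    if not lower:
        raise ValueError("run a lower level first")
    since = max(lower)  # most recent pass at any lower level
    return sorted(path for path, mtime in mtimes.items() if mtime > since)


# Hypothetical share contents; times are arbitrary integers.
mtimes = {"a.doc": 10, "b.xls": 10, "c.ppt": 10}
history = {}

level0 = files_for_level(mtimes, 0, history)   # baseline while users work
history[0] = 100

mtimes["b.xls"] = 150                          # changed after the baseline
level1 = files_for_level(mtimes, 1, history)   # first catch-up
history[1] = 200

mtimes["c.ppt"] = 250                          # changed after the catch-up
level2 = files_for_level(mtimes, 2, history)   # final pass inside the outage

print(level0, level1, level2)
```

The point is that only the last, smallest pass has to fit inside the outage window.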
Great catch Mike, thanks!
2010-02-04 03:59 PM
Just a quick note! If you decide to enable dedupe, take these items into consideration first:
--> Check the dedupe limits for your model. If the volume is anywhere near the limit, don't do it.
--> Make sure that you have enough free space at the aggregate level to revert if things go south. It needs almost the same amount of free space to revert. If you don't have it, don't do it; this is very important. Basically, you should not enable dedupe when you're running very low on available space at the aggregate level.
--> If you already have replication/SnapMirror in place, you may have to reconsider, since it could possibly break your replication on the other end.
--> Make sure that you remove all snapshots associated with that volume first.
--> Based on my experience, it shows a pretty impressive amount of savings on VMs; other than that, not much.
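The first two checks can be written down as a rough pre-flight test. This is only a sketch in Python: the limit has to come from the documentation for your model and ONTAP release, the 15% headroom is an arbitrary margin chosen for the example, and reading "almost the same amount of free space" as "roughly the space dedupe is expected to save" is my assumption. None of the names below are NetApp tools or APIs.

```python
from dataclasses import dataclass


@dataclass
class DedupeCandidate:
    volume_size_gb: float
    platform_limit_gb: float     # look this up for your model and release
    aggregate_free_gb: float
    expected_savings_gb: float   # your own estimate
    headroom: float = 0.15       # hypothetical safety margin


def preflight(c: DedupeCandidate) -> list[str]:
    """Return a list of reasons not to enable dedupe (empty means go ahead)."""
    problems = []
    if c.volume_size_gb > c.platform_limit_gb * (1 - c.headroom):
        problems.append("volume is at or near the platform dedupe limit")
    if c.aggregate_free_gb < c.expected_savings_gb:
        problems.append("aggregate lacks free space to undo dedupe")
    return problems


# Example numbers only: a 900GB volume against a 1TB limit, tight aggregate.
risky = preflight(DedupeCandidate(900, 1024, 100, 200))
# A smaller volume with plenty of aggregate space.
safe = preflight(DedupeCandidate(400, 1024, 300, 100))

print(risky)
print(safe)
```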
2010-02-05 11:50 AM
Dedupe works at a 4K block level. Back in Exchange 2003, with a 4K page size, there really wasn't an opportunity for dedupe. In Exchange 2007, with an 8K page size, you were still limited, mainly because Exchange SIS already took out most duplicate blocks.
In Exchange 2010, SIS is gone. Furthermore, the page size is increased to 32K. On top of that, blank pages in Exchange 2010 are zeroed by default. Add that up, and Exchange 2010 is really the first viable opportunity for de-dupe to be effective against Exchange data. All except the block containing the page header will dedupe out of blank pages. Since there is no SIS, and only message headers and HTML/text body parts are compressed, there is a much greater opportunity to dedupe blocks containing attachments. Some preliminary results have shown 20-30% savings for Exchange 2010.
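To make the blank-page point concrete, here is a small Python sketch that splits zeroed 32K pages into fixed 4K blocks and counts how many distinct blocks remain. It is a toy model, not how ONTAP or Exchange work internally: the 64-byte header, its contents and the 1000-page sample are all invented for the illustration, and the figure applies to blank pages only, not to a real database.

```python
import hashlib

BLOCK = 4 * 1024    # dedupe granularity mentioned above
PAGE = 32 * 1024    # Exchange 2010 page size mentioned above
HEADER_LEN = 64     # hypothetical header size, for illustration only


def blank_page(page_number: int) -> bytes:
    """A zeroed page whose only non-zero bytes are a per-page header."""
    header = page_number.to_bytes(8, "big").rjust(HEADER_LEN, b"\xff")
    return header + bytes(PAGE - HEADER_LEN)


def dedupe_stats(data: bytes) -> tuple[int, int]:
    """Return (total_blocks, unique_blocks) for fixed-size 4K blocks."""
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    unique = {hashlib.sha256(b).digest() for b in blocks}
    return len(blocks), len(unique)


pages = b"".join(blank_page(n) for n in range(1000))
total, unique = dedupe_stats(pages)
savings = 1 - unique / total
print(f"{total} blocks, {unique} unique, {savings:.1%} saved")
# prints "8000 blocks, 1001 unique, 87.5% saved"
```

Each page contributes one unique header block, and the other seven blocks of every page collapse into a single shared zero block.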