Data Backup and Recovery
We're currently using ASIS deduplication on our VMware volumes, and I was wondering whether you've heard of many customers using ASIS on Exchange volumes?
Exchange does have single-instance storage within a database, but its effectiveness diminishes as the number of databases in an organization increases. My experience with production data is to expect 3-10% savings from deduplication (with the vast majority hovering at 4-5%), which generally means it isn't worth doing.
Everything "depends": what are your users like, how much mail do they send and receive per day (change rate), and what are they sending (large attachments?)?
The way ESE (Jet) places items into the database and later defragments those pages means that a 1 MB attachment stored in two databases may not deduplicate much at all.
Exchange 2010 is a different beast (no single-instance storage, and page zeroing on by default). Worst case, if you run deduplication daily you should be able to recover your change rate, and after more definitive testing with the RTM bits we may see more. Expect guidance for Exchange 2010 soon.
Thanks,
Robert
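For anyone who wants to try this, here is a rough sketch of enabling and scheduling ASIS from the Data ONTAP 7-Mode CLI; the volume name is just a placeholder, and you should check the `sis` man page for your ONTAP release before running anything:

```shell
# Enable deduplication (ASIS) on the Exchange volume
sis on /vol/exchvol

# Schedule a daily run at 01:00, outside the backup window (as Robert
# suggests, a daily run should keep up with the change rate)
sis config -s sun-sat@1 /vol/exchvol

# Kick off an initial scan of the existing data
sis start -s /vol/exchvol

# Check space savings once the scan completes
df -s exchvol
```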
Exchange already has its own deduplication mechanism, so deduplicating on the filer is not worth the effort and overhead.
That might change in the future with Exchange 2010, as I read somewhere that Microsoft is thinking about removing this feature from the product.
I don't know that it would hurt right now, but the results would likely not be particularly stunning. So far what I've heard from customers is in the 10-20% range for space savings.
I think this will be different come Exchange 2010, as SIS will be going away; it was publicly mentioned on a NetApp blog, no less:
http://blogs.netapp.com/drdedupe/2009/09/exchange-2010-dismisses-sis.html
All the customer data we have seen aligns with what Rob has said. In a 2003/2007 environment we don't see savings over 10%. With that type of return it really doesn't buy you anything.
As far as Exchange 2010 is concerned, Andrew is correct: there is no more SIS, and there are also some additional features that may result in better dedupe numbers, which could provide a much better return.
Brad
Has anyone here tested ASIS behavior on "majority whitespace" .edb files?
Our scenario: two years ago we were doing zero spam filtering (well, technically we tagged and forwarded all spam, which was only about 50% effective). Today we are rejecting 88% of all incoming messages. Our 4TB .edb files only produce about 750GB of backup data. Our assumption is that the discrepancy is mostly wasted space, as customers now receive and retain far less mail (we remove unchanged objects on a specific policy that varies from group to group; the point being that no one can retain mail in their primary inbox forever).
Could NetApp deduplication reduce our storage allocation closer to actual utilization?
Thank you NetApp for this wonderful community.
-Eugene
This is a challenging question to answer, as people typically defragment the Exchange environment (from within Exchange), resulting in the data being spread across the .edb file, or so I am told. So it would be hard to judge how much whitespace within the file could actually be deduplicated.

As previously said, the savings we have seen are around the 10-12% mark, but there is about a 5-7% metadata overhead, so the net savings could be below 10% (assuming metadata isn't included in the initial percentage).

What you could try is getting help from a friendly local SE or partner and using FlexClone: if you FlexClone the volume and run deduplication on the clone, you should see the actual savings you would get on the real volume. Once the test is done you can destroy the FlexClone without affecting the primary data. This would of course take resources to run, so it's worth doing during a nice quiet time.
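To make the gross-versus-net arithmetic concrete, here is a small back-of-the-envelope sketch in Python. The percentages are the rough figures quoted in this thread, not measurements, and the helper function is purely illustrative:

```python
def net_dedupe_savings_gb(volume_gb, gross_savings_pct, metadata_overhead_pct):
    """Rough net space reclaimed: gross dedupe savings minus dedupe metadata."""
    gross = volume_gb * gross_savings_pct / 100.0
    metadata = volume_gb * metadata_overhead_pct / 100.0
    return gross - metadata

# A 4 TB Exchange volume with 12% gross savings and 6% metadata overhead
# nets roughly 246 GB of reclaimed space
print(net_dedupe_savings_gb(4096, 12, 6))
```

At that scale the metadata overhead eats roughly half the gross savings, which is why the net return can dip below 10% even when the raw dedupe ratio looks reasonable.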
Something else to consider is what deduplication would actually give you. It may well be that the number of spindles you have is there to deliver the I/O requirements of Exchange, so the freed-up space couldn't be used by another application; you could, of course, use the free space for Snapshot copies.
A good discussion topic.
Regards,
Craig