2017-05-24 01:54 PM - edited 2017-05-24 01:57 PM
Let's say I take a full backup of a DB on a daily basis. Will the 2nd full backup be processed against the deduplication database created by the first day's full backup?
If yes, the 2nd day's data should be reduced a lot. Correct?
My second question:
I read the AltaVault document, which says that we should turn off compression in the backup software. How can I be sure that AltaVault deduplication will work more efficiently that way?
2017-05-25 12:10 PM
1. Yes, the second full backup will be processed against the previous full, which should reduce the amount of data written by AltaVault significantly (increasing dedupe rates).
2. Deduplication by AltaVault is class-leading compared to many deduplication methods used by other vendors. Basically, you'll get a better squeeze by using AltaVault deduplication and compression than by using a dedupe method implemented by the backup software. This is discussed in the technology overview TR, which you can read here: http://www.netapp.com/us/media/tr-4427.pdf
You can of course test this to confirm it holds for your workloads (we've had this done in sales cycles, confirming the statement above).
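To illustrate both answers, here is a minimal Python sketch of why a second full backup dedupes well against the first, and why compressing in the backup software first tends to hurt. It is a toy model, not AltaVault's actual algorithm: the fixed 4 KiB chunk size, the SHA-256 fingerprints, the synthetic "database" and the helper names `chunks` and `ingest` are all assumptions made for the example.

```python
import hashlib
import random
import zlib

CHUNK = 4096  # hypothetical fixed chunk size, chosen only for this toy model


def chunks(data, size=CHUNK):
    """Split data into fixed-size chunks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def ingest(data, index):
    """Add unseen chunk fingerprints to index; return how many chunks were new."""
    new = 0
    for chunk in chunks(data):
        digest = hashlib.sha256(chunk).digest()
        if digest not in index:
            index.add(digest)
            new += 1
    return new


rng = random.Random(0)
# Synthetic, compressible "database": 256 pages of 4 KiB each.
day1 = bytes(rng.choices(b"ACGT", k=256 * CHUNK))
# Day 2 is the same database with a single page rewritten.
changed_page = bytes(rng.choices(b"ACGT", k=CHUNK))
day2 = day1[:3 * CHUNK] + changed_page + day1[4 * CHUNK:]

# Case 1: the backup software sends raw data.
raw_index = set()
ingest(day1, raw_index)
raw_new = ingest(day2, raw_index)  # only the rewritten page is new

# Case 2: the backup software compresses each full backup before sending it.
compressed_index = set()
ingest(zlib.compress(day1), compressed_index)
compressed_total = len(chunks(zlib.compress(day2)))
compressed_new = ingest(zlib.compress(day2), compressed_index)

print(f"raw: {raw_new} new chunk(s) out of {len(chunks(day2))}")
print(f"pre-compressed: {compressed_new} new chunk(s) out of {compressed_total}")
```

In the raw case only the one changed page has to be stored on day 2. In the pre-compressed case a small change alters the compressed stream from that point on, so far fewer chunks match the previous day. Real results depend on the workload and the backup software, so testing with your own data is still the way to confirm.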
2017-05-30 02:02 PM
Thanks Chris. Two follow-ups:
1. Is AltaVault dedup a global dedup? That is, regardless of NFS or CIFS, and regardless of which host sends it, is all incoming data deduped against the same dedup database?
2. Where is the dedup database located: in memory or in the local appliance cache?
2017-05-30 03:10 PM
Q1. Is AltaVault dedup a global dedup? That is, regardless of NFS or CIFS, and regardless of which host sends it, is all incoming data deduped against the same dedup database?
A1. Yes, dedupe is global regardless of the protocol that sends the data to AltaVault on the front end.
Q2. Where is the dedup database located: in memory or in the local appliance cache?
A2. A portion of the appliance cache is reserved for the dedupe indexes, but this isn't part of the "usable" capacities as reported in the spec sheet materials and other presentations you see for AltaVault. AltaVault does load information into memory for improved lookup, but all data is flushed to cache to ensure no data is lost. The RAID card also has a super capacitor backup to ensure writes are flushed in the event of an outage.
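As a rough mental model of the two answers above, the Python sketch below shows one fingerprint index shared by every ingest path, with lookups served from memory and every new entry flushed to persistent storage before it is acknowledged. This is only an illustration, not how AltaVault is implemented: the `DedupIndex` class, the `ingest` helper, the index file layout and the 4 KiB chunk size are all hypothetical.

```python
import hashlib
import os
import tempfile


class DedupIndex:
    """Toy global index: in-memory set for lookups, append-only file for durability."""

    def __init__(self, path):
        self.seen = set()
        if os.path.exists(path):
            # Reload fingerprints persisted before a restart (32 bytes each).
            with open(path, "rb") as f:
                while (digest := f.read(32)):
                    self.seen.add(digest)
        self.log = open(path, "ab")

    def add_if_new(self, chunk):
        digest = hashlib.sha256(chunk).digest()
        if digest in self.seen:  # lookup is served from memory
            return False
        self.log.write(digest)  # new entry is flushed to disk before use
        self.log.flush()
        os.fsync(self.log.fileno())
        self.seen.add(digest)
        return True

    def close(self):
        self.log.close()


def ingest(index, protocol, host, data, size=4096):
    """Return the number of new chunks. protocol and host are labels only:
    they are deliberately not part of the fingerprint, so dedup is global."""
    return sum(index.add_if_new(data[i:i + size]) for i in range(0, len(data), size))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "index.bin")
    payload = b"".join(bytes([i]) * 4096 for i in range(4))  # 4 distinct chunks

    index = DedupIndex(path)
    via_nfs = ingest(index, "nfs", "hostA", payload)  # first copy: all chunks new
    via_smb = ingest(index, "smb", "hostB", payload)  # same data, other protocol and host
    index.close()

    reopened = DedupIndex(path)  # simulate a restart
    after_restart = ingest(reopened, "nfs", "hostA", payload)
    reopened.close()

print(via_nfs, via_smb, after_restart)
```

The same data arriving over a different protocol or from a different host adds nothing new, and the index still recognizes it after a simulated restart because every entry was written to disk before being used.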
2017-06-06 04:42 AM
We did some deduplication testing with servers using SQLSafe writing directly to an SMB share on the AltaVault. We found that leaving the deduplication option on in SQLSafe actually gave us better results than disabling it. So in effect SQLSafe deduped the data before sending it to the AltaVault, which then deduped it again against the data still on disk. This worked for us because of a short retention period: the database team only wanted to keep 8 days of SQLSafe backups.
If we extended the retention to 14+ days, it was more efficient to turn off deduplication in SQLSafe and rely only on the AltaVault dedup.