2008-03-08 02:49 PM
It's not really unusual, but here is the list of applications where we use dedupe today. Let me know if you find these applications unusual!
dumps of VMware
dumps of SQL and Sybase
our share with the sources of all our software
medical application in genetics
We will expand this list over the coming months.
2008-03-09 06:27 PM
Having supported the life sciences community for a few years, when I first heard of de-dupe I wanted to see whether we could use de-dupe algorithms to seek out the commonalities between multiple genomic data sets across organisms, species, etc. However, since NetApp de-dupe works on WAFL blocks, comparisons are restricted to 4 KB blocks. If that were configurable, the resulting scenarios would be interesting.
That would be a good test for NetApp admins currently supporting such data sets.
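To illustrate the 4 KB restriction mentioned above, here is a minimal Python sketch of fixed-block fingerprinting. It is not NetApp's implementation: the SHA-256 fingerprint, the function names, and the random test data are all assumptions made for illustration. It shows that identical content shifted by even one byte no longer shares any fixed 4 KB blocks, which may be one reason block-level de-dupe struggles to find commonalities between sequences that are not block-aligned.

```python
import hashlib
import os

BLOCK_SIZE = 4096  # 4 KB, the block size mentioned in the post above


def block_fingerprints(data, block_size=BLOCK_SIZE):
    """Split data into fixed-size blocks and return one SHA-256 digest per block."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def shared_blocks(a, b, block_size=BLOCK_SIZE):
    """Count the distinct block fingerprints that occur in both inputs."""
    return len(set(block_fingerprints(a, block_size))
               & set(block_fingerprints(b, block_size)))


# Hypothetical data: 16 random blocks standing in for a sequence file.
original = os.urandom(16 * BLOCK_SIZE)
aligned_copy = original            # same content, same alignment
shifted_copy = b"\x00" + original  # same content, shifted by one byte

print(shared_blocks(original, aligned_copy))  # every block matches
print(shared_blocks(original, shifted_copy))  # no block boundaries line up
```

With random input like this, the aligned copy should share all 16 blocks and the shifted copy none, because every block boundary in the shifted copy falls one byte off.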
2008-03-10 08:37 AM
Hi reinoud7 - Can you give us an idea of the space savings (%) you are seeing on each of the applications you are deduping? And yes, I'd say that the shares with your software source code and the genetics data are "unusual" - good stuff!
2008-03-10 09:22 AM
Of course, no problem:
dumps of VMware: today it's just a VCB-style backup: 49% savings
dumps of SQL and Sybase: 66-68% savings (7 full dumps of the same database)
our share with the sources of all our software: only 28%
medical application in genetics: still testing, but so far we only see savings of 8%
VMware production: still in test, more details later, but at least 50%
I forgot this one: all our invoices, sent to our patients (more than one million per year): 52% (these are PDF files)
2008-06-11 01:50 PM
Hi M_Marotti - I guess by the sound of the crickets in the background no one has tested SAS Institute data. In a case like this, I've seen people take two approaches:
1) Run the Space Savings Estimation Tool against a sample dataset. This tool will simulate dedupe and is available from your NetApp or authorized VAR SE.
2) SnapMirror a copy of the SAS data to a test/dev volume and run dedupe against that volume.
Either of those approaches will help give you an idea of the space savings you'll see.
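If neither option is at hand, a very rough first guess can be scripted. The Python sketch below is not the Space Savings Estimation Tool and makes no attempt to reproduce what it does: it simply assumes fixed 4 KB blocks and SHA-256 fingerprints, and the function name `estimate_savings` is made up for this example. It walks a sample directory and reports what percentage of blocks are duplicates of a block already seen.

```python
import hashlib
from pathlib import Path

BLOCK_SIZE = 4096  # assumed 4 KB block size


def estimate_savings(root, block_size=BLOCK_SIZE):
    """Return the percentage of blocks under root that duplicate an earlier block."""
    seen = set()   # fingerprints of every unique block encountered so far
    total = 0      # total number of blocks read
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        with path.open("rb") as fh:
            while True:
                block = fh.read(block_size)
                if not block:
                    break
                total += 1
                seen.add(hashlib.sha256(block).digest())
    if total == 0:
        return 0.0
    return 100.0 * (total - len(seen)) / total
```

Calling `estimate_savings("/path/to/sample")` (a placeholder path) returns a percentage. Treat it only as a ballpark figure: it ignores metadata, partial trailing blocks, and anything else the real feature or tool takes into account.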
Hope that helps...