2008-03-07 05:05 PM
I am often asked to speak about dedupe to our customers during various NetApp events. One of the most common questions (complaints?) I get is: "Why don't you support dedupe on my legacy systems?" i.e. FAS200, FAS800, FAS900, etc. Most people think the reason we don't support these systems is that it's some kind of diabolical marketing plot (do marketing people really "plot"?) to make more money by forcing customers to upgrade their old systems. I'd like to explain the real reasons behind our decision not to support these systems.
NetApp first started developing deduplication (then called "A-SIS") about 3 years ago. At that time, deduplication was just beginning to emerge as a viable way to "compress" backups in the D2D world. Our D2D appliance was and still is the NearStore platform. At the same time we were developing deduplication, we were also transitioning NearStore from a hardware-based platform to a software "personality" for FAS systems. NearStore's real value was low-cost, high capacity SATA drives, and since those drives were available on standard FAS systems, starting with the FAS3000 series, the decision was made to "drop" the NearStore systems and use a license to "convert" SATA-based FAS systems to NearStore systems.
OK, back to deduplication. Since dedupe was originally targeted only at D2D, it seemed logical for us to put dedupe behind the NearStore license, adding more value to this new license with an additional feature. Trouble is, we did a little too good a job designing deduplication, and lo and behold, we have now found many use cases for NetApp deduplication that go way beyond the original D2D intent we envisioned.
Now fast forward to 2008. We have over 3,000 systems licensed for dedupe, and people seem to want to run it everywhere, including on their FAS200's and other legacy systems. The problem is that these systems never supported SATA drives, and never supported the NearStore option. In the world of NetApp, adding NearStore and deduplication to these systems is a multi-million dollar effort that requires about 6 months of development and QA testing per platform.
We've had a great deal of back-room discussion about how to deal with this problem - a nice problem to have, I suppose - but then again someone once told me "there is no such thing as a nice problem." Our dilemma: should we invest the time and money into adding deduplication to products with a limited life? Or should we use our resources to improve deduplication on the currently supported platforms? We decided on the latter. We are excited about the enhancements that can be made to NetApp deduplication, but at the same time we know that it will require all our engineering resources to produce them in a reasonable timeframe.
So now that you know the story, what's your opinion? Right strategy? Moving the technology forward? Wrong strategy? Leaving our customers behind? Let us know how you feel!
2008-03-08 04:20 AM
For me, this is a very easy question. Next month, all my "mature" systems will be replaced by new FAS30XX and FAS60XX systems, so for me, this problem is solved.
But when I look at this dilemma from a distance, I still think this is the right way to go. Of course, in principle, you should always try to make new features available on all currently running products, but you must weigh the development cost against the priorities of the users. Customer forums, user groups and NetApp Product Councils are very important for providing the correct input, so that you can make the right decision.
Let's go forward with dedup and I look forward to the new dedup enhancement. I have some requests on my list:
larger volume support
dedup at the aggregate level (filer level?) instead of per FlexVol
better snapshot - dedup integration
a very satisfied dedup user,
2008-03-10 08:55 AM
Great question Friea - and glad I got you curious. We've seen two categories of new deduplication usages:
1) New datasets - things like animation files, genomic data, geoseismic data, actuarial data, etc. We tested all the common datasets, but we knew we couldn't catch them all. So it's always interesting when I hear someone say "hey, I just got 70% space savings with my application!" and it's an application we never tested, or even heard of before.
2) Primary storage - this has been our biggest surprise. We knew that users wanted to dedupe primary, but we underestimated how fast they would get there. My best estimate is that somewhere between 30 to 50 percent of our users are deduping some kind of primary app.
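Savings figures like the 70% above fall out of block-level fingerprinting: the engine hashes fixed-size blocks and collapses identical ones, so savings equal the fraction of blocks that are duplicates. A minimal sketch of the idea - the 4 KB block size matches WAFL, but the hash choice and everything else here is illustrative, not NetApp's actual implementation:

```python
import hashlib

BLOCK_SIZE = 4096  # WAFL-style 4 KB blocks; dedupe works at this granularity

def dedupe_savings(data: bytes) -> float:
    """Fraction of space saved by collapsing duplicate blocks."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    if not blocks:
        return 0.0
    # Fingerprint each block; identical blocks hash to the same digest.
    unique = {hashlib.sha256(b).digest() for b in blocks}
    return 1.0 - len(unique) / len(blocks)

# A synthetic dataset of 100 blocks drawn from only 10 distinct patterns:
data = b"".join(bytes([i % 10]) * BLOCK_SIZE for i in range(100))
print(f"{dedupe_savings(data):.0%} savings")  # 90% savings
```

This also shows why savings are so dataset-dependent: full backups of the same data dedupe massively, while already-compressed or encrypted data yields almost no duplicate blocks.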
Hope that helps-
2009-05-14 09:14 AM
The FAS270 supports deduplication, but not directly.
If you deduplicate a qtree on a newer device that supports deduplication and then mirror that qtree to a FAS270C machine, the qtree is mirrored perfectly. This only works with qtrees, but it means the FAS270 can hold deduplicated data. It's a commercial problem, not a technical one.
They should allow deduplication on older devices used as backup targets.
2009-05-14 09:49 AM
Hey Reinoud and anyone else,
My two cents on Dedupe and your comments.
One thing to note is that FAS dedupe has a limit on the volume sizes it can dedupe, which depends on the model; this can be a big gotcha if you have volumes of 3 TB or more.
So you have to make sure that the boxes you buy in the future support both the volume sizes you have now and the bigger sizes you expect before your next refresh.
Also, it is my understanding (after chatting with a number of senior site engineers) that NetApp engineers' long-term recommendation for average-size environments (60-500 TB usable) with FlexVols is that volumes should stay under 1.5 TB. If you need large flat file systems, it is recommended to use DFS or some other global namespace product to present them as one container.
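Because those per-model limits are easy to trip over during a refresh, a pre-flight check is worth scripting. A hypothetical sketch - the limit table below is made up for illustration only; the real maximums vary by model and ONTAP release, so substitute the values from the current sizing guide:

```python
# Hypothetical per-model dedupe volume-size limits, in TB.
# Placeholder numbers for illustration; consult NetApp's sizing
# guide for your model and ONTAP release before relying on them.
DEDUPE_VOL_LIMIT_TB = {
    "FAS3020": 1.0,
    "FAS3050": 2.0,
    "FAS6070": 10.0,
}

def can_dedupe(model: str, vol_size_tb: float) -> bool:
    """True if a volume of this size fits under the model's dedupe limit."""
    limit = DEDUPE_VOL_LIMIT_TB.get(model)
    return limit is not None and vol_size_tb <= limit

print(can_dedupe("FAS3020", 3.0))  # False - a 3 TB volume exceeds this limit
print(can_dedupe("FAS6070", 3.0))  # True
```

Running a check like this against both your current volumes and your projected growth is the simple way to avoid buying a box whose dedupe limit you will outgrow before the next refresh.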
FYI, there are big differences in FAS dedupe between 7.2.x and 7.3.x.
I have found it very frustrating on behalf of some of my employers that the FAS960, etc. do not support dedupe; they have large footprints with zero downtime (500+ days of uptime).