2008-09-17 08:32 AM
We have only just invested in our NetApp kit and we're using a 3020c and VMware with iSCSI.
I'm looking into using dedupe for the VMs and am a little overwhelmed by the whole subject, being totally new to the field.
From what I've gathered, you get a dedupe licence from your account manager (I've applied for one), install it, then turn dedupe on. I understand I need to uncheck the space-reserved switches when creating the volumes and LUNs in order to thin provision, so my first question is:
1) If dedupe runs at the volume level, then I guess you get more dedupe savings if you place more VMs on a single volume, so what would people recommend for the optimal layout? I.e. one big volume (well, two, as we have two filers) with one machine per LUN or multiple machines per LUN (I understand it's not recommended to have more than 10-15 per LUN), or the use of multiple volumes, etc.?
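As a rough illustration of the volume-scope point in question 1, here is a toy Python sketch. It assumes dedupe only matches identical fixed-size blocks within one volume; the 4 KB block size, the `make_vm` helper and the image layout (8 shared OS blocks plus 2 unique blocks per VM) are made up for the example and are not taken from any NetApp documentation.

```python
import hashlib

BLOCK = 4096  # assumed block size for this toy model


def blocks(data: bytes):
    """Split data into fixed-size blocks."""
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def unique_blocks(images):
    """Count distinct blocks across images that share one dedupe scope (one volume)."""
    seen = set()
    for img in images:
        for b in blocks(img):
            seen.add(hashlib.sha256(b).digest())
    return len(seen)


def make_vm(vm_id: int, os_blocks: int = 8, data_blocks: int = 2) -> bytes:
    """Hypothetical VM image: identical OS blocks plus a few blocks unique to this VM."""
    os_part = b"".join(bytes([n]) * BLOCK for n in range(os_blocks))
    data_part = b"".join(
        f"vm{vm_id}-blk{n}".encode().ljust(BLOCK, b"\0") for n in range(data_blocks)
    )
    return os_part + data_part


vms = [make_vm(i) for i in range(4)]

total = sum(len(blocks(v)) for v in vms)                        # blocks before dedupe
one_volume = unique_blocks(vms)                                 # all 4 VMs in one volume
two_volumes = unique_blocks(vms[:2]) + unique_blocks(vms[2:])   # 2 VMs per volume
separate = sum(unique_blocks([v]) for v in vms)                 # 1 VM per volume

print(total, one_volume, two_volumes, separate)  # 40 16 24 40
```

In this toy model the shared OS blocks are stored once per volume, so every extra volume keeps its own copy of them. Real savings depend on how similar the guests actually are and on the per-volume limits that apply to your platform.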
The whitepaper TR3428 (pp47) states that transient data should be placed on separate LUNs and volumes, so my next question is:
2) Is the reason for this that (a) much of this transient data is "unique", so it would dramatically reduce the number of snapshots you can keep because there are effectively more changes; (b) dedupe will reclaim more space if just the OS files (non-transient) are on the same volume; or (c) you'd place the transient data on a volume with no snapshot policy to save space? Or is it a combination of all three, or am I grossly misunderstanding the whole concept?
Finally, here is a quote from the VM forums regarding A-SIS. It seems a bit anti-NetApp biased to be honest, but what are your thoughts?
Thanks very much in advance for any replies and apologies for asking such basic questions.
2008-09-19 11:39 AM
Hi Michael, good question, and actually one that comes up pretty regularly. Here's the problem: since dedupe is relatively new, a lot of our documentation is pre-dedupe and does not take it into account. The good news is that there are two new documents, "Configuring LUNs with Deduplication" and "Secrets to Shrinking VMware Storage", that are now posted on the dedupe community. I think you'll find these docs helpful in clearing up any confusion. The note you attached that says "there is no space savings with dedupe on LUNs" is simply untrue and is based on old information, something we are actively trying to clear up as we get the latest info out to you and others in our community. So take a look at the docs below and let us know if this helps.
2008-09-26 06:18 AM
Thanks for the articles. We are looking to use option E in the documentation, that is, to use thin provisioning and return the space to the aggregate. However, we had an engineer in from our reseller yesterday who warned us against thin provisioning at all costs. He says he would only recommend it to someone who has had a few years of NetApp experience, and that it requires constant monitoring as well as carrying a risk of data corruption if oversubscribed. We don't plan to use more than 33% of the available space on the filer (if thin provisioned), so I'm not sure whether I fully agree with his advice.
What are your views on this please?
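To put numbers on the oversubscription concern above, here is a small Python sketch. The aggregate size, LUN sizes and written amounts are hypothetical figures chosen to land near the 33% mentioned in the post, and the function name is invented for this example.

```python
def thin_provisioning_summary(aggregate_gb, lun_sizes_gb, lun_written_gb):
    """Compare what has been promised to hosts with what physically exists."""
    provisioned = sum(lun_sizes_gb)   # space promised to hosts
    written = sum(lun_written_gb)     # space actually consumed on disk
    return {
        "provisioned_gb": provisioned,
        "written_gb": written,
        "commit_ratio": provisioned / aggregate_gb,  # > 1.0 means oversubscribed
        "fill_ratio": written / aggregate_gb,        # how full the aggregate is
        "oversubscribed": provisioned > aggregate_gb,
    }


# Hypothetical: 3000 GB aggregate, four 500 GB thin LUNs, each half written.
modest = thin_provisioning_summary(3000, [500] * 4, [250] * 4)

# Same aggregate, but eight 500 GB LUNs promised to hosts.
aggressive = thin_provisioning_summary(3000, [500] * 8, [250] * 8)

print(modest["oversubscribed"], round(modest["fill_ratio"], 2))        # False 0.33
print(aggressive["oversubscribed"], round(aggressive["commit_ratio"], 2))  # True 1.33
```

In the first case the LUNs could all fill completely and still fit in the aggregate, so running out of space is only a possibility in the second case, where more has been promised than physically exists.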
2008-09-26 08:11 AM
Michael, thanks for bringing this to light. It's a great question, and it's good that you want to understand how dedupe and thin provisioning work.
In addition to the excellent collateral which Larry provided, here are some of the reasons I suspect the engineer said those things:
Thin provisioning left unchecked with unknown workloads is unpredictable, and in the past, when people did not understand it clearly, it required constant checking to validate the workload (from a sizing and growth perspective, and from a change-rate perspective depending on your use of snapshots).
However, with the availability of vol autogrow and snap autodelete, and with dedupe in the equation freeing up even more space on disk so that you are less likely to hit an overflow, these fears are unfounded.
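To make the autogrow and snapshot autodelete idea concrete, here is a toy Python simulation. It is not Data ONTAP's actual algorithm: the `ToyVolume` class, its fields and the grow-first ordering are all assumptions made for illustration, and real systems have their own thresholds and options.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVolume:
    size_gb: int
    max_size_gb: int        # assumed autogrow ceiling
    increment_gb: int       # assumed autogrow step
    data_gb: int = 0
    snapshots_gb: list = field(default_factory=list)  # oldest first

    @property
    def used_gb(self):
        return self.data_gb + sum(self.snapshots_gb)

    def write(self, gb, aggregate_free_gb):
        """Write gb of data; return (actions taken, remaining aggregate free space)."""
        actions = []
        self.data_gb += gb
        while self.used_gb > self.size_gb:
            can_grow = (self.size_gb + self.increment_gb <= self.max_size_gb
                        and aggregate_free_gb >= self.increment_gb)
            if can_grow:                      # assumption: try growing first
                self.size_gb += self.increment_gb
                aggregate_free_gb -= self.increment_gb
                actions.append("grow")
            elif self.snapshots_gb:           # then free space from the oldest snapshot
                self.snapshots_gb.pop(0)
                actions.append("delete_snapshot")
            else:                             # nothing left to reclaim: reject the write
                self.data_gb -= gb
                actions.append("out_of_space")
                break
        return actions, aggregate_free_gb


vol = ToyVolume(size_gb=100, max_size_gb=120, increment_gb=10,
                data_gb=70, snapshots_gb=[15, 10])

first, free = vol.write(20, aggregate_free_gb=50)
second, free = vol.write(10, free)
third, free = vol.write(30, free)
print(first, second, third)
```

The point of the sketch is that both mechanisms only buy time. Once the volume has hit its ceiling and the snapshots are gone, a write can still fail, which is why some monitoring of aggregate free space remains sensible.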
So I agree with your feeling about the advice; however, I would take it a step further (as you currently are) and become even more educated on exactly what you're working with and implementing (option E, as you've chosen), and make sure you understand how your environment will work out in the end.
I wouldn't say a few years of experience is required, merely that you take advantage of the collateral at your disposal, which combines hundreds of years of cumulative experience delivered in bite-sized segments.
Good luck building out your option E deployment, and let us know how it goes!