
NetApp FAS2552 and VMware 5.1 block-mode LUN size and dedupe best practice


We are currently migrating to a FAS2552 (approx. 150 VMware 5.1 guest machines, from Windows 2003 to 2012). On our previous SAN, LUNs were allocated on a per-RAID-group basis, i.e. one created the RAID group, then the LUN, which was then made available to VMware and formatted with VMFS, etc. LUN sizes were largely dictated for you. The NetApp is somewhat different, with its aggregate, FlexVol / LUN configuration.


My query is with regard to LUN size. Currently we are thin provisioning both the LUNs and the VM VMDKs, and grouping like operating systems on the same LUNs to optimise dedupe. We are also allocating only one LUN per FlexVol. So far I have stuck to LUNs of up to 2TB and added no more than 10 guest machines to each. I have also configured various alerts within both NetApp and VMware to indicate when a LUN / datastore is provisioned beyond 100%, is 90% full in VMware, or is growing quickly in NetApp. One concern is that, to get the best out of dedupe, you effectively have to over-allocate your LUNs / vols, i.e.


What are the considerations with regard to LUN sizes? E.g. we have two large file / media servers for two companies. On the face of it, one would think placing them both on the same large LUN, say 6TB, would allow for maximum dedupe (as I believe dedupe is per FlexVol).


We also have a number of SQL servers and, LUN size aside, we are not sure whether we should enable dedupe for LUNs / vols containing SQL databases.


Hence our queries are:

1) What is the best practice on LUN sizes with a FAS2552, or what are the considerations?

2) Should dedupe be used on LUNs / vols containing SQL databases, and if so, are there any considerations?


Thanks for any contributions - apologies if this has been covered previously.




Well... there are two sides to the coin.


If you really want the best deduplication ratio, you could make big volumes and place all LUNs within one volume. As you already know, deduplication works on a per-volume basis, so dedupe would be at its best. The problem is that with all LUNs in one volume, a snapshot of that NetApp volume also snapshots every LUN in it, so you lose granularity.

The other side:

If you place every LUN in its own volume, you'll have the granularity for snapshots but you lose dedupe savings, since everything within one volume is deduped but nothing is deduped across volumes. Also, if you have to restore a complete datastore, this setup makes the restore much faster (since there is less data). If you have modern backup software that restores single VMs, you can forget about that.
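To make the per-volume point concrete, here is a toy sketch in Python. It is only an illustration under stated assumptions, not how ONTAP implements deduplication: blocks are a fixed 4096 bytes, each "LUN" is just a byte string, and the helper names (`unique_blocks`, `stored_blocks`) are made up for this example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumption: toy fixed block size for the illustration


def unique_blocks(data: bytes) -> set:
    """Return the fingerprints of the fixed-size blocks found in data."""
    return {
        hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(data), BLOCK_SIZE)
    }


def stored_blocks(volumes: list) -> int:
    """Count blocks kept when each volume is deduplicated on its own.

    volumes is a list of volumes; each volume is a list of LUN byte strings.
    Identical blocks are shared inside a volume, never across volumes.
    """
    total = 0
    for luns in volumes:
        fingerprints = set()
        for lun in luns:
            fingerprints |= unique_blocks(lun)
        total += len(fingerprints)
    return total


# Hypothetical data: two LUNs sharing the same 8-block "OS image",
# plus one block of data that differs between them.
os_image = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
lun_a = os_image + b"A" * BLOCK_SIZE
lun_b = os_image + b"B" * BLOCK_SIZE

one_volume = stored_blocks([[lun_a, lun_b]])          # both LUNs in one volume
separate_volumes = stored_blocks([[lun_a], [lun_b]])  # one LUN per volume
print(one_volume, separate_volumes)
```

With both LUNs in one volume the shared OS blocks are stored once; with one LUN per volume they are stored once per volume, which is the dedupe you give up in exchange for snapshot granularity.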


So there isn't really one answer as to what's best. It just depends.


E.g. one of my customers placed all the C: drive VMDKs of its virtual machines within one datastore. Since C: contains the OS and the OS is always the same, he has a great dedupe ratio. The "data" drives, where the data or the programs themselves reside, are on a different datastore. Here too the dedupe is great, but the downside is that your VMs are split over multiple datastores.


If I have a customer who wants to thin provision everything, I recommend:

- one thin-provisioned volume for each thin-provisioned LUN, starting the LUN size at about 2TB as you already mentioned

With VMDKs that big, I would create a big datastore of 8-10TB. Since VMFS5 no longer has such a strict maximum size (64TB total datastore size), that makes the most sense.

Having everything thin provisioned mostly comes with overcommitting the aggregate, so monitoring the aggregate's free space is mandatory.
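As a rough sketch of what such monitoring boils down to, the arithmetic looks like this in Python. The function name, its parameters and the sample sizes are hypothetical; the 100% provisioned and 90% full thresholds are the ones mentioned in the original question, and how you collect the numbers from the filer is left out.

```python
def overcommit_report(aggr_size_tb, aggr_used_tb, provisioned_tb, full_warn_pct=90.0):
    """Summarise how far an aggregate is overcommitted and how full it is.

    aggr_size_tb   -- usable size of the aggregate
    aggr_used_tb   -- space physically consumed in the aggregate
    provisioned_tb -- list of the sizes promised to thin-provisioned volumes
    """
    provisioned_pct = 100.0 * sum(provisioned_tb) / aggr_size_tb
    used_pct = 100.0 * aggr_used_tb / aggr_size_tb
    return {
        "provisioned_pct": round(provisioned_pct, 1),
        "used_pct": round(used_pct, 1),
        # More space promised than physically exists.
        "overcommitted": provisioned_pct > 100.0,
        # Physical usage has crossed the warning threshold.
        "nearly_full": used_pct >= full_warn_pct,
    }


# Hypothetical example: a 20TB aggregate holding eight 2TB volumes and one 8TB volume.
report = overcommit_report(20.0, 12.5, [2.0] * 8 + [8.0])
print(report)
```

Being overcommitted is expected with thin provisioning; the number to watch closely is the physical usage, because that is what actually runs out.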


Having SQL LUNs deduped? I would give it a try. If you have multiple DB files on those LUNs (e.g. a DB split into several files for performance), this could make sense. I would enable it and look at the dedupe ratio. If the savings are less than 5-10%, I would disable it, unless you have Flash Pool in use, in which case dedupe could be interesting again. I think it wouldn't make sense if a lot of data is permanently flushed and the database is reorganized regularly, so it also depends on the data in your database.

My rule for DBs: more static DBs, yes; more dynamic, no.
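The "enable it, measure, then decide" rule above can be written down as a small Python sketch. The function names are hypothetical, the savings formula (saved space divided by used plus saved) is an assumption about how the percentage is reported, and the 5% default is just the lower end of the 5-10% range mentioned above.

```python
def savings_pct(used_gb, saved_gb):
    """Percentage of space saved, assuming savings = saved / (used + saved)."""
    logical = used_gb + saved_gb
    return 100.0 * saved_gb / logical if logical else 0.0


def keep_dedupe(pct_saved, flash_pool=False, min_savings_pct=5.0):
    """Decide whether to leave dedupe enabled on a volume after a trial run.

    With Flash Pool in use, dedupe stays on regardless of the savings;
    otherwise it stays on only if the savings reach the threshold.
    """
    if flash_pool:
        return True
    return pct_saved >= min_savings_pct


# Hypothetical volumes: a fairly static DB and a very active one.
static_db = savings_pct(used_gb=600.0, saved_gb=200.0)  # 25.0
active_db = savings_pct(used_gb=970.0, saved_gb=30.0)   # 3.0
print(keep_dedupe(static_db), keep_dedupe(active_db), keep_dedupe(active_db, flash_pool=True))
```

The threshold is a judgment call; pick whatever value makes the background job worth running on your system.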


Dedupe is just a job that by default runs at around midnight. In my opinion the schedule should be customized so that not all volumes start at the same time. That way the impact on the system is much easier to isolate if you run into performance issues.
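One simple way to spread the start times is a round-robin over a night-time window, sketched here in Python. The function name, the volume names and the 22:00 start with an 8-hour window are all assumptions for illustration; the resulting hours would still have to be entered into whatever schedule setting your ONTAP version uses.

```python
def stagger_hours(volumes, first_hour=22, window_hours=8):
    """Assign each volume a dedupe start hour (0-23), round-robin over a window.

    Volumes are sorted by name so the assignment is stable between runs.
    """
    return {
        vol: (first_hour + i % window_hours) % 24
        for i, vol in enumerate(sorted(volumes))
    }


# Hypothetical volume names.
schedule = stagger_hours(["vol_sql", "vol_os", "vol_media"])
for vol, hour in schedule.items():
    print(f"{vol}: start at {hour:02d}:00")
```

With more volumes than hours in the window, several volumes share a start hour, but the load is still spread far better than having everything start at midnight.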


Hope that helps a bit with your decisions.


Hi Honig2012,


Some good points for me to consider.


We do have a number of relatively static DBs and some very active ones, so we will potentially disable dedupe on the more active DBs.


Thanks for the response ....


We actually run dedupe on both our MS SQL DBs and our Oracle DBs, and haven't seen any major issues with it. Our DBs are hyperactive and handle massive amounts of data (10-2000TB). So I wouldn't be too worried about dedupe on NetApp where databases are concerned. It's worth mentioning that on some databases we see massive savings from dedupe, while on others the savings are minor, but overall the value is well worth it.