De-Dup and Linux VMware VMs

jasonczerak · 2010-04-23

So, I have this interesting question. Currently I use Spacewalk/Kickstart to build my Linux VMs. It's easier to deal with change than some sort of VMware template.

So, we are deploying our new NFS datastores for ESX4i. I kicked off 10 identical VM builds on a volume with 3 other older Linux VMs and 2 Windows VMs. So in theory, everything is identical (CPU, RAM, disk size, packages, patches, configuration) with the exception of server name and IP, SSH key, Spacewalk UUID, and log messages (name difference). There should be a few KB of differences and that is it, right?

Yet, on a datastore with these 10 VMs plus 2 older ones with 7G of usage (each), but with maybe all but 10 packages different, I'm at 50% de-duplication.

On the guest, df -h = 3.3G. On the NFS volume, they vary: 3.9 to 5.9 GB used.

Spacewalk reports identical packages. Top numbers are within 1K on the memory used!

I'm expecting the dedup with this small "identical" data set to be near 80-90%, right? (Just guessing, but more than 50%!)
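As a rough sanity check on that guess, here is a minimal sketch of the ideal-case arithmetic. It assumes the 10 new VMs are block-for-block identical at about 3.3 GB each and ignores the older VMs on the volume; `ideal_savings` is a hypothetical helper for illustration, not a NetApp tool.

```python
def ideal_savings(copies: int, size_gb: float) -> float:
    """Fraction of space saved if `copies` identical images collapse to one."""
    logical = copies * size_gb   # space consumed with no dedup at all
    physical = size_gb           # one shared copy after perfect dedup (assumption)
    return 1.0 - physical / logical


# Ten identical 3.3 GB guests, best case.
print(f"{ideal_savings(10, 3.3):.0%}")
```

Under those assumptions the ceiling is about 90%, so 50% on a mostly identical set does look low.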

So, I'm assuming here that ext3 is just randomly stringing data along on the file system for some reason. Is this what others are seeing?

keitha · 2010-04-23

Just thinking you might also want to check to be sure you ran a "sis -s start" and not just a "sis start", as the latter would only look at the data written since you turned on dedupe, so you might have missed some of the original VMs. Also be sure there are no snapshots in the volume, as those will limit the dedupe savings until they rotate out.

jasonczerak · 2010-04-23

Did all that. Forced a sis start manually too.

I'm going to tar up 2 of the VMs off to an NFS share, format, and un-tar to see if a more sequential write will gain some de-dup back, for giggles. If the VMs break, who cares, I'll re-image them. I've got a few days here to toy around.

amiller_1 · 2010-04-23

Hmm....are you using thin provisioning at the vSphere level (i.e. vmdk)? If not, that might be an option to see if ext3 is putting odd bits of data into its free space.

jasonczerak · 2010-04-26

Yep, according to the VMware guy.

Here's an ls of the NFS mount, though.

[root@owbsljputl01:owbsljierp01]$ ls
total 6.3G
drwxr-xr-x 2 root root 4.0K Apr 23 09:59 .
drwxr-xr-x 14 root root 4.0K Apr 23 10:03 ..
-rwxrwxr-x 1 root root 84 Apr 26 08:47 .lck-6ead610000000000
-rw------- 1 root root 10G Apr 26 08:47 owbsljierp01-flat.vmdk
-rw------- 1 root root 8.5K Apr 23 09:59 owbsljierp01.nvram
-rw------- 1 root root 526 Apr 23 09:59 owbsljierp01.vmdk
-rw------- 1 root root 0 Apr 23 09:57 owbsljierp01.vmsd
-rwxr-xr-x 1 root root 3.2K Apr 25 04:05 owbsljierp01.vmx
-rw------- 1 root root 267 Apr 25 04:05 owbsljierp01.vmxf
-rw-r--r-- 1 root root 156K Apr 23 09:59 vmware-0.log
-rw-r--r-- 1 root root 159K Apr 23 09:57 vmware-1.log
-rw-r--r-- 1 root root 248K Apr 23 09:57 vmware-2.log
-rw-r--r-- 1 root root 144K Apr 23 09:57 vmware-3.log
-rw-r--r-- 1 root root 144K Apr 25 04:05 vmware.log
[root@owbsljputl01:owbsljierp01]$

The total suggests thin. The file size doesn't, but it is.

Just odd.
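The gap between the two numbers is expected for a sparse file: the `total` line counts allocated blocks, while the per-file column shows the apparent size. A minimal sketch of checking both for any file, assuming a Linux client where `st_blocks` is counted in 512-byte units; `apparent_and_allocated` is a hypothetical helper name, and the temporary file below merely stands in for a thin `-flat.vmdk`.

```python
import os
import tempfile
from typing import Tuple


def apparent_and_allocated(path: str) -> Tuple[int, int]:
    """Return (apparent size in bytes, allocated bytes) for a file."""
    st = os.stat(path)
    # st_blocks is reported in 512-byte units on Linux (assumption for other OSes).
    return st.st_size, st.st_blocks * 512


# Demo: extend an empty file to 10 MiB without writing any data.
with tempfile.NamedTemporaryFile() as f:
    f.truncate(10 * 1024 * 1024)
    f.flush()
    size, allocated = apparent_and_allocated(f.name)
    print(f"apparent: {size} bytes, allocated: {allocated} bytes")
```

On a filesystem that supports sparse files, the allocated figure is typically far smaller than the apparent one, which matches a 10G vmdk inside a 6.3G directory total.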

rickymartin · 2010-05-23

I've seen this kind of thing before when the VM image wasn't aligned properly. If so, you should try to address this as soon as you can; you'll find that under heavy I/O, unaligned writes get throttled and your performance will probably be less than you'd like. I saw a separate post from you regarding some write performance issues you'd seen, which I also suspect might be related to unaligned I/O.

jasonczerak · 2010-05-24

See, none of us figured new NFS file systems and new VMware datastores would need to be aligned! The vmdk is placed on NFS aligned, therefore the data within is.

The problem we have now is that 100s of Windows and a handful (40-ish) of Linux VMs that were once on FC LUNs + VMFS, and that need to be Storage vMotioned over to NFS, will be unaligned....

I'm going to be experimenting with this de-dup deal here shortly, with aligned VMs vs. not-so-aligned VMs.

chriskranz · 2010-05-24

The VMDK is still a virtualised filesystem on top of another filesystem (WAFL), so you need to make sure that this gets aligned correctly. I've never understood why misaligned VMs would affect dedupe so much; surely if they are all deployed the same, they are all misaligned the same, so they would all dedupe the same? But real experience seems to show that it can have quite a big impact on dedupe.
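That intuition can be checked with a toy model. The sketch below assumes dedupe works on fixed 4 KB blocks compared by hash, and it models a disk image as zero padding followed by the guest data; `make_image`, `block_hashes` and `shared_fraction` are hypothetical names, and this is not how WAFL is actually implemented. It only shows that two images with the same starting offset share every block, while the same data at two different offsets shares almost none. It does not explain why identically misaligned VMs would dedupe poorly in practice.

```python
import hashlib

BLOCK = 4096  # fixed dedupe block size in bytes (assumption of this toy model)


def make_image(payload: bytes, offset: int) -> bytes:
    """Toy disk image: `offset` zero bytes, then the guest data, padded to whole blocks."""
    raw = bytes(offset) + payload
    return raw + bytes((-len(raw)) % BLOCK)


def block_hashes(img: bytes) -> set:
    """Set of SHA-256 digests, one per fixed-size block of the image."""
    return {hashlib.sha256(img[i:i + BLOCK]).digest() for i in range(0, len(img), BLOCK)}


def shared_fraction(a: bytes, b: bytes) -> float:
    """Blocks common to both images divided by all distinct blocks in either."""
    ha, hb = block_hashes(a), block_hashes(b)
    return len(ha & hb) / len(ha | hb)


# Deterministic, non-repeating stand-in for 64 blocks of guest filesystem data.
payload = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(64 * BLOCK // 32))

same_offset = shared_fraction(make_image(payload, 63 * 512), make_image(payload, 63 * 512))
mixed_offset = shared_fraction(make_image(payload, 63 * 512), make_image(payload, 64 * 512))
print(same_offset, mixed_offset)
```

In this model the practical risk is mixing images with different offsets on one volume, for example freshly built VMs next to older migrated ones.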

Not sure how you would go about fixing this if you are not using VM templates, though. Usually we would block-align a VM template and use that for all deployments. If you are using a kickstart, maybe you can have an aligned disk that you clone for each new machine rather than creating a new disk each time?

This is a pain, and I have helped many customers get their infrastructure into alignment (currently doing a VM View template right now!), but it's not unique to NetApp or even VMware. Blame Microsoft and Linux for assuming that everyone is still using single physical disks and making partition offsets based on sectors and cylinders! How very 90's!
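To make the sectors-and-cylinders point concrete, here is a minimal sketch of an alignment check. It assumes 512-byte sectors and a 4 KB storage block; `is_aligned` is a hypothetical helper, and the start sector would come from the guest's own partition table.

```python
def is_aligned(start_sector: int, sector_size: int = 512, block_size: int = 4096) -> bool:
    """True if the partition's starting byte offset falls on a storage block boundary."""
    return (start_sector * sector_size) % block_size == 0


# Legacy CHS-style layouts start the first partition at sector 63 (byte 32256).
print(is_aligned(63))   # 32256 is not a multiple of 4096
print(is_aligned(64))   # 32768 is a multiple of 4096
```

Starting the partition at sector 64, or any other multiple of 8 sectors, puts guest filesystem blocks on 4 KB boundaries under these assumptions.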

Sorry you also have this pain, hope you can get it sorted though!