VMware Solutions Discussions

Misaligned VM performance impact analysis on NFS datastores ~ 18%



We are in the process of aligning our VMs on our NFS datastores. I read with interest the NetApp doc outlining the performance impact from the NetApp point of view, but I did not see the impact quantified.

Since alignment currently requires the VM to be down (integrate with Storage vMotion, please?!), I decided to design a test that measures the impact of aligned vs. misaligned from the VM's point of view.

The setup: I did this on an otherwise quiesced lab system (a 2-node Dell 1950 cluster running vSphere, plus a NetApp 2020 serving NFS datastores):

1) Take a misaligned Linux VM (as checked by mbrscan)

2) Clone the VM

3) Align the clone with mbralign

Now we have two Linux VMs: M (misaligned) and A (aligned).
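
For context, the check behind "misaligned" comes down to whether a partition's starting byte offset is a multiple of the storage block size. Here is a minimal Python sketch, assuming 512-byte sectors and the 4 KB WAFL block size. The function name is made up for illustration, and this is not a description of how mbrscan itself is implemented:

```python
def is_aligned(start_sector, sector_size=512, block_size=4096):
    """Return True if the partition's first byte falls on a block boundary."""
    return (start_sector * sector_size) % block_size == 0


# Sector 63 is the classic MBR default start; 64 and 2048 are common aligned starts.
for sector in (63, 64, 2048):
    print(sector, sector * 512, is_aligned(sector))
```

A partition starting at sector 63 begins at byte 32256, which is not a multiple of 4096, so every 4 KB guest block straddles two storage blocks.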

I wanted a way to generate IO of varying sizes - I used this script:

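[fcocquyt@lab-vm-01 ~]$ more generateIO.csh

# For each block size from 1024 to 8192 bytes, write and checksum 19 test files.
set x = 1
set bs = 1024

while ( $bs < 9000 )
    echo $bs
    while ( $x < 20 )
        dd if=/dev/zero of=tstfile$x bs=$bs count=10240
        sum tstfile$x
        @ x++
    end
    # Clean up, then move on to the next block size.
    rm tstfile*
    @ bs += 1024
    set x = 1
end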
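
To see how run time varies per block size rather than only in total, the same loop could be timed one block size at a time. The following is a rough Python sketch of that idea, not the script used for the numbers in this post. The function name `time_writes` and its parameters are hypothetical, and unlike the csh script it calls fsync after each file and does not read the files back with sum:

```python
import os
import tempfile
import time


def time_writes(block_sizes, count, files, directory):
    """Write `files` zero-filled files per block size; return seconds per block size."""
    elapsed = {}
    for bs in block_sizes:
        buf = bytes(bs)  # bs zero bytes, like reading from /dev/zero
        paths = []
        start = time.perf_counter()
        for i in range(1, files + 1):
            path = os.path.join(directory, f"tstfile{i}")
            # buffering=0 so each write() call is issued with exactly bs bytes
            with open(path, "wb", buffering=0) as f:
                for _ in range(count):
                    f.write(buf)
                os.fsync(f.fileno())
            paths.append(path)
        elapsed[bs] = time.perf_counter() - start
        for path in paths:
            os.remove(path)
    return elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # Small count so the demo finishes quickly; the script above uses 10240.
        for bs, secs in time_writes(range(1024, 9000, 1024), 64, 3, d).items():
            print(bs, round(secs, 4))
```

Running it in a directory on the guest filesystem of each VM gives one timing per block size that can be compared between M and A.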

What I found from repeated runs of this script on both the M and A VMs was that the misaligned VM took an average of 18% longer to run the same IO.
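
For anyone repeating the test, the "percent longer" figure can be derived from the per-run wall-clock times as sketched below. The timings in the example are placeholders chosen for easy arithmetic, not the measurements from this test, and the function name is made up:

```python
from statistics import mean


def percent_slower(aligned_secs, misaligned_secs):
    """Percent by which the mean misaligned run time exceeds the mean aligned one."""
    a = mean(aligned_secs)
    m = mean(misaligned_secs)
    return (m - a) / a * 100.0


# Placeholder timings in seconds, NOT the measurements from this test.
aligned = [50.0, 50.0]
misaligned = [60.0, 60.0]
print(f"{percent_slower(aligned, misaligned):.1f}% longer")
```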

I also captured /usr/lib/vmware/bin/vscsiStats output, but interestingly those numbers (latency and outStandingIOs, for example) did not show the same result: average latency was about the same for the M and A VMs.

I welcome any and all comments on this analysis

One open area is block size. I suspect the block size has a big effect on latency: while the script was stepping through the block sizes, I observed the throughput varying quite a bit.
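
One way to reason about why block size might matter is to count how many 4 KB storage blocks a single write covers only partially. The Python sketch below is a simplified model under stated assumptions: a 4 KB WAFL block size and a partition starting at byte 32256 (sector 63), neither of which is stated in this post. It also ignores the guest page cache and filesystem block size, which change what actually reaches the datastore. The function name is hypothetical:

```python
BLOCK = 4096  # assumed storage block size in bytes


def partial_blocks(offset, length, shift=0, block=BLOCK):
    """Count storage blocks that a write covers only partially.

    offset and length describe the write inside the guest partition;
    shift is the partition's starting byte offset within the virtual disk.
    """
    start = offset + shift
    end = start + length  # exclusive
    first = start // block
    last = (end - 1) // block
    partial = 0
    for b in range(first, last + 1):
        lo = max(start, b * block)
        hi = min(end, (b + 1) * block)
        if hi - lo < block:
            partial += 1
    return partial


# Compare an aligned partition (shift 0) with one starting at sector 63.
for bs in range(1024, 9000, 1024):
    print(bs, partial_blocks(0, bs), partial_blocks(0, bs, shift=63 * 512))
```

In this model an aligned 4096-byte write touches no partial blocks, while the same write on the shifted partition touches two.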

But the finding of 18% impact is in line with my expectation for NFS datastores...




Very handy info -- thanks much for posting.

I generally tell people that you have a certain performance "ceiling" based on the filer head and/or number of spindles (usually driven more by spindle count). Misalignment just means that you'll hit that "ceiling" sooner than you would otherwise. Until you hit that ceiling you won't see a huge difference (although 18% is higher than I would have thought, so that is very good to know). Once you do hit that ceiling, it's the same as when you max out your backend disk I/O under any circumstances (i.e. things get very slow). You're just going to get there faster due to the impact of misalignment.


Turns out the impact of misaligned VMs can be more dramatic if the level of unaligned IO tips ONTAP into synchronous mode.

One indicator is the pw.over_limit stat -

ref: TR-3593.pdf

1.4 Counters that indicate Improper Alignment
There are various ways of determining if you do not have proper alignment. Using perfstat counters, under the wafl_susp section, "wp.partial_writes", "pw.over_limit", and "pw.async_read" are indicators of improper alignment. The "wp.partial_writes" counter is the block counter of unaligned I/O. If more than a small number of partial writes happen, then WAFL® will launch a background read. These are counted in "pw.async_read"; "pw.over_limit" is the block counter of the writes waiting on disk reads.

This counter is not exposed via SNMP as standard, but can be trended as outlined here: