Misaligned vmdk with offset group 2 causing read latency

dtapper · 2016-11-22

Hi,

We are seeing read latency on one particular VMDK data disk in our environment. The Optimization and Migration analysis in VSC 6.2.1 reports misalignment with offset 2. The server itself runs Windows Server 2012 R2, so as far as I know misalignment shouldn't be an issue.

Does anyone know what offset 2 means and how to correct it? Disk and partition management was left at the defaults, except for the NTFS allocation unit size, which was increased from the default 4KB to 64KB.

Environment:

VMware vSphere 6.0

NFS data stores

Windows 2012 R2

Application: MS SQL Server 2012

Misaligned disk size: 1100GB

Disk type: GPT

NTFS Allocation unit size = 64K

BlockSize  Index  Name                   StartingOffset
512        0      Disk #3, Partition #0  135266304
512        0      Disk #2, Partition #0  135266304
512        0      Disk #1, Partition #0  135266304
512        0      Disk #0, Partition #0  1048576
512        1      Disk #0, Partition #1  315621376
512        2      Disk #0, Partition #2  553648128
512        3      Disk #0, Partition #3  419430400

BR,

David T

asulliva · 2016-11-22

The misalignment number represents how many 512-byte segments the IO is misaligned from a 4KB WAFL block. So, an offset of 2 means that the IO starts 1KB into the block instead of at the start of the block.
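
As a rough illustration of that arithmetic, the sketch below maps a byte offset to its offset group, assuming 512-byte segments and 4KB WAFL blocks as described above. The function name `offset_group` is made up for this example and is not part of VSC or ONTAP.

```python
SECTOR_SIZE = 512        # bytes per segment, as described above
WAFL_BLOCK_SIZE = 4096   # bytes per WAFL block, as described above


def offset_group(byte_offset: int) -> int:
    """Return how many 512-byte segments byte_offset lies past a 4KB boundary (0 to 7)."""
    return (byte_offset % WAFL_BLOCK_SIZE) // SECTOR_SIZE


print(offset_group(1024))       # 2: the IO starts 1KB into the block
print(offset_group(135266304))  # 0: the partition offset posted above is on a 4KB boundary
```

A result of 0 means the offset is aligned; any other value is the offset group in the sense used here.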

VSC corrects for this by creating a volume which is offset in the opposite direction, which effectively realigns the IO. You can also correct it a number of other ways, though none of them are non-disruptive.

As to why this occurred with Server 2012 R2, that's a mystery to me. Was the server upgraded from a previous version?

Andrew


dtapper · 2016-11-23

Hi,

Thanks for your reply, Andrew. I got the same explanation from NetApp support today:

"A WAFL block is 4K, consisting of 8x 512b buckets marked from 0 to 7. These are also called offsets in this context. So, offset 2 means that you are reading/writing from the 3rd bucket, which means to read/write 4k you need to also merge, read and write the first 3x 512b of the next block. This clearly adds overhead (misalignment) and can cause performance issues."
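
To make the overhead concrete, here is a minimal sketch that lists which 4K blocks a single 4K request covers, assuming the 8 x 512b bucket layout from the quote. The helper name `blocks_touched` is hypothetical and only models the arithmetic, not how ONTAP actually services the request.

```python
SECTOR_SIZE = 512        # one "bucket" in the quote above
WAFL_BLOCK_SIZE = 4096   # 8 buckets per WAFL block


def blocks_touched(start_byte: int, length: int = WAFL_BLOCK_SIZE) -> list:
    """Return the indexes of the 4K blocks covered by a request of `length` bytes."""
    first = start_byte // WAFL_BLOCK_SIZE
    last = (start_byte + length - 1) // WAFL_BLOCK_SIZE
    return list(range(first, last + 1))


print(blocks_touched(0))                # [0]: aligned, one block
print(blocks_touched(2 * SECTOR_SIZE))  # [0, 1]: starts at bucket 2, spills into the next block
```

In this model a 4K request that starts at bucket 2 covers buckets 2 to 7 of one block and continues into the start of the next, so every such request touches two blocks instead of one.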

Found an interesting post regarding alignment and newer versions of Windows/SQL Server: https://blogs.msdn.microsoft.com/jimmymay/2014/03/14/disk-partition-alignment-it-still-matters-dpa-for-windows-server-2012-sql-server-2012-and-sql-ser...

It looks like alignment is still an important topic even after Windows Server 2008, especially if you're running MS SQL Server with a lot of random reads.

BR,

David

dtapper · 2016-11-25

Hi again,

Here you can see the misalignment in VSC 6.2.1.

The disk shown above is the annotated entry in the listing below. It uses the default starting offset, which is divisible by 4096 (the WAFL 4K block size).

Disk type: GPT

NTFS Allocation unit size = 65536

BlockSize  Index  Name                   StartingOffset
512        0      Disk #3, Partition #0  135266304
512        0      Disk #2, Partition #0  135266304
512        0      Disk #1, Partition #0  135266304  (/ 4096 = 33024, divisible by 4096)
512        0      Disk #0, Partition #0  1048576
512        1      Disk #0, Partition #1  315621376
512        2      Disk #0, Partition #2  553648128
512        3      Disk #0, Partition #3  419430400
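
The same divisibility check can be run over every StartingOffset in the listing. This is only a sketch: the values are copied from the listing, the function name `misaligned` is made up, and it checks the guest partition offsets only, not whatever VSC measures on the storage side.

```python
WAFL_BLOCK_SIZE = 4096  # WAFL 4K block size

# StartingOffset values copied from the listing above
starting_offsets = {
    "Disk #3, Partition #0": 135266304,
    "Disk #2, Partition #0": 135266304,
    "Disk #1, Partition #0": 135266304,
    "Disk #0, Partition #0": 1048576,
    "Disk #0, Partition #1": 315621376,
    "Disk #0, Partition #2": 553648128,
    "Disk #0, Partition #3": 419430400,
}


def misaligned(offsets: dict, block_size: int = WAFL_BLOCK_SIZE) -> dict:
    """Return only the partitions whose offset is not a multiple of block_size."""
    return {name: off % block_size for name, off in offsets.items() if off % block_size}


print(misaligned(starting_offsets))  # {}: every partition starts on a 4K boundary
```

An empty result means all partition offsets are multiples of 4096, which matches the manual calculation for Disk #1.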

So if the disk/partition is aligned, how can we still get misaligned IO?

The NTFS allocation unit size is not the default; it has been increased to 64K in line with MS SQL Server best practices. However, in our test environment it is the default 4K, and that disk is also misaligned with offset group 2.

Any hint?

BR,

David T