LUN alignment with Solaris ZFS

Could anyone help me with this issue?

We are currently suffering from low performance at peak times during business hours.

Looking for a cause, I ran Perfstat and found in the alignment section that all my Solaris ZFS LUNs show read and write IOPS spread across all buckets, while all my Windows hosts have 80-100% of their read/write ops in bucket 4 or 7, even though their partition offsets are divisible by 4096.

I don't know what to do...

Solaris ZFS example:

lun:/vol/corretorbancos/bancos-W-DJKoXT6c8S:read_align_histo.0:12%
lun:/vol/corretorbancos/bancos-W-DJKoXT6c8S:read_align_histo.1:11%
lun:/vol/corretorbancos/bancos-W-DJKoXT6c8S:read_align_histo.2:11%
lun:/vol/corretorbancos/bancos-W-DJKoXT6c8S:read_align_histo.3:9%
lun:/vol/corretorbancos/bancos-W-DJKoXT6c8S:read_align_histo.4:13%
lun:/vol/corretorbancos/bancos-W-DJKoXT6c8S:read_align_histo.5:14%
lun:/vol/corretorbancos/bancos-W-DJKoXT6c8S:read_align_histo.6:11%
lun:/vol/corretorbancos/bancos-W-DJKoXT6c8S:read_align_histo.7:13%
lun:/vol/corretorbancos/bancos-W-DJKoXT6c8S:write_align_histo.0:23%
lun:/vol/corretorbancos/bancos-W-DJKoXT6c8S:write_align_histo.1:11%
lun:/vol/corretorbancos/bancos-W-DJKoXT6c8S:write_align_histo.2:4%
lun:/vol/corretorbancos/bancos-W-DJKoXT6c8S:write_align_histo.3:7%
lun:/vol/corretorbancos/bancos-W-DJKoXT6c8S:write_align_histo.4:2%
lun:/vol/corretorbancos/bancos-W-DJKoXT6c8S:write_align_histo.5:6%
lun:/vol/corretorbancos/bancos-W-DJKoXT6c8S:write_align_histo.6:17%
lun:/vol/corretorbancos/bancos-W-DJKoXT6c8S:write_align_histo.7:4%
lun:/vol/corretorbancos/bancos-W-DJKoXT6c8S:read_partial_blocks:0%
lun:/vol/corretorbancos/bancos-W-DJKoXT6c8S:write_partial_blocks:20%

Windows host SQL Server example:

lun:/vol/teamworks/db-W-DJKoX2Y8N1:read_align_histo.0:0%
lun:/vol/teamworks/db-W-DJKoX2Y8N1:read_align_histo.1:100%
lun:/vol/teamworks/db-W-DJKoX2Y8N1:read_align_histo.2:0%
lun:/vol/teamworks/db-W-DJKoX2Y8N1:read_align_histo.3:0%
lun:/vol/teamworks/db-W-DJKoX2Y8N1:read_align_histo.4:0%
lun:/vol/teamworks/db-W-DJKoX2Y8N1:read_align_histo.5:0%
lun:/vol/teamworks/db-W-DJKoX2Y8N1:read_align_histo.6:0%
lun:/vol/teamworks/db-W-DJKoX2Y8N1:read_align_histo.7:0%
lun:/vol/teamworks/db-W-DJKoX2Y8N1:write_align_histo.0:0%
lun:/vol/teamworks/db-W-DJKoX2Y8N1:write_align_histo.1:99%
lun:/vol/teamworks/db-W-DJKoX2Y8N1:write_align_histo.2:0%
lun:/vol/teamworks/db-W-DJKoX2Y8N1:write_align_histo.3:0%
lun:/vol/teamworks/db-W-DJKoX2Y8N1:write_align_histo.4:0%
lun:/vol/teamworks/db-W-DJKoX2Y8N1:write_align_histo.5:0%
lun:/vol/teamworks/db-W-DJKoX2Y8N1:write_align_histo.6:0%
lun:/vol/teamworks/db-W-DJKoX2Y8N1:write_align_histo.7:0%
lun:/vol/teamworks/db-W-DJKoX2Y8N1:read_partial_blocks:0%
lun:/vol/teamworks/db-W-DJKoX2Y8N1:write_partial_blocks:0%
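
For anyone reading these counters, here is a rough Python sketch of how the two histograms above could be summarized. It assumes the usual reading of the alignment buckets: bucket N means the I/O starts N x 512 bytes past a 4k block boundary, so bucket 0 is aligned. The 80% threshold and the function name are made up for illustration and are not part of Perfstat.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   lun:/vol/teamworks/db-W-DJKoX2Y8N1:read_align_histo.1:100%
LINE_RE = re.compile(
    r"^lun:(?P<lun>.+):(?P<op>read|write)_align_histo\.(?P<bucket>[0-7]):(?P<pct>\d+)%$"
)

def summarize_alignment(lines, threshold=80):
    """Return {(lun, op): verdict} from perfstat-style histogram lines.

    threshold is a hypothetical cut-off: if one bucket holds at least
    that percentage of the ops, the LUN is treated as having a fixed offset.
    """
    histo = defaultdict(lambda: [0] * 8)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:  # lines like *_partial_blocks do not match and are skipped
            histo[(m["lun"], m["op"])][int(m["bucket"])] = int(m["pct"])

    report = {}
    for key, buckets in histo.items():
        top = max(range(8), key=lambda b: buckets[b])
        if buckets[top] >= threshold:
            # Assumed meaning: bucket N = N * 512 bytes past a 4k boundary
            report[key] = "aligned" if top == 0 else f"misaligned by {top * 512} bytes"
        else:
            report[key] = "spread across buckets"
    return report
```

Fed the output above, this would report the ZFS LUN as "spread across buckets" for both reads and writes, and the SQL Server LUN as sitting at a constant offset in bucket 1.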

Need help

Re: LUN alignment with Solaris ZFS

I have a customer with the same problem. The alignment is correct and the starting offset is correct. The problem seems to be that ZFS randomly changes its block sizes as it works, from as small as 512 bytes upward. This behaviour appeared with Solaris 10 Update 7 (Updates 3 and 5 behave differently). I am not aware of any fix other than not using ZFS if you are on Solaris 10 Update 7 or later. You might want to check with a ZFS expert whether this is a bug in ZFS, since it affects the performance of any array. Also, changing the recordsize doesn't seem to help here.

Re: LUN alignment with Solaris ZFS

Thank you, cschnidr. It's not the answer I was hoping for, but it is an answer anyway.

I'm opening a case with Oracle/Sun to check whether there is a way to force ZFS to use 4k blocks instead of random 512-byte sectors.

Re: LUN alignment with Solaris ZFS

You could try setting the ZFS recordsize; this is actually recommended when ZFS is used for databases with a fixed block size.

It can be set at runtime, but it will affect only files created after the value has been changed.

Re: LUN alignment with Solaris ZFS

We set a 4k recordsize and ran a batch of processes while running Perfstat, in a second attempt to fix the problem. Performance did improve, a bit faster and more responsive, but it didn't fix the alignment problem.

It seems that ZFS treats disks as having hard-set 512-byte sectors, regardless of whether it is a real hard drive or a logical LUN, and randomizes its start sector. That would not be a problem on today's hard drives, but it is with logical LUNs that work with 4k blocks...

Re: LUN alignment with Solaris ZFS

There is a TR for ZFS http://media.netapp.com/documents/tr-3603.pdf that has not been updated in 3 years.  If you followed all the steps in the TR, I would try contacting the author of the article to see if they can provide any further assistance.

Thanks,

Mitchell

Re: LUN alignment with Solaris ZFS

Thanks man, will check immediately.

Re: LUN alignment with Solaris ZFS

It might also help to set up your ZIL so that writes have more time to be aligned before being flushed.

Re: LUN alignment with Solaris ZFS

Bullseye:

"

There are two ways that you can provision a disk for use with ZFS.

• VTOC - Use the normal VTOC label and provision a single slice for use with ZFS. Create a slice that encompasses the entire disk and provision that single slice to ZFS.

• EFI – Invoke the advanced "format -e" command and label the disk with an EFI label. Create a slice starting from sector 40 so that it is aligned with WAFL® and provision that slice for use with ZFS.

There is no separate LUN type on the storage array for EFI labels or EFI offsets, so use the "Solaris" LUN type and manually align a slice encompassing the entire disk when using EFI labels. The LUN type "Solaris" is also used for the normal VTOC label.

If we provision the entire disk instead of a slice for use with ZFS, then ZFS will format the disk with an EFI label, which will be unaligned with WAFL. For this reason, it is recommended to use whole "aligned" slices (for EFI labels) that encompass the size of the disk for use with ZFS. We will have to manually align the slice and then provision the slice for use with ZFS if we want to use EFI labels. EFI disks add support for disks greater than 1TB. There is no need to "align" the slice if VTOC labels are used; just create a slice that is the entire size of the disk and provision that slice to ZFS."
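
The "sector 40" advice in the quote can be checked with simple arithmetic. The sketch below is only an illustration: it assumes 512-byte sectors and 4k WAFL blocks, and it uses sector 34 as an example of a start sector that is not on a 4k boundary (sector 34 is commonly the first usable sector of an EFI label, but treat that as an assumption and check your own label).

```python
SECTOR_BYTES = 512        # assumed sector size presented by the LUN
WAFL_BLOCK_BYTES = 4096   # 4k block size on the storage side

def alignment_bucket(start_sector):
    """Offset of a slice start from the previous 4k boundary, in 512-byte steps."""
    return (start_sector * SECTOR_BYTES % WAFL_BLOCK_BYTES) // SECTOR_BYTES

def is_aligned(start_sector):
    """True when the slice starts exactly on a 4k boundary."""
    return alignment_bucket(start_sector) == 0

# Sector 34 is an assumed example of an unaligned start; 40 comes from the TR.
for sector in (34, 40):
    print(sector, sector * SECTOR_BYTES, alignment_bucket(sector), is_aligned(sector))
```

Sector 40 starts at byte 20480, which is exactly 5 x 4096, so the slice is aligned. Sector 34 starts at byte 17408, which is 1024 bytes past a 4k boundary.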

Thanks a lot Mitchells, you saved the day !

Re: LUN alignment with Solaris ZFS

The solaris_efi LUN type was added to the filer in 7.3.1. You just need to make sure that the offset is implemented either manually or by the filer through the multiprotocol image type.