SMB 3.0, MS SQL 2014, PREEMPTIVE_OS_GETFILEATTRIBUTES

borismekler · ‎2014-12-22

Has anyone encountered this weird issue? I have SQL 2014 running on Windows Server 2012 R2, virtualized under ESXi 5.5, with storage on a FAS2552 running cDOT 8.2.2, storing databases on an SMB share. Network is 10GbE, Juniper EX4550F switches, Intel 82599EB NICs, vmxnet3 virtual NICs. Every time SQL needs to perform a file operation such as a database attach or a log backup, it gets stuck for about 65 seconds, with Activity Monitor showing wait type PREEMPTIVE_OS_GETFILEATTRIBUTES. The action never fails, and always takes around 65 seconds to complete. It only happens with shares backed by the NetApp filer, not with Windows Server shares on the same network. Aside from this, performance is great: it's easily doing 100-150MB/s of database IO in regular operation (OLTP). However, this issue makes SnapManager operations take a LOT longer than they should.

StewartMilne · ‎2015-02-16

Hi Boris,

We are also experiencing this issue. Backups are taking 10x longer when using an SMB share. We also have Windows Server 2012 R2 and SQL Server 2012 with ESX 5.5 and CDOT 8.2.3.

borismekler · ‎2015-06-10

I finally got a response from support today: it's an ONTAP bug (bug number 833013), scheduled to be fixed in 8.3.1 GA. This is the information I have:

The Ioctl command FSCTL_VALIDATE_NEGOTIATE_INFO is used by some CIFS clients
when communicating with CIFS SMB3 protocol. Clustered DATA ONTAP does not
support the Ioctl at this time. Additionally, the error code returned is not
correct. If the CIFS share is a normal data share, the error code returned is
STATUS_INVALID_DEVICE_REQUEST. If the CIFS share is the IPC$ named pipe share,
the error code returned is STATUS_FILE_CLOSED. The correct error code is
STATUS_NOT_SUPPORTED.
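
For anyone matching this against a packet capture, here is a minimal sketch of the behaviour described in the support note. The numeric values are the ones I believe Microsoft publishes for these names in its protocol documentation ([MS-ERREF] for the status codes, [MS-SMB2] for the control code); the function names are hypothetical and only encode what the note says, not anything about ONTAP internals.

```python
# NTSTATUS values for the names used in the support note
# (values taken from Microsoft's published status code list).
NTSTATUS = {
    "STATUS_INVALID_DEVICE_REQUEST": 0xC0000010,
    "STATUS_FILE_CLOSED": 0xC0000128,
    "STATUS_NOT_SUPPORTED": 0xC00000BB,
}

# SMB2 IOCTL control code for FSCTL_VALIDATE_NEGOTIATE_INFO.
FSCTL_VALIDATE_NEGOTIATE_INFO = 0x00140204

# What the support note says the correct reply should be.
EXPECTED_STATUS = "STATUS_NOT_SUPPORTED"


def returned_status(share_name):
    """Status the note says an affected filer returns for the ioctl.

    Hypothetical helper: IPC$ gets STATUS_FILE_CLOSED, any normal
    data share gets STATUS_INVALID_DEVICE_REQUEST.
    """
    if share_name.upper() == "IPC$":
        return "STATUS_FILE_CLOSED"
    return "STATUS_INVALID_DEVICE_REQUEST"


def describe(share_name):
    """One-line summary of returned vs. expected status for a share."""
    got = returned_status(share_name)
    return "{}: got {} (0x{:08X}), expected {} (0x{:08X})".format(
        share_name, got, NTSTATUS[got],
        EXPECTED_STATUS, NTSTATUS[EXPECTED_STATUS],
    )


if __name__ == "__main__":
    # "sqldata" is a made-up share name for illustration.
    for share in ("sqldata", "IPC$"):
        print(describe(share))
```

In a capture, the thing to look for would be an SMB2 IOCTL request with that control code and a reply carrying one of the two wrong status values instead of STATUS_NOT_SUPPORTED.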

StewartMilne · ‎2015-06-18

Thanks for the update Boris.

We still have a support ticket open with NetApp and have been investigating this issue for several months now. Originally it was believed that SMB signing was causing the problem (bug 826317), but disabling it on the Windows 2012 R2 server had no effect. We also investigated the issue with Microsoft, and disabling SMB 2/3 on the server resolved it. However, that is not a long-term solution for us, as keeping SMB 2/3 disabled is not advised.

I've forwarded this bug number onto our support contact.

Regards

Stewart

borismekler · ‎2016-06-05

For reference, 8.3.1 GA did not fix this. I opened another case, and eventually (several months ago) was told that a fix was targeted for 8.3.2P2, which was finally released last week. This weekend I eagerly deployed 8.3.2P2, but to no avail - the problem is still there.

In frustration, I took a Windows Server 2016 Technical Preview 5 VM I had installed some time ago for another purpose and deployed an SQL Server 2016 RTM instance on it, with the storage on the same FAS2552 accessed via SMB 3.0. Lo and behold, when I accessed a backup file on the filer through that instance, it did not get stuck. I dropped an SQL 2014 instance on the same WS2016 VM and instantly reproduced the issue. Seeing progress, I deployed another test VM running Windows Server 2012 R2 and put SQL 2016 on it, restored a database, and it behaved the same as SQL 2016 on WS2016, i.e. without issues. That indicates the fix is in the SQL Server version rather than the OS.

SQL 2016 is not yet listed as supported in the latest version of SnapManager (7.2.1), but I tried anyway: I installed SnapDrive 7.1.3P1 and SMSQL 7.2.1, set up SVM credentials in SnapDrive, verified that it can see the SMB shares, and configured SMSQL on the new instance. I was quite surprised to see that it is able to back up, restore, clone, detach and migrate databases without issues.

TL;DR version - after two years, the problem is finally resolved in SQL 2016.