SMSQL timeout mount errors on Win2003R2x64SP2

simonengelbert · ‎2010-09-08

We have a daily process that takes a snapshot of multiple databases on Server A and then mounts FlexClones of those snapshots on Server B. The snapshot process on Server A kicks off at a designated time, and once it completes it starts the process on Server B. The process on Server B detaches the current databases, disconnects their drives, deletes the old FlexClones, grabs the new snapshot, creates new FlexClones, connects them to the server as drives, and then re-attaches the databases. It is handled by a Perl script that uses the SnapDrive/NetApp API. This should all work well, but we run into problems on Server B almost weekly, to the point where we have to reboot it. The Perl script and API calls become unresponsive and can't mount the FlexClones. We end up with errors like this:

ServerB: Checking input parameters

ServerB : Checking access control

ServerB : Checking policies

ServerB : Turning on space reservation

ServerB : Connecting to the LUN

Unable to connect to the LUN

Error: A timeout of 120 secs elapsed while waiting for volume arrival notification from the operating system.

Can't spawn "sdcli disk connect -m ServerB -p 172.xx.xx.xx:/vol/nb1serverbdb/lun01 -d W:\ -IG ServerB ServerB -dtype dedicated"
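For anyone who wants to reproduce the failing step outside the Perl script, here is a minimal Python sketch that wraps the same `sdcli disk connect` call in a hard timeout and a few retries. It is only an illustration under stated assumptions: the argument order is copied from the command in the error above, while the function names, the parameter names (`ig_node`, `ig_name` for the two values after `-IG`), the retry count, and the delays are all made up for the example.

```python
import subprocess
import time


def build_connect_cmd(host, lun_path, drive, ig_node, ig_name):
    """Build the argument list in the same order as the logged command.

    ig_node / ig_name are hypothetical names for the two values that
    follow -IG in the log (both were "ServerB" there).
    """
    return ["sdcli", "disk", "connect",
            "-m", host,
            "-p", lun_path,
            "-d", drive,
            "-IG", ig_node, ig_name,
            "-dtype", "dedicated"]


def run_with_retries(cmd, attempts=3, timeout_s=180, delay_s=30,
                     runner=subprocess.run, sleep=time.sleep):
    """Run cmd up to `attempts` times; return (ok, attempts_used, last_error).

    runner and sleep are injectable so the logic can be exercised
    without sdcli being installed.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            result = runner(cmd, capture_output=True, text=True,
                            timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # The child is killed by subprocess.run when the timeout hits.
            last_error = f"attempt {attempt}: no exit after {timeout_s}s"
        except OSError as exc:
            # Roughly what Perl reports as "Can't spawn".
            last_error = f"attempt {attempt}: could not start {cmd[0]}: {exc}"
        else:
            if result.returncode == 0:
                return True, attempt, None
            last_error = f"attempt {attempt}: exit code {result.returncode}"
        if attempt < attempts:
            sleep(delay_s)  # give the OS time before trying again
    return False, attempts, last_error
```

Keeping the outer timeout longer than the 120 seconds SnapDrive reports lets SnapDrive's own error surface first, and logging `last_error` per attempt shows whether the failures are timeouts or spawn failures.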

Server B seems to become more and more unresponsive each time we retry the process/script, to the point where it hangs just trying to execute the script. At that point we reboot the server and everything runs perfectly again. I don't understand why Server B becomes unresponsive or what could be causing it.
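One way to narrow down where the slowdown begins would be to time each stage of the Server B process on every run and compare the numbers from day to day. The following is a minimal Python sketch of that idea, not the actual Perl script: `run_steps` and the step names are hypothetical, and the placeholder actions stand in for the real detach, disconnect, clone, connect, and attach calls.

```python
import time


def run_steps(steps, clock=time.monotonic):
    """Run (name, action) pairs in order and record how long each took.

    Stops at the first step that raises, so the last entry in the
    returned list identifies the stage that failed.
    """
    timings = []
    for name, action in steps:
        start = clock()
        try:
            action()
        except Exception as exc:
            timings.append((name, clock() - start, f"failed: {exc}"))
            break
        timings.append((name, clock() - start, "ok"))
    return timings


if __name__ == "__main__":
    # Placeholder actions; each would call the real operation.
    steps = [
        ("detach databases", lambda: None),
        ("disconnect drives", lambda: None),
        ("delete old FlexClones", lambda: None),
        ("create new FlexClones", lambda: None),
        ("connect LUNs", lambda: None),
        ("attach databases", lambda: None),
    ]
    for name, seconds, status in run_steps(steps):
        print(f"{name}: {seconds:.1f}s {status}")
```

If the per-step times creep up over several days before the connect step finally times out, that would point at something accumulating on the host between reboots rather than at a single bad run.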

Here are some details about the OS and software we are running. Server A and Server B both run the same software.

Windows Server 2003 R2 64-Bit SP2
SQL Server 2005 SP2
SnapDrive 6.2.0.4519
NetApp Windows Host Utilities 5.2.3297.2229
SnapManager for SQL Server 5.0
Data ONTAP DSM for Windows MPIO 3.3.3298.1305
NetApp Data ONTAP 7.3.3P2

Is anyone else experiencing the same issues, or does anyone have any ideas what could be causing our problems?

simonengelbert · ‎2010-11-18

UPDATE: We upgraded SnapDrive on the Windows 2003 server to 6.3.0.4601, and that has helped tremendously. We are still having issues, but the need to reboot the server is rare now.