DSM error prior to SQL cluster reboot

waynehapu

Hi All,

Two nights ago one of our production SQL servers crashed and was rebooted. The only errors in the Windows logs around the time of the crash are from OntapDSM and MPIO. They relate to multipathing and a missing disk/LUN (DSM ID: 0300041e), but I am having trouble identifying which LUN this is.

I cannot resolve the DSM ID in the error logs to any drive on the server or LUN on our filers.

DSM ID 0300041e seems an odd ID, as the other DSM IDs in the output of the "dsmcli path list" command end in a number and not a letter.

Example:
C:\Users\srvspeback>dsmcli path list

Path Info for W-XMJJ/ZolmA:
Number of Paths: 4
DSM ID    NexusID   Initiator Address        Target Portal
========  ========  =======================  =======================
03000500  03000502  21:00:00:24:ff:03:b8:39  50:0a:09:84:99:cb:9e:85  Slot:v.0a
03000400  03000401  21:00:00:24:ff:03:b8:39  50:0a:09:84:89:cb:9e:85  Slot:0a
02000500  02000502  21:00:00:24:ff:03:b7:ab  50:0a:09:83:99:cb:9e:85  Slot:v.0c
02000400  02000401  21:00:00:24:ff:03:b7:ab  50:0a:09:83:89:cb:9e:85  Slot:0c
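
A note on reading these IDs: they look like 8-digit hexadecimal values, so a trailing "1e" would simply be the byte 0x1e (decimal 30) and not a letter suffix. The Python sketch below splits an ID into its four bytes. The field names (port, bus, target, LUN) are an assumption about how the DSM packs a SCSI address into the ID; nothing in this post confirms that layout for DSM 3.3, and `decode_dsm_id` is a hypothetical helper name.

```python
def decode_dsm_id(dsm_id):
    """Split an 8-digit hex DSM ID into four one-byte fields.

    ASSUMPTION: the byte order is port, bus, target, LUN. This layout is a
    guess for illustration and should be checked against the DSM docs.
    """
    if len(dsm_id) != 8:
        raise ValueError("expected 8 hex digits, got %r" % dsm_id)
    # Iterating over a bytes object yields one integer per byte.
    port, bus, target, lun = bytes.fromhex(dsm_id)
    return {"port": port, "bus": bus, "target": target, "lun": lun}


# Two IDs from the path list plus the ID from the error log.
for dsm_id in ("03000400", "03000500", "0300041e"):
    print(dsm_id, decode_dsm_id(dsm_id))
```

If the assumed layout holds, 0300041e would share port, bus and target with the listed path 03000400 and differ only in the last byte, which could narrow the search to LUN ID 30 on that path.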

=====================================================================================================

Platforms:

SQL server Win2K8 R2

DSM version is 3.3.25186

LUNs via FC (SnapDrive 6.3)

Data ONTAP 7.3.4 (FAS3140)

=====================================================================================================

Any help would be much appreciated - Thanks in advance.

Kind Regards

Wayne H

     ==========================================================

The errors in the Windows Event logs are as follows:

The computer has rebooted from a bugcheck. The bugcheck was: 0x000000d1 (0x0000000000000000, 0x0000000000000002, 0x0000000000000008, 0x0000000000000000). A dump was saved in: C:\Windows\MEMORY.DMP. Report Id: 020111-28922-01.
All paths have failed. \Device\MPIODisk118 will be removed.
ONTAP reported that the LUN on DSM ID 0300041e is not supported. The data section of this log entry contains additional information.
DSM ID 0300041e has initiated a fail-over.
All paths have failed. \Device\MPIODisk118 will be removed.


Log Name: System
Source: mpio
Date: 1/02/2011 7:11:50 PM
Event ID: 16
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: hppewscl2xx.xx

Description:
A fail-over on \Device\MPIODisk118 occurred.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="mpio" />
<EventID Qualifiers="49160">16</EventID>
<Level>2</Level>
<Task>0</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2011-02-01T08:11:50.443267400Z" />
<EventRecordID>47381</EventRecordID>
<Channel>System</Channel>
<Computer>hppewscl2xx.xx</Computer>
<Security />
</System>
<EventData>
<Data>\Device\MPIODisk118</Data>
<Binary>000008000100000000000000100008C00200000000000000000000000000000000000000000000000104000300000000</Binary>
</EventData>
</Event>
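
To compare events like this one side by side, the Event Xml blocks can be parsed instead of read by eye. The following is a minimal Python sketch using the standard library and a trimmed copy of the event above (the Binary element and a few System fields are left out for brevity); `summarize_event` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace used by Windows Event Log XML, as seen in the events in this post.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Trimmed copy of the mpio event 16 above (raw string keeps the backslashes).
EVENT_XML = r"""<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="mpio" />
<EventID Qualifiers="49160">16</EventID>
<Level>2</Level>
<TimeCreated SystemTime="2011-02-01T08:11:50.443267400Z" />
<EventRecordID>47381</EventRecordID>
</System>
<EventData>
<Data>\Device\MPIODisk118</Data>
</EventData>
</Event>"""


def summarize_event(xml_text):
    """Pull the fields most useful for ordering and correlating events."""
    root = ET.fromstring(xml_text)
    system = root.find("e:System", NS)
    return {
        "provider": system.find("e:Provider", NS).get("Name"),
        "event_id": int(system.find("e:EventID", NS).text),
        "record_id": int(system.find("e:EventRecordID", NS).text),
        "time_utc": system.find("e:TimeCreated", NS).get("SystemTime"),
        # Empty <Data> elements become empty strings.
        "data": [(d.text or "").strip()
                 for d in root.findall("e:EventData/e:Data", NS)],
    }


print(summarize_event(EVENT_XML))
```

Sorting the summaries by `record_id` gives the order in which the events were written to the log, which helps when several events share the same timestamp.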

Log Name: System
Source: mpio
Date: 1/02/2011 7:11:50 PM
Event ID: 23
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: hppewscl2xx.xx

Description:
All paths have failed. \Device\MPIODisk118 will be removed.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="mpio" />
<EventID Qualifiers="49160">23</EventID>
<Level>2</Level>
<Task>0</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2011-02-01T08:11:50.443267400Z" />
<EventRecordID>47380</EventRecordID>
<Channel>System</Channel>
<Computer>hppewscl2xx.xx</Computer>
<Security />
</System>
<EventData>
<Data>\Device\MPIODisk118</Data>
<Binary>000000000100000000000000170008C0170000000E0000C000000000000000000000000000000000</Binary>
</EventData>
</Event>

Log Name: System
Source: ontapdsm
Date: 1/02/2011 7:11:50 PM
Event ID: 61077
Task Category: None
Level: Warning
Keywords: Classic
User: N/A
Computer: hppewscl2xx.xx

Description:
DSM ID 0300041e has initiated a fail-over.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="ontapdsm" />
<EventID Qualifiers="33024">61077</EventID>
<Level>3</Level>
<Task>0</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2011-02-01T08:11:50.256067400Z" />
<EventRecordID>47379</EventRecordID>
<Channel>System</Channel>
<Computer>hppewscl2xx.xx</Computer>
<Security />
</System>
<EventData>
<Data>
</Data>
<Data>0300041e</Data>
<Binary>0F002C00020054000000000095EE00810400000000000000000000000000000000000000000000001E04000301040003850100C00A0520000000840205250000000000000000000000000000000000007200FFFF</Binary>
</EventData>
</Event>

Log Name: System
Source: ontapdsm
Date: 1/02/2011 7:11:50 PM
Event ID: 61085
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: hppewscl2xx.xx

Description:
ONTAP reported that the LUN on DSM ID 0300041e is not supported. The data section of this log entry contains additional information.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="ontapdsm" />
<EventID Qualifiers="49408">61085</EventID>
<Level>2</Level>
<Task>0</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2011-02-01T08:11:50.256067400Z" />
<EventRecordID>47378</EventRecordID>
<Channel>System</Channel>
<Computer>hppewscl2xx.xx</Computer>
<Security />
</System>
<EventData>
<Data>
</Data>
<Data>0300041e</Data>
<Binary>0F002C0002005400000000009DEE00C10400000000000000000000000000000000000000000000001E04000301040003850100C00A0520000000840205250000000000000000000000000000000000007200FFFF</Binary>
</EventData>
</Event>
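
The DSM ID also shows up inside the Binary field of the two ontapdsm events with its bytes reversed (1E040003 for 0300041e), which suggests the field stores 32-bit values in little-endian order. The Python sketch below searches a Binary blob for an ID on that basis. The little-endian reading is inferred from the data in this post and not from NetApp documentation, and `little_endian_hex` and `find_id` are hypothetical helper names.

```python
def little_endian_hex(id_hex):
    """Reverse the byte order of a hex string, e.g. '0300041e' -> '1E040003'."""
    return bytes.fromhex(id_hex)[::-1].hex().upper()


def find_id(binary_hex, id_hex):
    """Return the byte offset of id_hex (stored little-endian), or -1."""
    needle = little_endian_hex(id_hex)
    blob = binary_hex.upper()
    # Step two hex digits at a time so a match always starts on a byte boundary.
    for i in range(0, len(blob) - len(needle) + 1, 2):
        if blob[i:i + len(needle)] == needle:
            return i // 2
    return -1


# Binary field of the ontapdsm event 61077 above, copied as-is.
BINARY_61077 = "0F002C00020054000000000095EE00810400000000000000000000000000000000000000000000001E04000301040003850100C00A0520000000840205250000000000000000000000000000000000007200FFFF"

print("DSM ID 0300041e at byte offset:", find_id(BINARY_61077, "0300041e"))
print("NexusID 03000401 at byte offset:", find_id(BINARY_61077, "03000401"))
```

In the blob, the bytes directly after the DSM ID read 01040003, which reversed is 03000401, the NexusID listed for DSM ID 03000400 in the path list. That may mean the device that failed was reached over that same path, though this is a reading of the data and not a confirmed interpretation.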

