A sensor reported a fault on 2d shelf 11 (with error: 20)

VenkataSunil

Hi,

 

We are seeing an error on a SAS shelf (shelf 11 on loop 2d) for bay 20: the disk in that bay is reported as missing.

 

Shelf Status:

 

Loop 2d announced error in header
A sensor reported a fault on 2d shelf 11 (with error: 20)
Environment for channel 2d
	Number of shelves monitored: 1	enabled: yes
	Environmental failure on shelves on this channel? yes
Shelf bays with disk devices installed:
	  23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0
	  with error: 20

We see the following log events:

 

 params: {'debug_string': 'BAD/WRONG destination on OPEN (0x17) -- delaying: dev 2b.11.20, cdb 0x2f:40931580:0480 (0/0/23521291), NDU 0x0', 'adapterName': '2a'}
[?] Sat Aug 05 02:41:25 IST [napmaidp01: pmcsas_asyncd_0: sas.adapter.debug:info]: params: {'debug_string': 'Device 2d.11.20 invalidate debounce - 40', 'adapterName': '2c'}
[?] Sat Aug 05 02:41:25 IST [napmaidp01: pmcsas_asyncd_0: sas.adapter.debug:info]: params: {'debug_string': 'Asyncd device scan done -- SATA 0/0, SATA reserved 0/0.', 'adapterName': '2c'}
[?] Sat Aug 05 02:41:25 IST [napmaidp01: pmcsas_asyncd_1: sas.adapter.debug:info]: params: {'debug_string': 'Device 2b.11.20 invalidate debounce - 40', 'adapterName': '2a'}
[?] Sat Aug 05 02:41:25 IST [napmaidp01: pmcsas_asyncd_1: sas.adapter.debug:info]: params: {'debug_string': 'Asyncd device scan done -- SATA 0/0, SATA reserved 0/0.', 'adapterName': '2a'}
[?] Sat Aug 05 02:41:45 IST [napmaidp01: pmcsas_timeout_0: sas.adapter.debug:info]: params: {'debug_string': 'Device 2d.11.20 is present and powered up but is about to be invalidated (20) -- power cycling.', 'adapterName': '2c'}
[?] Sat Aug 05 02:41:45 IST [napmaidp01: pmcsas_asyncd_0: sas.adapter.debug:info]: params: {'debug_string': 'Starting powercycle on device 2d.11.20', 'adapterName': '2c'}
[?] Sat Aug 05 02:41:45 IST [napmaidp01: pmcsas_asyncd_0: sas.adapter.debug:info]: params: {'debug_string': 'PHY POWER CYCLE already in progress (WWN 5:0050cc:103378c:3f, phy 28) -- aborting', 'adapterName': '2c'}
[?] Sat Aug 05 02:41:45 IST [napmaidp01: pmcsas_asyncd_0: sas.adapter.debug:info]: params: {'debug_string': 'Powercycle on device 2d.11.20 complete: status 0', 'adapterName': '2c'}
[?] Sat Aug 05 02:41:45 IST [napmaidp01: pmcsas_timeout_1: sas.adapter.debug:info]: params: {'debug_string': 'Device 2b.11.20 is present and powered up but is about to be invalidated (20) -- power cycling.', 'adapterName': '2a'}
[?] Sat Aug 05 02:41:45 IST [napmaidp01: pmcsas_asyncd_1: sas.adapter.debug:info]: params: {'debug_string': 'Starting powercycle on device 2b.11.20', 'adapterName': '2a'}
[?] Sat Aug 05 02:41:45 IST [napmaidp01: pmcsas_asyncd_1: sas.adapter.debug:info]: params: {'debug_string': 'PHY POWER CYCLE already in progress (WWN 5:0050cc:103379e:3f, phy 28) -- aborting', 'adapterName': '2a'}
[?] Sat Aug 05 02:41:45 IST [napmaidp01: pmcsas_asyncd_1: sas.adapter.debug:info]: params: {'debug_string': 'Powercycle on device 2b.11.20 complete: status 0', 'adapterName': '2a'}
[?] Sat Aug 05 02:41:47 IST [napmaidp01: ses_admin: ses.status.driveWarning:debug]: A non-critical event has been detected on drive 20 on DS4243 shelf 2d.11; non-critical.
[?] Sat Aug 05 02:41:57 IST [napmaidp01: ses_admin: ses.status.driveOk:info]: The error on drive 20 on DS4243 shelf 2d.11 has been corrected.
[?] Sat Aug 05 02:42:05 IST [napmaidp01: pmcsas_timeout_0: sas.adapter.debug:info]: params: {'debug_string': 'Device 2d.11.20 invalidated.', 'adapterName': '2c'}
[?] Sat Aug 05 02:42:05 IST [napmaidp01: pmcsas_timeout_0: sas.adapter.debug:info]: params: {'debug_string': 'Invalidating device 2d.11.20. ', 'adapterName': '2c'}
[?] Sat Aug 05 02:42:05 IST [napmaidp01: pmcsas_timeout_0: scsi.cmd.selectionTimeout:error]: Disk device 2d.11.20: Adapter/target error: HA status 0x7: cdb 0x2f:40931580:0480. Targeted device did not respond to requested I/O. I/O will be retried.
[?] Sat Aug 05 02:42:05 IST [napmaidp01: pmcsas_timeout_1: sas.adapter.debug:info]: params: {'debug_string': 'Device 2b.11.20 invalidated.', 'adapterName': '2a'}
[?] Sat Aug 05 02:42:05 IST [napmaidp01: pmcsas_timeout_1: sas.adapter.debug:info]: params: {'debug_string': 'Invalidating device 2b.11.20. ', 'adapterName': '2a'}
[?] Sat Aug 05 02:42:05 IST [napmaidp01: pmcsas_timeout_1: scsi.cmd.selectionTimeout:error]: Disk device 2b.11.20: Adapter/target error: HA status 0x7: cdb 0x2f:40931580:0480. Targeted device did not respond to requested I/O. I/O will be retried.
[?] Sat Aug 05 02:42:05 IST [napmaidp01: pmcsas_timeout_1: disk.ioFailed:error]: params: {'deviceName': '2b.11.20', 'returnCode': '2', 'pathRetryCount': '0', 'adapterStatus': '0xd', 'cdb': '0x5e:01', 'basicTimeout': '10', 'iASCQ': '0x0', 'iSenseKey': '0x0', 'sSenseCode': '', 'ETime': '127', 'iASC': '0x0', 'victimRetryCount': '0', 'sSenseKey': 'SCSI:no sense', 'targetStatus': '0x0', 'retryCount': '0', 'pathsTried': '0', 'timeoutRetryCount': '0'}
[?] Sat Aug 05 02:42:05 IST [napmaidp01: pmcsas_timeout_1: ems.engine.event.tooBig:warning]: params: {'lastSize': '1128', 'averageSize': '1122', 'emsId': 'disk.ioFailed'}
[?] Sat Aug 05 02:42:05 IST [napmaidp01: pmcsas_timeout_1: disk.reserveFailed:error]: Disk reservation failed on 2b.11.20 CDB 0x5e:01 - SCSI:no sense (0 0 0)
[?] Sat Aug 05 02:42:05 IST [napmaidp01: pmcsas_send_admin: scsi.cmd.selectionTimeout:error]: Disk device ?:?.?: Adapter/target error: HA status 0x7: cdb 0x2f:40931580:0480. Targeted device did not respond to requested I/O. I/O will be retried.
[?] Sat Aug 05 02:42:05 IST [napmaidp01: pmcsas_send_admin: disk.ioFailed:error]: params: {'deviceName': '2b.11.20', 'returnCode': '2', 'pathRetryCount': '0', 'adapterStatus': '0xd', 'cdb': '0x2f:40931580:0480', 'basicTimeout': '10', 'iASCQ': '0x0', 'iSenseKey': '0x0', 'sSenseCode': '', 'ETime': '41123', 'iASC': '0x0', 'victimRetryCount': '79', 'sSenseKey': 'SCSI:no sense', 'targetStatus': '0x0', 'retryCount': '2', 'pathsTried': '1', 'timeoutRetryCount': '0'}
[?] Sat Aug 05 02:42:05 IST [napmaidp01: pmcsas_send_admin: ems.engine.event.tooBig:warning]: params: {'lastSize': '1144', 'averageSize': '1122', 'emsId': 'disk.ioFailed'}
[?] Sat Aug 05 02:42:05 IST [napmaidp01: sanown_io: diskown.errorDuringIO:error]: error 25 (no valid path to disk) on disk 2b.11.20 (S/N WD-WCAW31217364) while setting disk reservation
[?] Sat Aug 05 02:42:05 IST [napmaidp01: raidio_thread: raid.spares.media_scrub.suspend:notice]: params: {'disk_rpm': '7200', 'vendor': 'NETAPP  ', 'dbn': '120375552', 'firmware_revision': 'NA04', 'shelf': '11', 'disk_info': 'Disk 2b.11.20 Shelf 11 Bay 20 [NETAPP   X302_WVULC01TSSM NA04] S/N [WD-WCAW31217364]', 'bay': '20', 'serialno': 'WD-WCAW31217364', 'owner': '', 'percentage': '55', 'disk_type': '8', 'model': 'X302_WVULC01TSSM'}
[?] Sat Aug 05 02:42:05 IST [napmaidp01: config_thread: raid.config.spare.disk.missing:info]: Spare Disk 2b.11.20 Shelf 11 Bay 20 [NETAPP   X302_WVULC01TSSM NA04] S/N [WD-WCAW31217364] is missing.
[?] Sat Aug 05 02:42:12 IST [napmaidp01: pmcsas_timeout_0: sas.adapter.debug:info]: params: {'debug_string': 'phy 28 on expander 5:0050cc:103378c:3f is in state 0 but dongle is present and powered up -- initiating PHY reset.', 'adapterName': '2c'}
[?] Sat Aug 05 02:42:12 IST [napmaidp01: pmcsas_timeout_0: sas.adapter.debug:info]: params: {'debug_string': 'One or more (1) PHYs on expander 5:0050cc:103378c:3f are in a bad state.', 'adapterName': '2c'}
[?] Sat Aug 05 02:42:12 IST [napmaidp01: pmcsas_timeout_1: sas.adapter.debug:info]: params: {'debug_string': 'phy 28 on expander 5:0050cc:103379e:3f is in state 0 but dongle is present and powered up -- initiating PHY reset.', 'adapterName': '2a'}
[?] Sat Aug 05 02:42:12 IST [napmaidp01: pmcsas_timeout_1: sas.adapter.debug:info]: params: {'debug_string': 'One or more (1) PHYs on expander 5:0050cc:103379e:3f are in a bad state.', 'adapterName': '2a'}
[?] Sat Aug 05 02:42:15 IST [napmaidp01: pmcsas_asyncd_0: sas.adapter.debug:info]: params: {'debug_string': 'Asyncd device scan done -- SATA 0/0, SATA reserved 0/0.', 'adapterName': '2c'}
[?] Sat Aug 05 02:42:15 IST [napmaidp01: pmcsas_asyncd_1: sas.adapter.debug:info]: params: {'debug_string': 'Asyncd device scan done -- SATA 0/0, SATA reserved 0/0.', 'adapterName': '2a'}
[?] Sat Aug 05 02:42:20 IST [napmaidp01: pmcsas_timeout_0: sas.adapter.debug:info]: params: {'debug_string': 'phy 28 on expander 5:0050cc:103378c:3f is in state 0 but dongle is present and powered up -- initiating powercycle.', 'adapterName': '2c'}
[?] Sat Aug 05 02:42:20 IST [napmaidp01: pmcsas_timeout_0: sas.adapter.debug:info]: params: {'debug_string': 'One or more (1) PHYs on expander 5:0050cc:103378c:3f are in a bad state.', 'adapterName': '2c'}
[?] Sat Aug 05 02:42:20 IST [napmaidp01: pmcsas_asyncd_0: sas.adapter.debug:info]: params: {'debug_string': 'PHY POWER CYCLE already in progress (WWN 5:0050cc:103378c:3f, phy 28) -- aborting', 'adapterName': '2c'}
[?] Sat Aug 05 02:42:20 IST [napmaidp01: pmcsas_timeout_1: sas.adapter.debug:info]: params: {'debug_string': 'phy 28 on expander 5:0050cc:103379e:3f is in state 0 but dongle is present and powered up -- initiating powercycle.', 'adapterName': '2a'}
[?] Sat Aug 05 02:42:20 IST [napmaidp01: pmcsas_timeout_1: sas.adapter.debug:info]: params: {'debug_string': 'One or more (1) PHYs on expander 5:0050cc:103379e:3f are in a bad state.', 'adapterName': '2a'}
[?] Sat Aug 05 02:42:20 IST [napmaidp01: pmcsas_asyncd_1: sas.adapter.debug:info]: params: {'debug_string': 'PHY POWER CYCLE already in progress (WWN 5:0050cc:103379e:3f, phy 28) -- aborting', 'adapterName': '2a'}
[?] Sat Aug 05 02:42:28 IST [napmaidp01: pmcsas_timeout_1: sas.adapter.debug:info]: params: {'debug_string': 'One or more (1) PHYs on expander 5:0050cc:103379e:3f are in a bad state.', 'adapterName': '2a'}
[?] Sat Aug 05 02:42:45 IST [napmaidp01: emslog_main: ems.log.duplicate:info]: params: {'id': '1478361542/513545', 'numDups': '1'}
[?] Sat Aug 05 02:42:52 IST [napmaidp01: asup_main: cmds.sysconf.validDebug:debug]: sysconfig: Validating configuration.
[?] Sat Aug 05 02:43:40 IST [napmaidp01: ses_admin: ses.status.driveWarning:debug]: A non-critical event has been detected on drive 20 on DS4243 shelf 2d.11; non-critical.
[?] Sat Aug 05 02:45:58 IST [napmaidp01: asup_main: api.fileio:warning]: params: {'errorDescription': 'Removing iterator file that is too old', 'errorCode': '22', 'errorDetail': 'Invalid argument', 'targetDescription': '/etc/.zapi/56442973935054481.next'}
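
To make an excerpt like the one above easier to read, the events can be grouped by device and EMS event name. The following is only a minimal sketch: the regular expressions are assumptions inferred from the line layout in this excerpt (not an official EMS log grammar), and the names `parse_line` and `count_events_by_device` are hypothetical helpers. The truncated first line of the excerpt does not match the pattern and is skipped.

```python
import re
from collections import Counter

# Assumed layout, inferred from the excerpt only:
# [?] <day> <month> <dd> <hh:mm:ss> <tz> [<node>: <thread>: <event>:<severity>]: <message>
LINE_RE = re.compile(
    r"^\[\?\]\s+(?P<timestamp>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2} \w+)\s+"
    r"\[(?P<node>[^:]+): (?P<thread>[^:]+): (?P<event>[\w.]+):(?P<severity>\w+)\]:\s*"
    r"(?P<message>.*)$"
)

# Device names in the excerpt look like <adapter>.<shelf>.<bay>, e.g. 2d.11.20
DEVICE_RE = re.compile(r"\b\d+[a-z]\.\d+\.\d+\b")


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def count_events_by_device(lines):
    """Count how often each (device, event name) pair appears."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # truncated or differently shaped line
        # Use a set so a device mentioned twice in one message counts once
        for device in set(DEVICE_RE.findall(record["message"])):
            counts[(device, record["event"])] += 1
    return counts


# Three lines copied from the excerpt above
SAMPLE = [
    "[?] Sat Aug 05 02:41:45 IST [napmaidp01: pmcsas_asyncd_0: sas.adapter.debug:info]: "
    "params: {'debug_string': 'Starting powercycle on device 2d.11.20', 'adapterName': '2c'}",
    "[?] Sat Aug 05 02:42:05 IST [napmaidp01: pmcsas_timeout_0: scsi.cmd.selectionTimeout:error]: "
    "Disk device 2d.11.20: Adapter/target error: HA status 0x7: cdb 0x2f:40931580:0480. "
    "Targeted device did not respond to requested I/O. I/O will be retried.",
    "[?] Sat Aug 05 02:42:05 IST [napmaidp01: config_thread: raid.config.spare.disk.missing:info]: "
    "Spare Disk 2b.11.20 Shelf 11 Bay 20 [NETAPP   X302_WVULC01TSSM NA04] "
    "S/N [WD-WCAW31217364] is missing.",
]

if __name__ == "__main__":
    for (device, event), n in sorted(count_events_by_device(SAMPLE).items()):
        print(f"{device}  {event}  x{n}")
```

Because the same disk is seen on two paths (2b.11.20 and 2d.11.20), the counts are kept per path name; merging them by shelf and bay would be a further assumption about the cabling.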

The disk model is X302_WVULC01TSSM, and the installed Disk Qualification Package (DQP) has datecode 20160701.

 

Any input on the above would be a great help.
