Zombie Filer

JimMc
Hi All,

I'm trying to bring a FAS3140, which is out of support, back from the dead. The system is an HA pair that is currently in failover, as controller 2 has apparently died.

I've tried reseating the downed controller. Does anyone have any bright ideas? I wondered whether one of the PCI cards could be causing it, for example.

Running "events all" reveals lots of this:

Record 203: Thu Jan  1 00:13:08 2009 [Agent Event.notice]: FIFO 0x2029 - Agent 0x51, Appliance command 0x29 (enable watchdog)
Record 204: Fri Jul 17 08:24:47 2015 [BIOS.warning]: POST error 0x00fa: ERR_OTHER_WDT_REBOOT_TP49 Additional data: 0x00000000 0x00000000
Record 205: Thu Jan  1 00:13:13 2009 [Agent Event.warning]: FIFO 0x8FFF - Agent 0x51, L1_WD_TIMEOUT asserted.
Record 206: Thu Jan  1 00:13:18 2009 [Agent Event.critical]: FIFO 0x8FFE - Agent 0x51, L2_WD_TIMEOUT asserted.
Record 207: Thu Jan  1 00:13:18 2009 [Agent Event.warning]: FIFO 0x8003 - Agent 0x51, P0_PCIRST asserted.
Record 208: Thu Jan  1 00:13:19 2009 [Agent Event.normal]: FIFO 0x0003 - Agent 0x51, P0_PCIRST deasserted.
Record 209: Thu Jan  1 00:13:19 2009 [Agent Event.warning]: FIFO 0x8003 - Agent 0x51, P0_PCIRST asserted.
Record 210: Thu Jan  1 00:13:19 2009 [Agent Event.normal]: FIFO 0x0003 - Agent 0x51, P0_PCIRST deasserted.
Record 211: Thu Jan  1 00:13:20 2009 [Agent Event.warning]: FIFO 0x8003 - Agent 0x51, P0_PCIRST asserted.
Record 212: Thu Jan  1 00:13:20 2009 [Agent Event.normal]: FIFO 0x0003 - Agent 0x51, P0_PCIRST deasserted.
Record 213: Thu Jan  1 00:13:21 2009 [Agent Event.notice]: FIFO 0x202A - Agent 0x51, Appliance command 0x2a (disable watchdog)
Record 214: Thu Jan  1 00:13:30 2009 [Agent Event.notice]: FIFO 0x202A - Agent 0x51, Appliance command 0x2a (disable watchdog)
Record 215: Thu Jan  1 00:13:30 2009 [Agent Event.notice]: FIFO 0x2029 - Agent 0x51, Appliance command 0x29 (enable watchdog)
Record 216: Thu Jan  1 00:13:30 2009 [Agent Event.notice]: FIFO 0x202A - Agent 0x51, Appliance command 0x2a (disable watchdog)
Record 217: Thu Jan  1 00:13:30 2009 [Agent Event.notice]: FIFO 0x2029 - Agent 0x51, Appliance command 0x29 (enable watchdog)
Record 218: Fri Jul 17 08:25:09 2015 [BIOS.warning]: POST error 0x00fa: ERR_OTHER_WDT_REBOOT_TP49 Additional data: 0x00000000 0x00000000

 

...

Record 1171: Fri Jul 17 08:50:05 2015 [BIOS.warning]: POST error 0x00fa: ERR_OTHER_WDT_REBOOT_TP49 Additional data: 0x00000000 0x00000000
Record 1172: Thu Jan  1 00:38:31 2009 [Agent Event.warning]: FIFO 0x8FFF - Agent 0x51, L1_WD_TIMEOUT asserted.
Record 1173: Thu Jan  1 00:38:36 2009 [Agent Event.critical]: FIFO 0x8FFE - Agent 0x51, L2_WD_TIMEOUT asserted.
Record 1174: Thu Jan  1 00:38:36 2009 [Agent Event.warning]: FIFO 0x8003 - Agent 0x51, P0_PCIRST asserted.
Record 1175: Thu Jan  1 00:38:36 2009 [Agent Event.normal]: FIFO 0x0003 - Agent 0x51, P0_PCIRST deasserted.
Record 1176: Thu Jan  1 00:38:37 2009 [Agent Event.warning]: FIFO 0x8003 - Agent 0x51, P0_PCIRST asserted.
Record 1177: Thu Jan  1 00:38:37 2009 [Agent Event.normal]: FIFO 0x0003 - Agent 0x51, P0_PCIRST deasserted.
Record 1178: Thu Jan  1 00:38:38 2009 [Agent Event.warning]: FIFO 0x8003 - Agent 0x51, P0_PCIRST asserted.
Record 1179: Thu Jan  1 00:38:38 2009 [Agent Event.normal]: FIFO 0x0003 - Agent 0x51, P0_PCIRST deasserted.
Record 1180: Thu Jan  1 00:38:39 2009 [Agent Event.notice]: FIFO 0x202A - Agent 0x51, Appliance command 0x2a (disable watchdog)
Record 1181: Thu Jan  1 00:38:48 2009 [Agent Event.notice]: FIFO 0x202A - Agent 0x51, Appliance command 0x2a (disable watchdog)
Record 1182: Fri Jul 17 08:50:27 2015 [BIOS.warning]: POST error 0x00fa: ERR_OTHER_WDT_REBOOT_TP49 Additional data: 0x00000000 0x00000000
Record 1183: Thu Jan  1 00:38:48 2009 [Agent Event.notice]: FIFO 0x2029 - Agent 0x51, Appliance command 0x29 (enable watchdog)
Record 1184: Thu Jan  1 00:38:48 2009 [Agent Event.notice]: FIFO 0x202A - Agent 0x51, Appliance command 0x2a (disable watchdog)
Record 1185: Thu Jan  1 00:38:48 2009 [Agent Event.notice]: FIFO 0x2029 - Agent 0x51, Appliance command 0x29 (enable watchdog)
Record 1186: Thu Jan  1 00:38:53 2009 [Agent Event.warning]: FIFO 0x8FFF - Agent 0x51, L1_WD_TIMEOUT asserted.
Record 1187: Thu Jan  1 00:38:58 2009 [Agent Event.critical]: FIFO 0x8FFE - Agent 0x51, L2_WD_TIMEOUT asserted.
Record 1188: Thu Jan  1 00:38:58 2009 [Agent Event.warning]: FIFO 0x8003 - Agent 0x51, P0_PCIRST asserted.
Record 1189: Thu Jan  1 00:38:58 2009 [Agent Event.normal]: FIFO 0x0003 - Agent 0x51, P0_PCIRST deasserted.
Record 1190: Thu Jan  1 00:38:59 2009 [Agent Event.warning]: FIFO 0x8003 - Agent 0x51, P0_PCIRST asserted.
Record 1191: Thu Jan  1 00:38:59 2009 [Agent Event.normal]: FIFO 0x0003 - Agent 0x51, P0_PCIRST deasserted.
Record 1192: Thu Jan  1 00:39:00 2009 [Agent Event.warning]: FIFO 0x8003 - Agent 0x51, P0_PCIRST asserted.
Record 1193: Thu Jan  1 00:39:00 2009 [Agent Event.normal]: FIFO 0x0003 - Agent 0x51, P0_PCIRST deasserted.
Record 1194: Thu Jan  1 00:39:01 2009 [Agent Event.notice]: FIFO 0x202A - Agent 0x51, Appliance command 0x2a (disable watchdog)
Record 1195: Thu Jan  1 00:39:10 2009 [Agent Event.notice]: FIFO 0x202A - Agent 0x51, Appliance command 0x2a (disable watchdog)
Record 1196: Thu Jan  1 00:39:10 2009 [Agent Event.notice]: FIFO 0x2029 - Agent 0x51, Appliance command 0x29 (enable watchdog)
Record 1197: Thu Jan  1 00:39:10 2009 [Agent Event.notice]: FIFO 0x202A - Agent 0x51, Appliance command 0x2a (disable watchdog)
Record 1198: Thu Jan  1 00:39:10 2009 [Agent Event.notice]: FIFO 0x2029 - Agent 0x51, Appliance command 0x29 (enable watchdog)
Record 1199: Fri Jul 17 08:50:49 2015 [BIOS.warning]: POST error 0x00fa: ERR_OTHER_WDT_REBOOT_TP49 Additional data: 0x00000000 0x00000000
Record 1200: Thu Jan  1 00:39:15 2009 [Agent Event.warning]: FIFO 0x8FFF - Agent 0x51, L1_WD_TIMEOUT asserted.
Record 1201: Thu Jan  1 00:39:20 2009 [Agent Event.critical]: FIFO 0x8FFE - Agent 0x51, L2_WD_TIMEOUT asserted.
Record 1202: Thu Jan  1 00:39:20 2009 [Agent Event.warning]: FIFO 0x8003 - Agent 0x51, P0_PCIRST asserted.
Record 1203: Thu Jan  1 00:39:20 2009 [Agent Event.normal]: FIFO 0x0003 - Agent 0x51, P0_PCIRST deasserted.
Record 1204: Thu Jan  1 00:39:21 2009 [Agent Event.warning]: FIFO 0x8003 - Agent 0x51, P0_PCIRST asserted.
Record 1205: Thu Jan  1 00:39:21 2009 [Agent Event.normal]: FIFO 0x0003 - Agent 0x51, P0_PCIRST deasserted.
Record 1206: Thu Jan  1 00:39:22 2009 [Agent Event.warning]: FIFO 0x8003 - Agent 0x51, P0_PCIRST asserted.
Record 1207: Thu Jan  1 00:39:22 2009 [Agent Event.normal]: FIFO 0x0003 - Agent 0x51, P0_PCIRST deasserted.
Record 1208: Thu Jan  1 00:39:23 2009 [Agent Event.notice]: FIFO 0x202A - Agent 0x51, Appliance command 0x2a (disable watchdog)
Record 1209: Thu Jan  1 00:39:32 2009 [Agent Event.notice]: FIFO 0x202A - Agent 0x51, Appliance command 0x2a (disable watchdog)
Record 1210: Fri Jul 17 08:51:11 2015 [BIOS.warning]: POST error 0x00fa: ERR_OTHER_WDT_REBOOT_TP49 Additional data: 0x00000000 0x00000000
Record 1211: Thu Jan  1 00:39:32 2009 [Agent Event.notice]: FIFO 0x2029 - Agent 0x51, Appliance command 0x29 (enable watchdog)
Record 1212: Thu Jan  1 00:39:32 2009 [Agent Event.notice]: FIFO 0x202A - Agent 0x51, Appliance command 0x2a (disable watchdog)
Record 1213: Thu Jan  1 00:39:32 2009 [Agent Event.notice]: FIFO 0x2029 - Agent 0x51, Appliance command 0x29 (enable watchdog)
Record 1214: Thu Jan  1 00:39:37 2009 [Agent Event.warning]: FIFO 0x8FFF - Agent 0x51, L1_WD_TIMEOUT asserted.
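
For anyone wanting to put a number on the loop, here is a minimal Python sketch that parses records in the format above and measures the gap between successive BIOS watchdog-reboot POST errors. The function names and the regular expression are my own assumptions, not part of any NetApp tool, and the sample is just four lines copied from the output above. Note that the BIOS records and the Agent records carry different clocks (2015 vs 2009), so the sketch only compares BIOS timestamps with each other.

```python
import re
from datetime import datetime

# Four lines copied from the "events all" output above.
SAMPLE = """\
Record 204: Fri Jul 17 08:24:47 2015 [BIOS.warning]: POST error 0x00fa: ERR_OTHER_WDT_REBOOT_TP49 Additional data: 0x00000000 0x00000000
Record 205: Thu Jan  1 00:13:13 2009 [Agent Event.warning]: FIFO 0x8FFF - Agent 0x51, L1_WD_TIMEOUT asserted.
Record 206: Thu Jan  1 00:13:18 2009 [Agent Event.critical]: FIFO 0x8FFE - Agent 0x51, L2_WD_TIMEOUT asserted.
Record 218: Fri Jul 17 08:25:09 2015 [BIOS.warning]: POST error 0x00fa: ERR_OTHER_WDT_REBOOT_TP49 Additional data: 0x00000000 0x00000000
"""

# Assumed layout: "Record N: <ctime-style timestamp> [source.severity]: message"
RECORD_RE = re.compile(
    r"^Record (\d+): (\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4}) \[([^\]]+)\]: (.*)$"
)


def parse_records(text):
    """Return (record number, timestamp, source, message) for each matching line."""
    records = []
    for line in text.splitlines():
        m = RECORD_RE.match(line.strip())
        if not m:
            continue  # skip blank lines, "..." elisions, etc.
        num, stamp, source, message = m.groups()
        # Collapse the padded day ("Jan  1") to a single space before parsing.
        when = datetime.strptime(" ".join(stamp.split()), "%a %b %d %H:%M:%S %Y")
        records.append((int(num), when, source, message))
    return records


def wdt_reboot_intervals(records):
    """Seconds between consecutive BIOS watchdog-reboot POST errors."""
    times = [when for _, when, source, msg in records
             if source.startswith("BIOS") and "WDT_REBOOT" in msg]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    print(wdt_reboot_intervals(parse_records(SAMPLE)))
```

On the sample this gives a single 22-second interval, which matches the spacing of the later BIOS records (08:50:05, 08:50:27, 08:50:49, 08:51:11): the controller appears to be resetting on the watchdog roughly every 22 seconds.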

 

The output of "system log" shows lots of this:

================ Log #1 start time Thu Jan  1 00:00:41 1970
▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒
================ Log #1 end time Thu Jan  1 00:37:42 2009

 

Here is what "system sensors" shows:

system sensors
Sensor  Sensor          Sensor  Current
ID      Name            State   Value
======  ========        ======  =======
0x001   PS1_G_IN_OK     good            A
0x002   PS1_12V_OK      good            A
0x003   P0_PCIRST       inactive        D
0x004   PS2_G_IN_OK     good            A
0x005   PS2_12V_OK      good            A
0x006   HT1000_PWOK     good            A
0x007   ADM1026_THERM   inactive        D
0x008   ADM1026_INT     inactive        D
0x009   PWRSEQ_3_GD_A   good            A
0x00a   PWRSEQ_3_GD_B   good            A
0x00b   PWRSEQ_3_GD_C   good            A
0x00c   PWRSEQ_5_GD_A   good            A
0x00d   PWRSEQ_5_GD_B   good            A
0x00e   PWRSEQ_6_GD     good            A
0x00f   PWRSEQ_7_GD_A   good            A
0x010   PWRSEQ_7_GD_B   inactive        D
0x011   PWRSEQ_8_GD     good            A
0x012   LM77_CENTRAL    inactive        D
0x013   LM77_FRONT_CPU  inactive        D
0x014   LM77_FRONT_PCI  inactive        D
0x015   LM77_REAR_PCI   inactive        D
0x016   PWRSEQ_2_GD     good            A
0x017   LM77_REAR_CPU   inactive        D
0x018   BTN_IN          inactive        D
0x019   DEB_PS2_PRSNT   good            A
0x01a   DEB_PS1_PRSNT   good            A
0x01b   FAN_ALERT2      inactive        D
0x01c   FAN_ALERT1      inactive        D
0x01d   12V_PG          good            A
0x01e   PARTNER_PRESENT present         A
0x01f   PS2_INT         good            D
0x020   PS1_INT         good            D

 

Darkstar
Does the filer even boot? When you connect to the SP and type "system console", do you get into the bootloader? If so, try "autoboot" and post the results here. If you get no response at all (i.e. only the "press ctrl-d to return to SP" message, or whatever it says) and it doesn't react to Ctrl-C, then it's probably busted. But if you get a bootloader prompt, or even some boot messages, that could help a lot in diagnosing the problem.

You could also try the serial console instead of the SP, in case it's just the SP that is busted. Try connecting via serial and powering up the system. You should see boot messages from the BIOS and then get a bootloader prompt.
