ONTAP Hardware
Hi everyone,
We have a two-node MetroCluster configuration with two FAS3140s. The storage configuration is Multi-Path HA. One of the nodes crashed and a failover happened. Unfortunately, the node that crashed fails to boot and is stuck in a boot loop. The RLM on the crashed node still works, and this is the output when I log in and try to boot it manually:
login as: naroot
naroot@SV01rlm's password:
RLM SV01> system console
Type Ctrl-D to exit.

LOADER-A> version
Variable Name        Value
-------------------- --------------------------------------------------
BIOS_VERSION         4.4.0
LOADER_VERSION       1.8

LOADER-A> ifconfig
Network interface has not been configured

LOADER-A> show devices
Device Name Description
----------- ---------------------------------------------------------
uart0a      NS16550 UART at 0x3F8
vidcons0a   Video and Keyboard Console
rlm0a       Remote LAN Module (RLM): Console at 0x3F8, RLM at 0x2F8
clock0a     ISA RTC at 0x70 (index) and 0x71 (target)
ide0.0      STEC NACF1GM1U-B11, Sectors: 2001888 (977 MB) at I/O 01F0
e0M         BCM5703C Ethernet at 0xFDF00000 (00-A0-98-23-20-76)
e0a         BCM5715 Ethernet at 0xFE510000 (00-A0-98-23-20-74)
e0b         BCM5715 Ethernet at 0xFE530000 (00-A0-98-23-20-75)

LOADER-A> printenv
Variable Name        Value
-------------------- --------------------------------------------------
CPU_NUM_CORES        2
BOOT_CONSOLE         rlm0a
BIOS_VERSION         4.4.0
BIOS_DATE            07/24/2010
SYS_MODEL            FAS3140
SYS_REV              B0
SYS_SERIAL_NUM       xxx
MOBO_MODEL           1
MOBO_REV             A4
MOBO_SERIAL_NUM      741755
CPU_SPEED            2400
CPU_TYPE             Opteron
savenv               saveenv
ENV_VERSION          1
fmmbx-lkg-0          16F5383011DF970EA0008CBB7620239806A1D5A6000000000C0040209884D7CA000000000000000000000000000000000000
                     0000000000000000000000000000240000204E8139B600000000000000000000000000000000000000000000000000000000
                     00000000
fmmbx-lkg-0b         16F5383011DF970EA0008CBB7620239806A1D5A60000000052B400204EC9CA53000000000000000000000000000000000000
                     000000000000000000000000000024000020FD7F39B600000000000000000000000000000000000000000000000000000000
                     00000000
NVRAM_CLEAN          false
fc-port-0b           9
fc-port-0d           9
last-OS-booted-raid-ver 11
last-OS-booted-wafl-ver 22331
last-OS-booted-ver   8.1.4P1
REBOOT_REASON        REBOOT_GIVEBACK
partner-sysid        xxx
fmmbx-lkg-1          3854498011DF91AAA00024879A222398015319770000000024000020A8E04CB6000000000000000000000000000000000000
                     00000000000000000000000000002400002027114DB600000000000000000000000000000000000000000000000000000000
                     00000000
fmmbx-lkg-1b         3854498011DF91AAA00024879A222398015319770000000024000020D17F39B6000000000000000000000000000000000000
                     000000000000000000000000000024000020D1A049B600000000000000000000000000000000000000000000000000000000
                     00000000
fud_in_progress      false
USE_SECONDARY        true
LOADER_VERSION       1.8
ARCH                 x86_64
BOARDNAME            SB_XV
PRIMARY_KERNEL_URL   fat://ide0.0/x86_64/kernel/primary.krn
BACKUP_KERNEL_URL    fat://ide0.0/backup/x86_64/kernel/primary.krn
DIAG_URL             fat://ide0.0/x86_64/diag/diag.krn
GX_DIAG_URL          fat://ide0.0/x86_64/diag/kernel
FIRMWARE_URL         fat://ide0.0/x86_64/firmware/SB_XV/firmware.img
bootarg.mgwd.scsi_blade_uuid 81f5ff2d-e3a9-11e1-b677-fd0f5a25c34b
bootarg.from.version 8.1P2
failoverToken        SV01_16:49:17_2016:11:22
BOOT_DEVICE          ide0.0
BOOT_FILE            x86_64/freebsd/image2/kernel
BIOS_INTERFACE       9FC3
BOOT_FLASH           flash0a
GX_PRIMARY_KERNEL_URL fat://ide0.0/x86_64/freebsd/image2/kernel
GX_BACKUP_KERNEL_URL fat://ide0.0/x86_64/freebsd/image1/kernel
ntap.init.kernelname x86_64/freebsd/image2/kernel
AUTOBOOT             true
AUTOBOOT_FROM        PRIMARY
AUTO_FW_UPDATE       true
BOOTED_FROM          OTHER
boot_ontap           autoboot ide0.0
boot_primary         setenv BOOTED_FROM PRIMARY; boot -elf64 $GX_PRIMARY_KERNEL_URL $PRIMARY_KERNEL_URL
boot_backup          setenv BOOTED_FROM BACKUP; boot -elf64 $GX_BACKUP_KERNEL_URL $BACKUP_KERNEL_URL
netboot              setenv BOOTED_FROM NETWORK; boot -elf64
boot_diags           boot -elf64 $GX_DIAG_URL $DIAG_URL
ldkern               load -elf64 $GX_PRIMARY_KERNEL_URL $PRIMARY_KERNEL_URL
update_flash         flash -backup $FIRMWARE_URL flash0a
version              printenv BIOS_VERSION LOADER_VERSION
CF_BIOS_VERSION      4.4.0
CF_LOADER_VERSION    1.8

LOADER-A> boot_ontap
Loading x86_64/freebsd/image2/kernel:.....0x100000/8455008 0x910360/1278312 Entry at 0x80158990
Loading x86_64/freebsd/image2/platform.ko:.0xa49000/655152 0xb97c40/694752 0xae8f40/39656 0xc41620/43152 0xaf2a28/86316 0xb07b54/63858 0xb174e0/140640 0xc4beb0/159120 0xb39a40/2024 0xc72c40/6072 0xb3a228/304 0xc743f8/912 0xb3a358/1680 0xc74788/5040 0xb3a9e8/960 0xc75b38/2880 0xb3ada8/184 0xc76678/552 0xb3ae60/448 0xb6f000/12918 0xb97b53/237 0xb72278/84120 0xb86b10/69699
Starting program at 0x80158990
NetApp Data ONTAP 8.1.4P1 7-Mode
Copyright (C) 1992-2014 NetApp.
All rights reserved.
md1.uzip: 25536 x 16384 blocks
md2.uzip: 5760 x 16384 blocks
*******************************
*                             *
* Press Ctrl-C for Boot Menu. *
*                             *
*******************************
^C^C^C^CProcessing PCI error...
(Ctrl-C don't work)
Probing EXB(0,6,0)
Probing EXB(0,7,0)
Probing EXB(0,8,0)
Probing EXB(0,9,0)
report Dv(6,0,0) from error source register 0x600.

Phoenix TrustedCore(tm) Server
Copyright 1985-2006 Phoenix Technologies Ltd.
All Rights Reserved
BIOS version: 4.4.0
Portions Copyright (c) 2007-2009 NetApp. All Rights Reserved.
CPU= Dual-Core AMD Opteron(tm) Processor 2216 X 1
Testing RAM
512MB RAM tested
4096MB RAM installed
Fixed Disk 0: STEC
Boot Loader version 1.8
Copyright (C) 2000-2003 Broadcom Corporation.
Portions Copyright (C) 2002-2009 NetApp
CPU Type: Dual-Core AMD Opteron(tm) Processor 2216
(Now it loops!)
Starting AUTOBOOT press Ctrl-C to abort...
Loading x86_64/freebsd/image2/kernel:.....0x100000/8455008 0x910360/1278312 Entry at 0x80158990
Loading x86_64/freebsd/image2/platform.ko:.0xa49000/655152 0xb97c40/694752 0xae8f40/39656 0xc41620/43152 0xaf2a28/86316 0xb07b54/63858 0xb174e0/140640 0xc4beb0/159120 0xb39a40/2024 0xc72c40/6072 0xb3a228/304 0xc743f8/912 0xb3a358/1680 0xc74788/5040 0xb3a9e8/960 0xc75b38/2880 0xb3ada8/184 0xc76678/552 0xb3ae60/448 0xb6f000/12918 0xb97b53/237 0xb72278/84120 0xb86b10/69699
Starting program at 0x80158990
NetApp Data ONTAP 8.1.4P1 7-Mode
Copyright (C) 1992-2014 NetApp.
All rights reserved.
md1.uzip: 25536 x 16384 blocks
md2.uzip: 5760 x 16384 blocks
*******************************
*                             *
* Press Ctrl-C for Boot Menu. *
*                             *
*******************************
^CProcessing PCI error...
(Ctrl-C don't works)
Probing EXB(0,6,0)
Probing EXB(0,7,0)
Probing EXB(0,8,0)
Probing EXB(0,9,0)
report Dv(6,0,0) from error source register 0x600.

Phoenix TrustedCore(tm) Server
Copyright 1985-2006 Phoenix Technologies Ltd.
All Rights Reserved
BIOS version: 4.4.0
Portions Copyright (c) 2007-2009 NetApp. All Rights Reserved.
CPU= Dual-Core AMD Opteron(tm) Processor 2216 X 1
Testing RAM
512MB RAM tested
4096MB RAM installed
Fixed Disk 0: STEC
Boot Loader version 1.8
Copyright (C) 2000-2003 Broadcom Corporation.
Portions Copyright (C) 2002-2009 NetApp
CPU Type: Dual-Core AMD Opteron(tm) Processor 2216
Starting AUTOBOOT press Ctrl-C to abort...
(Ctrl-C works this time)
Autoboot of PRIMARY image aborted by user.
LOADER-A>
NetApp Release 8.1.4P1 7-Mode: Tue Feb 11 23:23:31 PST 2014
What I have tried so far: boot, boot_ontap, boot_backup, boot_primary, bye, power cycle.
Our support contract has expired. It's an old system, the partner node works flawlessly, and we are in the process of decommissioning both machines, but it would be better to have both machines online :).
Any idea how to reanimate SV01? Looks like a problem with the internal disk? Can I flash the fixed disk?
Regards
Uwe
Hi Uwe,
You can purchase one-time support for systems with expired entitlement. Contact NetApp Support for details.
In the meantime, it would be very helpful if you had the initial panic/crash information recorded. It might still be there if you collect an RLM diagnostic dump.
From RLM CLI: > rlm status -d
Look over the output for any interesting info about the initial cause of the crash and what the console logs and system event logs say.
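If the dump turns out to be long, a small script can help pick out the lines worth reading first. This is only a rough sketch: the keyword list is my own guess at what might be relevant, and the sample text is copied from the console output earlier in this thread, not from a real RLM dump. Paste the captured output into a string (or read it from a file) and adjust the keywords as needed.

```python
import re

# Assumed keywords, not an official list; extend as needed for your dump.
PATTERN = re.compile(r"panic|pci error|error source|machine check", re.IGNORECASE)


def interesting_lines(text):
    """Return (line_number, stripped_line) pairs for lines matching PATTERN."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if PATTERN.search(line)
    ]


# Sample lines taken from the console output posted above.
SAMPLE = """Press Ctrl-C for Boot Menu.
Processing PCI error...
Probing EXB(0,6,0)
report Dv(6,0,0) from error source register 0x600.
"""

for number, line in interesting_lines(SAMPLE):
    print(f"{number}: {line}")
```

On the sample above this flags the "Processing PCI error..." line and the "report Dv(6,0,0)" line, which are the two places to start reading.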
Since it looks like a HW issue, it would also be a good idea to pull the PCM and check/re-seat all of the cards in the enclosure.
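One hint about where to look: the console reports "Dv(6,0,0) from error source register 0x600". If that value follows the standard PCIe requester ID layout (8 bits bus, 5 bits device, 3 bits function), it decodes to bus 6, device 0, function 0, which is consistent with the Dv(6,0,0) in the same message. That layout is an assumption on my part; I can't confirm how ONTAP formats this message, and the post doesn't tell us which physical slot or onboard device sits on bus 6. The sketch below just shows the arithmetic.

```python
def decode_requester_id(value):
    """Split a 16-bit PCIe requester ID into (bus, device, function).

    Assumes the standard layout: bits 15:8 bus, bits 7:3 device, bits 2:0 function.
    """
    source_id = value & 0xFFFF        # keep only the low 16 bits
    bus = (source_id >> 8) & 0xFF
    device = (source_id >> 3) & 0x1F
    function = source_id & 0x07
    return bus, device, function


# Value taken from the console message in this thread.
print(decode_requester_id(0x600))  # (6, 0, 0)
```

If the diagnostics or the RLM logs list devices by bus number, whatever appears on bus 6 would be the first thing to re-seat or swap.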
After reseating, it wouldn't be a bad idea to run system diagnostics. Here's the link to the Diagnostics Guide applicable to the FAS3140.
https://library.netapp.com/ecmdocs/ECMP1112531/html/ch1/overview.htm