FAS2020 - failing controller?

We have a FAS2020 with a failing controller.

After replacing the faulty controller with a new one (following the manual throughout), we got the following error:

Waiting for nvram battery chargeup

Press Ctrl-C for Maintenance menu to release disks.

Waiting for nvram battery chargeup

Due to the above error the controller wouldn’t boot.

We replaced the battery with a new one and got the following error:

FAS2020 Motherboard Diagnostic

------------------------------

Performing comprehensive motherboard diagnostic

--CPU0 (2199MHz) ID = 0xf29 Microcode Rev = 0x2f

--Northbridge Rev ID = 12

CPU/NorthBridge check ....................... PASSED

Southbridge check ........................... PASSED

Memory interface test ....................... PASSED

Performing comprehensive GBE test on e0a

Software reset test ......................... PASSED

EEPROM test ................................. PASSED

Interrupt test .............................. PASSED

Interrupt quick test ........................ PASSED

Internal 10B loopback test .................. PASSED

Internal 100B loopback test ................. PASSED

Internal 1000B loopback test ................ PASSED

External loopback test(xtnd only) ........... SKIPPED

****** Comprehensive GBE test ................... PASSED

Performing comprehensive GBE test on e0b

Software reset test ......................... PASSED

EEPROM test ................................. PASSED

Interrupt test .............................. PASSED

Interrupt quick test ........................ PASSED

Internal 10B loopback test .................. PASSED

Internal 100B loopback test ................. PASSED

Internal 1000B loopback test ................ PASSED

External loopback test(xtnd only) ........... SKIPPED

****** Comprehensive GBE test ................... PASSED

 

Testing FCAL card on channel 0a

 

Performing comprehensive FCAL test on channel 0a

FCAL 0a Self test ........................... PASSED

FCAL 0a Interrupt test ...................... PASSED

FCAL 0a Internal loopback test .............. PASSED

FCAL 0a bus reset test (xtnd only) .......... SKIPPED

FCAL 0a Ext loop test (xtnd only) ........... SKIPPED

FCAL 0a Read only bus test (xtnd only) ...... SKIPPED

FCAL 0a R/W bus test (mfg only) ............. SKIPPED

FCAL 0a Disktest read (xtnd only) ........... SKIPPED

FCAL 0a Disktest R/W (mfg only) ............. SKIPPED

****** Comprehensive FCAL test .................. PASSED

Testing FCAL card on channel 0b

Performing comprehensive FCAL test on channel 0b

FCAL 0b Self test ........................... PASSED

FCAL 0b Interrupt test ...................... PASSED

FCAL 0b Internal loopback test .............. PASSED

FCAL 0b bus reset test (xtnd only) .......... SKIPPED

FCAL 0b Ext loop test (xtnd only) ........... SKIPPED

FCAL 0b Read only bus test (xtnd only) ...... SKIPPED

FCAL 0b R/W bus test (mfg only) ............. SKIPPED

FCAL 0b Disktest read (xtnd only) ........... SKIPPED

FCAL 0b Disktest R/W (mfg only) ............. SKIPPED

****** Comprehensive FCAL test .................. PASSED

ONBOARD SAS present:

    Slot 0 58 Single Channel [Lsi Rev 0x4]

Testing SAS card on channel 0c

Performing comprehensive SAS test on channel 0c

SAS 0c Self  test ........................... PASSED

SAS 0c Interrupt test ....................... PASSED

SAS 0c Internal loopback test. .............. SKIPPED

SAS 0c External loopback test. .............. SKIPPED

Please wait while updating SES structure after reset, channel 0c

Read only bus test 0c: 0 disks .............. SKIPPED

SAS 0c R/W bus test (mfg only) .............. SKIPPED

SAS 0c Disktest read (xtnd only) ............ SKIPPED

SAS 0c Disktest R/W (mfg only) .............. SKIPPED

****** Comprehensive SAS test ................... PASSED

Internal loopback test ...................... PASSED

Link test(xtnd only) ........................ SKIPPED

****** Comprehensive IB test .................... PASSED

Performing comprehensive BMC test

ERROR DTH0033: Failed to Retrieve the BMC's Self Test Information.

BMC Self results shows some errors!

Now Initializing a new BMC Self Test. Please wait.

*** Error: Can't determine if BMC is in update mode. Error = ff

ERROR DTH0033: Failed to Retrieve the BMC's Self Test Information.

BMC Self Test ............................... FAILED

ERROR DTH0016: Failed to Reserve the Sensor Repository.

BMC SDR Read Test ........................... FAILED

ERROR DTH0017: Failed to Read the System Event Log Information.

BMC SEL Read Test ........................... FAILED

The BMC does not control the LCD on this platform.

ERROR DTH0022: Failed to Get the System Event Log timer.

-ERROR DTH0022: Failed to Get the System Event Log timer.

BMC Timer Test .............................. FAILED

DIAG: env_shutdown called:

BMC failed reading SDR : Write error (ff)

   

PANIC: assertion failed: file "../driver/environ/phys_drvs/bmc.c", line 882

 

PANIC: assertion failed: file "../driver/environ/phys_drvs/bmc.c", line 882

version: NetApp Release Diagnostic_5.4.3: Tue Oct  6 15:00:13 PDT 2009

cc flags: 8m
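For anyone triaging similar output, here is a minimal Python sketch that pulls the failed tests out of diagnostic result lines like the ones above. It is not a NetApp tool: the function name `summarize_diag` and the regular expression are assumptions based only on the line layout shown in this post, and the sample is a subset of the lines above.

```python
import re
from collections import Counter

# A subset of the result lines from the diagnostic output above.
DIAG = """\
CPU/NorthBridge check ....................... PASSED
External loopback test(xtnd only) ........... SKIPPED
****** Comprehensive IB test .................... PASSED
BMC Self Test ............................... FAILED
BMC SDR Read Test ........................... FAILED
BMC SEL Read Test ........................... FAILED
BMC Timer Test .............................. FAILED
"""

# Assumed layout: optional asterisks, test name, a run of dots, then the status.
RESULT_LINE = re.compile(r"^\**\s*(.+?)\s*\.{3,}\s*(PASSED|FAILED|SKIPPED)\s*$")


def summarize_diag(text):
    """Return (status counts, names of failed tests) for diagnostic result lines."""
    counts = Counter()
    failed = []
    for line in text.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match is None:
            continue  # banners, ERROR lines, and blank lines are ignored
        name, status = match.groups()
        counts[status] += 1
        if status == "FAILED":
            failed.append(name)
    return counts, failed


counts, failed = summarize_diag(DIAG)
print(dict(counts))
print(failed)
```

On the full output in this post, every failure falls under the BMC tests, while the CPU, GBE, FCAL, SAS, and IB tests pass or are skipped.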

   

It also prints these messages continually:

Mon May 31 15:12:08 GMT [bmc.batt.seal:err]: Cannot reseal battery (cmd=0x20)

Mon May 31 15:22:12 GMT [bmc.batt.unseal:err]: Cannot unseal battery (cmd=0x414).

Mon May 31 15:22:12 GMT [bmc.batt.seal:err]: Cannot reseal battery (cmd=0x20)

Waiting for nvram battery chargeup

Waiting for nvram battery chargeup

Mon May 31 15:32:16 GMT [bmc.batt.unseal:err]: Cannot unseal battery (cmd=0x414).

Mon May 31 15:32:16 GMT [bmc.batt.seal:err]: Cannot reseal battery (cmd=0x20)

Waiting for nvram battery chargeup

Mon May 31 15:42:21 GMT [bmc.batt.unseal:err]: Cannot unseal battery (cmd=0x414).

Mon May 31 15:42:21 GMT [bmc.batt.seal:err]: Cannot reseal battery (cmd=0x20)

Waiting for nvram battery chargeup

Mon May 31 15:52:25 GMT [bmc.batt.unseal:err]: Cannot unseal battery (cmd=0x414).

Mon May 31 15:52:25 GMT [bmc.batt.seal:err]: Cannot reseal battery (cmd=0x20)
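To check how regularly these messages repeat, the timestamps can be compared with a short Python sketch. The helper name `event_gaps` and the regular expression are assumptions made for illustration, based only on the message layout shown above; the sketch also assumes all messages fall on the same day, as they do here.

```python
import re

# Messages copied from the console output above.
LOG = """\
Mon May 31 15:12:08 GMT [bmc.batt.seal:err]: Cannot reseal battery (cmd=0x20)
Mon May 31 15:22:12 GMT [bmc.batt.unseal:err]: Cannot unseal battery (cmd=0x414).
Mon May 31 15:22:12 GMT [bmc.batt.seal:err]: Cannot reseal battery (cmd=0x20)
Waiting for nvram battery chargeup
Mon May 31 15:32:16 GMT [bmc.batt.seal:err]: Cannot reseal battery (cmd=0x20)
Mon May 31 15:42:21 GMT [bmc.batt.seal:err]: Cannot reseal battery (cmd=0x20)
Mon May 31 15:52:25 GMT [bmc.batt.seal:err]: Cannot reseal battery (cmd=0x20)
"""

# Assumed layout: weekday, month, day, HH:MM:SS, timezone, [event.name:severity]
EMS_LINE = re.compile(
    r"^\w{3} \w{3} +\d+ (\d\d):(\d\d):(\d\d) \w+ \[([\w.]+):(\w+)\]"
)


def event_gaps(log_text, event="bmc.batt.seal"):
    """Return the gaps in seconds between consecutive occurrences of one event."""
    times = []
    for line in log_text.splitlines():
        match = EMS_LINE.match(line)
        if match and match.group(4) == event:
            hours, minutes, seconds = (int(match.group(i)) for i in (1, 2, 3))
            times.append(hours * 3600 + minutes * 60 + seconds)
    return [later - earlier for earlier, later in zip(times, times[1:])]


gaps = event_gaps(LOG)
print(gaps)  # [604, 604, 605, 604]
```

The gaps come out at roughly ten minutes each, which suggests a periodic retry rather than a one-off error.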

We also tried the following, as recommended by NetApp: reseat the battery, then leave the system at the LOADER prompt for a couple of hours to see whether it would recharge.

Same problem:

Waiting for nvram battery chargeup

Waiting for nvram battery chargeup

Waiting for nvram battery chargeup

We got another battery, just to make sure that there was nothing wrong with the second one, and the result was still the same!

Then we decided to replace the controller once more!

We replaced it, and here are the messages as the node boots up:

AMI BIOS8 Modular BIOS

Copyright (C) 1985-2006,  American Megatrends, Inc. All Rights Reserved

Portions Copyright (C) 2006 Network Appliance, Inc. All Rights Reserved

BIOS Version 3.0

Mon Apr 30 11:27:39 GMT [bmc.batt.unseal:err]: Cannot unseal battery (cmd=0x414).

Mon Apr 30 11:27:39 GMT [bmc.batt.seal:err]: Cannot reseal battery (cmd=0x20)

+++++++++++++++++++++++++

Boot Loader version 1.3

Copyright (C) 2000,2001,2002,2003 Broadcom Corporation.

Portions Copyright (C) 2002-2006 Network Appliance Inc.

2048MB RAM installed

CPU Type: Mobile Intel(R) Celeron(R) CPU 2.20GHz

Starting AUTOBOOT press Ctrl-C to abort...

Loading:...................0x200000/41406612 0x297d094/14083740 0x36eb730/1732450 0x3892692/6 Entry at 0x00200000

Starting program at 0x00200000

Press CTRL-C for special boot menu

Mon Apr 30 11:29:28 GMT [cf.nm.nicTransitionUp:info]: Interconnect link 0 is UP

Data ONTAP Release 7.3.2: Thu Oct 15 04:45:30 PDT 2009 (IBM)

Copyright (c) 1992-2009 NetApp.

Starting boot on Mon Apr 30 11:28:20 GMT 2012

Mon Apr 30 11:29:37 GMT [nvram.battery.turned.on:info]: The NVRAM battery is turned ON. It is turned OFF during system shutdown.

Mon Apr 30 11:31:21 GMT [nvmem.battery.sensor.unread:info]: The battery state of the battery-backed memory (NVMEM) Batt 8.0V is not readable.

Mon Apr 30 11:31:21 GMT [nvmem.battery.sensor.unread:info]: The battery state of the battery-backed memory (NVMEM) Batt Amp is not readable.

Mon Apr 30 11:31:21 GMT [nvmem.battery.unreadable:CRITICAL]: The battery sensor of the battery-backed memory (NVMEM) Batt Run Time is not readable.

Mon Apr 30 11:31:21 GMT [nvmem.battery.sensor.unread:info]: The battery state of the battery-backed memory (NVMEM) Batt Temp is not readable.

Mon Apr 30 11:31:21 GMT [nvmem.voltage.high:CRITICAL]: The NVMEM supply voltage is high and the system is at a high risk of data loss if power fails.

Shutting down the system because of NVMEM 8.0V is in critical high state: Current reading is 10420 mV, critical high threshold is 8604 mV.
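To put the shutdown message in numbers: the reading of 10420 mV exceeds the 8604 mV critical high threshold by 1816 mV. A minimal Python sketch of that arithmetic follows; the function name `parse_overvoltage` is hypothetical, and the pattern assumes the exact wording of the message in this post.

```python
import re

# Fragment of the shutdown message from the console output above.
MESSAGE = "Current reading is 10420 mV, critical high threshold is 8604 mV."


def parse_overvoltage(message):
    """Return (reading_mv, threshold_mv, excess_mv, excess_percent)."""
    match = re.search(
        r"Current reading is (\d+) mV, critical high threshold is (\d+) mV",
        message,
    )
    if match is None:
        raise ValueError("message does not contain a reading and a threshold")
    reading = int(match.group(1))
    threshold = int(match.group(2))
    excess = reading - threshold
    return reading, threshold, excess, 100.0 * excess / threshold


reading, threshold, excess, percent = parse_overvoltage(MESSAGE)
print(f"{excess} mV over threshold ({percent:.1f}% above {threshold} mV)")
```

The sensor value sits about 21% above the threshold, which is far enough out of range that the reading itself may be suspect, given that the other battery sensors are reported as not readable.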

System will be powered down in 60 seconds

error saving boot reason: could not flush environment variables

-----MINI-MONITOR-----

    ?           bye         sync

Any help is much appreciated; we've been struggling with this for some time now!

2 REPLIES

scottgelb

What did support say? Escalate the case. There's not much you can do when RMA gear doesn't work other than escalate. See whether it's fixable or whether another RMA is needed. I haven't seen an overcharged battery before.

clackamas

Not sure about the 2020, but I know that on the 3240, when you replace the MB, you have to move the old battery to the new MB. You do this before you swap the RAM and the little CF card under the blue box.
