ONTAP Hardware

FAS2750 node A / system alert PSU1 FRU is not present in the chassis

Terry_Lui
7,171 Views

The alert "PSU1 FRU is not present in the chassis" is shown on node A. I tried unplugging and reseating the PSU, but the alert is still there. I have also tried rebooting the SP. May I know what the next step should be? Many thanks.

Logs are below:

-----------------------------------------------------------

WXXXXXXXXXX::> system health alert show
Node: WXXXXXXXXX-B
Resource: PSU1 FRU#042012002634
Severity: Critical
Indication Time: Mon Sep 06 07:02:21 2021
Suppress: false
Acknowledge: false
Probable Cause: PSU1 FRU is not present in the chassis 042012002634.
The nodes in this chassis are WXXXXXXXXX-A.
Possible Effect: The nodes in the chassis might not function
effectively or redundancy might be lost.
Corrective Actions: 1. Plug in PSU1 FRU correctly into the slot.
2. Refer to the Hardware specification guide for more information on the position of the field-replaceable unit (FRU) and ways to check or replace it.
3. Contact support personnel if the alert persists.

-------------------------------------------------------------------------------------------------------

event show:

monitor.fan.failed: Multiple fans has failed.
monitor.fru.info.unreadable: The inventory information of FRU PSU2 is not readable.
monitor.fru.info.unreadable: The inventory information of FRU PSU1 is not readable.
callhome.c.fan.fru.fault: Call home for CHASSIS FAN FRU FAILED: Multiple fans have failed
monitor.temp.unreadable: The controller temperature (Midplane 1 Temp) is not readable.
monitor.temp.unreadable: The controller temperature (Midplane 2 Temp) is not readable.
monitor.temp.unreadable: The controller temperature (Midplane 3 Temp) is not readable.
monitor.temp.unreadable: The controller temperature (Midplane 4 Temp) is not readable.
monitor.fan.failed: Multiple fans has failed.
monitor.fan.failed: Multiple fans has failed.
monitor.fan.failed: Multiple fans has failed.
monitor.fan.failed: Multiple fans has failed.
1 ACCEPTED SOLUTION

AlexDawson
7,073 Views

Open a case with the support center and start scheduling some downtime for a chassis swap, but I must stress this is very uncommon.

The reboot and the upgrade should fix it.


13 REPLIES

Terry_Lui
7,157 Views

Hi, thanks for your reply. But I checked system environment sensors show, and the PSU1 FRU state on node A is fault.

-----------------------------------------------

WXXXXXXXX::> system environment sensors show
Node          Sensor          State        Value/Units
------------- --------------- ------------ -----------
WXXXXXXXXX-A
              PSU2 FRU        fault        MULTIFAULT
              PSU1 FRU        fault        MULTIFAULT
              SP Status       normal       IPMI_HB_OK
              mSATA Status    normal       OK
              mSATA Pres      normal       PRESENT
              Partner Status  failed
              PSU1 Present    normal       PRESENT
              PSU1 5V         init-failed  - mV  -  -  -  -
              PSU1 12V        init-failed  - mV  -  -  -  -

--------------------------------------------------------

The fault light on the NetApp system is also on.

Terry_Lui
7,085 Views

Thanks again, I will take a look at those KBs.

andris
7,102 Views

Reboot the BMCs on both nodes in the chassis.

Then update them to BMC 11.6.

The sensor reading failures can indicate issues with IPMI communication or BMC unhealthiness.

Create a case with Tech Support, ideally.
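As a rough illustration of the reasoning above, the following Python sketch tallies event names from a few lines of the "event show" excerpt in the first post. When fan, FRU inventory, and temperature readings all fail together, a shared cause such as the BMC or the IPMI path is more plausible than several independent hardware failures. The function name tally_events and the sample text are illustrative assumptions, not part of any NetApp tool.

```python
from collections import Counter

# A few event lines copied from the "event show" excerpt in the first post.
EVENTS = """\
monitor.fan.failed: Multiple fans has failed.
monitor.fru.info.unreadable: The inventory information of FRU PSU2 is not readable.
monitor.fru.info.unreadable: The inventory information of FRU PSU1 is not readable.
monitor.temp.unreadable: The controller temperature (Midplane 1 Temp) is not readable.
"""

def tally_events(text):
    """Count events by message name (the part before the first colon)."""
    names = (
        line.split(":", 1)[0].strip()
        for line in text.splitlines()
        if ":" in line
    )
    return Counter(names)

counts = tally_events(EVENTS)
for name, n in counts.most_common():
    print(f"{name}: {n}")
```

Several distinct message names in the tally (fan, FRU, temperature) point toward a common monitoring problem rather than one failed part.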

Terry_Lui
7,084 Views

Cool, thanks andris. But is there any service impact during the BMC upgrade?

AlexDawson
7,079 Views

BMC upgrades are generally non-disruptive - rebooting the BMC first will help ensure a smooth upgrade leading to non-disruptive activity. Command is "system service-processor reboot-sp -node nodename"
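Since the earlier advice was to reboot the BMC on both nodes in the chassis, here is a minimal Python sketch that prints the reboot command quoted above once per node. The helper name reboot_sp_commands and the node names are hypothetical placeholders. The sketch only builds the command text and does not connect to a cluster.

```python
def reboot_sp_commands(nodes):
    """Return one 'system service-processor reboot-sp' command line per node."""
    return [f"system service-processor reboot-sp -node {node}" for node in nodes]

# Hypothetical node names; substitute the real names of both nodes in the chassis.
for cmd in reboot_sp_commands(["cluster1-A", "cluster1-B"]):
    print(cmd)
```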

Terry_Lui
7,079 Views

Thanks for your reply Alex, I will try it later. I saw a KB that said upgrading the shelf firmware will help in this case. If I also upgrade the shelf firmware, is there any service impact?

AlexDawson
7,078 Views

Shelf firmware update is also generally non-disruptive, but it is important to perform the BMC reboot and BMC upgrade first.

Terry_Lui
7,076 Views

That's cool, thanks again Alex. If the alert still exists after upgrading the SP and shelf firmware, what should I do as the next step?

AlexDawson
7,074 Views

Open a case with the support center and start scheduling some downtime for a chassis swap, but I must stress this is very uncommon.

The reboot and the upgrade should fix it.

Terry_Lui
7,071 Views

I hope the upgrade fixes this problem. I will try to contact the customer to upgrade those parts first. BTW, thanks for your help again Alex.

Terry_Lui
6,884 Views

Hi Alex, I updated the SP and disk shelf firmware, but system environment sensors show still reports the status below:

PSU2 FRU        fault        MULTIFAULT
PSU1 FRU        fault        MULTIFAULT
SP Status       normal       IPMI_HB_OK
mSATA Status    normal       OK
mSATA Pres      normal       PRESENT
Partner Status  failed
PSU1 Present    normal       PRESENT
PSU1 5V         init-failed  - mV  -  -  -  -
PSU1 12V        init-failed  - mV  -  -  -  -
PSU1 5V Curr

 

What should I do next?
