ONTAP Hardware
Hi all,
I am seeing this error every hour in the messages log:
[NAS02:monitor.shelf.fault:CRITICAL]: Fault reported on disk storage shelf attached to channel 3a. Please check fans, power supplies, disks, and temperature sensors.
We replaced the cables and the shelf I/O modules, but the error still persists.
Has anyone had the same type of issue, and if so, how was it resolved?
We are on 8.1 RC3.
Thanks in advance.
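For anyone who wants to track how often this message shows up, here is a minimal Python sketch that pulls the host, severity, and channel out of a log line. The regular expression is inferred only from the single message quoted above, and the function name `parse_shelf_fault` is made up for illustration, so treat it as a starting point rather than a full EMS parser.

```python
import re

# Pattern inferred from the one log line quoted in this thread;
# other EMS messages may be laid out differently.
SHELF_FAULT = re.compile(
    r"\[(?P<host>[^:\]]+):(?P<event>monitor\.shelf\.fault):(?P<severity>[A-Za-z]+)\]:"
    r".*?attached to channel (?P<channel>\w+)"
)


def parse_shelf_fault(line):
    """Return host, event, severity and channel as a dict, or None if no match."""
    match = SHELF_FAULT.search(line)
    return match.groupdict() if match else None


sample = (
    "[NAS02:monitor.shelf.fault:CRITICAL]: Fault reported on disk storage shelf "
    "attached to channel 3a. Please check fans, power supplies, disks, and "
    "temperature sensors."
)
print(parse_shelf_fault(sample))
```

Grouping the parsed results by channel makes it easy to see whether the hourly fault always points at the same SAS channel.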
Definitely open a case on this, but in my experience it often (not always) requires a power cycle of the shelf.
Get a support recommendation, but typically this is what we do when we see this:
First, check "environment shelf" and see if there are any errors.
Next, replace/reseat the cables (which you already did) and reseat/replace the I/O modules (which you also did).
If none of those fix it and there is no error in ONTAP, downtime to power cycle the shelf is almost always the fix that sticks. We just had one at a customer and worked with support through all the workarounds before having to power cycle. It doesn't occur often and wasn't urgent, so we waited for their next downtime window and performed the quick maintenance to resolve the issue. Same thing there: the error showed up every hour.
I powered down the filers and disk shelves and powered them back up. I still see the same alert again and again.
Definitely open a case. Possibly some other type of failure, or a part replacement is needed. Any errors in the “environ shelf” output? Is the system MPHA?
I see the error below.
On shelf 0:
SAS connector attached element list: 1, 2, 3, 4; with error: 2
SAS cable information by element:
[1] Vendor: Molex Inc.
Type: QSFP copper 2m ID: 00 Swaps: 0
[2] Vendor: <N/A>
Type: <N/A> <N/A> <N/A> ID: <N/A> Swaps: 0
[3] Vendor: Molex Inc.
Type: QSFP copper 2m ID: 00 Swaps: 0
[4] Vendor: Molex Inc.
Type: QSFP copper 0.5m ID: 01 Swaps: 0
On shelf 1:
SAS connector attached element list: 1, 2, 3, 4; with error: 1
SAS cable information by element:
[1] Vendor: <N/A>
Type: <N/A> <N/A> <N/A> ID: <N/A> Swaps: 0
[2] Vendor:
Type: <N/A> optical 0m ID: 00 Swaps: 0
[3] Vendor: Molex Inc.
Type: QSFP copper 0.5m ID: 00 Swaps: 0
[4] Vendor: Molex Inc.
Type: QSFP copper 5m ID: 01 Swaps: 1
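To spot the suspect connectors in output like this without reading every line, a rough Python sketch follows. It assumes the text looks exactly like the paste above (a "with error:" list, then "[n] Vendor:" lines); the function name `suspect_sas_elements` is hypothetical and the layout may differ on other ONTAP versions.

```python
import re


def suspect_sas_elements(text):
    """Return (reported, unidentified) element numbers from the pasted output.

    reported:     elements listed after "with error:"
    unidentified: elements whose vendor is blank or <N/A>
    """
    reported = []
    match = re.search(r"with error:[ \t]*([\d, ]+)", text)
    if match:
        reported = [int(n) for n in re.findall(r"\d+", match.group(1))]

    unidentified = []
    vendor_lines = re.findall(
        r"^\s*\[(\d+)\]\s*Vendor:[ \t]*(.*)$", text, re.MULTILINE
    )
    for number, vendor in vendor_lines:
        if vendor.strip() in ("", "<N/A>"):
            unidentified.append(int(number))
    return reported, unidentified


# Shelf 0 section, copied from the output above.
shelf_0 = """SAS connector attached element list: 1, 2, 3, 4; with error: 2
SAS cable information by element:
[1] Vendor: Molex Inc.
Type: QSFP copper 2m ID: 00 Swaps: 0
[2] Vendor: <N/A>
Type: <N/A> <N/A> <N/A> ID: <N/A> Swaps: 0
[3] Vendor: Molex Inc.
Type: QSFP copper 2m ID: 00 Swaps: 0
[4] Vendor: Molex Inc.
Type: QSFP copper 0.5m ID: 01 Swaps: 0
"""
print(suspect_sas_elements(shelf_0))
```

An element that appears in both lists (element 2 on shelf 0 here) is a reasonable first candidate when deciding which cable to reseat or swap.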
Did support get back to you on this? Did you replace the SAS HBA too?
Support replaced the SAS cables.
I asked them to do a health check on the SAS card as well.
SAS card is the only thing left I can think of…
Replaced them with the new SAS cables that were sent from NetApp.
That resolved the issue.
Hi,
We received this error:
Sun Jan 5 02:00:00 CST [USTO-PFSX-X01:monitor.shelf.fault:CRITICAL]: Fault reported on disk storage shelf attached to channel 6a. Please check fans, power supplies, disks, and temperature sensors.
The non-disruptive steps that can be taken are as follows:
Replace the module and cable (already on site), confirm path redundancy, and monitor for the fault message to recur.
If the fault message recurs, then proceed with the NDR shelf power cycle:
Filer USTO-PFSX-X01 > storage power_cycle shelf start -c 6a -s 3
Just to be perfectly upfront, these non-disruptive actions may not resolve the fault message.
But this plan is worth executing at this point before considering replacing a shelf.
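For the "monitor for the fault message to recur" step, something like the Python sketch below could check a saved copy of the messages log for new shelf faults on a given channel after the maintenance time. The line layout is assumed from the one message quoted in this post, the function name `faults_after` is made up, and since the log line carries no year, the year has to be supplied by the caller (2014 in the example is only an assumption).

```python
import re
from datetime import datetime

MONTHS = {
    name: number
    for number, name in enumerate(
        "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1
    )
}

# Layout assumed from the single log line quoted above.
LINE = re.compile(
    r"^\w{3} (?P<mon>\w{3})\s+(?P<day>\d{1,2}) "
    r"(?P<h>\d{2}):(?P<m>\d{2}):(?P<s>\d{2}) \w+ "
    r"\[[^:\]]+:monitor\.shelf\.fault:\w+\]:.*channel (?P<channel>\w+)"
)


def faults_after(lines, channel, since, year):
    """Return timestamps of shelf faults on `channel` logged after `since`."""
    hits = []
    for line in lines:
        match = LINE.match(line)
        if not match or match["channel"] != channel:
            continue
        if match["mon"] not in MONTHS:
            continue
        when = datetime(
            year, MONTHS[match["mon"]], int(match["day"]),
            int(match["h"]), int(match["m"]), int(match["s"]),
        )
        if when > since:
            hits.append(when)
    return hits


log_line = (
    "Sun Jan 5 02:00:00 CST [USTO-PFSX-X01:monitor.shelf.fault:CRITICAL]: "
    "Fault reported on disk storage shelf attached to channel 6a. Please check "
    "fans, power supplies, disks, and temperature sensors."
)
# Year 2014 and the maintenance time are assumptions for this example.
maintenance_done = datetime(2014, 1, 5, 1, 0, 0)
print(faults_after([log_line], "6a", maintenance_done, year=2014))
```

An empty result for a few hours after the module and cable swap would suggest the fix held. Any hit would be the cue to move on to the shelf power cycle.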