Talk and ask questions about NetApp FAS and AFF series unified storage systems. Talk with other members about how to optimize these powerful data storage systems.
**[SolidFire SF4805] All 10 Drives Failed Simultaneously: Suspected SAS Controller Failure**

Hi Community,

I'm dealing with a SolidFire SF4805 node issue and would appreciate any advice, especially since this unit is out of warranty.

---

**Environment**

- Cluster: 4-node SolidFire SF4805, Element OS 11.8.0.23
- Affected Node: SF03 (Node ID: 3)
- Cluster is still operational with remaining 3 nodes

---

**Symptoms**

- All 10 Samsung SSDs on SF03 show Status = **failed** simultaneously
- Active Drives on SF03 = 0
- Node Status = active (node is online but serving no storage)
- Replication Port = "-" (not participating in cluster replication)

**Active Alerts:**

- `hardwareConfigMismatch`: MPTSAS_BIOS_VERSION = Unknown (expected != Unknown)
- `hardwareConfigMismatch`: MPTSAS_FIRMWARE_VERSION = Unknown (expected != Unknown)
- `irqBalanceFailed`: mpt3sas0-msix0 through msix7 interrupts not found
- `networkConfig`: eth1 and eth3 down
- `notUsingLACPBondMode`: Bond10G not using LACP

**Hardware Check Output (xCheck):**

```
MPTSAS_BIOS_VERSION: Passed=false, actual=Unknown
MPTSAS_FIRMWARE_VERSION: Passed=false, actual=Unknown
(All other components: CPU, RAM, NIC, BIOS, iDRAC → Passed=true)
```

---

**My Analysis**

All symptoms point to the **SAS HBA controller (mpt3sas / LSI) being undetectable** by the OS. Since all 10 drives failed at exactly the same time rather than individually, I believe the drives themselves are likely still healthy; the controller is simply not being recognized on the PCIe bus.

Firmware versions are also behind:

- iDRAC: running 2.40.40.40 (current: 2.75.75.75)
- BIOS: running 2.2.5 (current: 2.8.0)

---

**Questions**

1. Can anyone confirm this is a SAS controller hardware failure rather than a firmware/software issue?
2. What is the exact SAS controller model used in the SF4805? (Trying to source a replacement)
3. Has anyone successfully replaced the SAS controller on an SF-series node and recovered the drives/data?
4. If I reseat or replace the controller and the node rejoins the cluster, will Element OS automatically re-add the drives, or is there a manual process?
5. Any risk of data loss on the drives themselves if the controller is replaced?

---

**What I've Tried**

- Verified all alerts in Element OS UI (Reporting → Alerts)
- Confirmed Node Details: all 10 drive slots showing failed
- Reviewed hardware check JSON output
- Cross-referenced firmware versions against NetApp docs: https://docs.netapp.com/us-en/element-software/hardware/fw_storage_nodes.html#sf_nodes

---

Any guidance is greatly appreciated. Thanks in advance!
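
For anyone triaging a similar node, the reasoning in the analysis above (only the MPTSAS items fail while everything else passes) can be checked mechanically. The following is a minimal sketch, not a NetApp tool: it assumes the hardware check results have been flattened into lines of the form quoted in the post (`NAME: Passed=true|false, actual=VALUE`), and the helper names `failed_checks` and `points_to_sas_controller` are hypothetical. The real hardware check output is JSON and would need its own parsing.

```python
import re

# Line format assumed from the summary quoted in the post, e.g.
# "MPTSAS_BIOS_VERSION: Passed=false, actual=Unknown"
CHECK_LINE = re.compile(
    r"^(?P<name>\w+):\s*Passed=(?P<passed>true|false),\s*actual=(?P<actual>.+)$"
)


def failed_checks(text):
    """Return {component: actual_value} for every check with Passed=false."""
    failures = {}
    for line in text.splitlines():
        match = CHECK_LINE.match(line.strip())
        if match and match.group("passed") == "false":
            failures[match.group("name")] = match.group("actual").strip()
    return failures


def points_to_sas_controller(failures):
    """Heuristic (an assumption, not a vendor diagnostic):
    at least one check failed and every failed check is an MPTSAS_* item."""
    return bool(failures) and all(name.startswith("MPTSAS_") for name in failures)


sample = """MPTSAS_BIOS_VERSION: Passed=false, actual=Unknown
MPTSAS_FIRMWARE_VERSION: Passed=false, actual=Unknown"""

failures = failed_checks(sample)
print(failures)
print(points_to_sas_controller(failures))
```

A `True` result only says the failures are confined to the SAS HBA checks; it cannot distinguish a dead controller from a firmware or PCIe seating problem.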
Hello All, One of my clients has a FAS2720, and recently both the node controller and the boot flash failed at the same time. I found the replacement manual but couldn't find any instructions on how to replace them both at once. Has anyone else come across something like this?
Can I reassign aggregate ownership and boot an ONTAP 9.8 aggregate and system vol0 under 9.13 on a FAS2820? Or do I need to bring in a FAS2750 to step through 9.10 or so in the process?
Dear community, can anybody share the digital version of the 2025 Hardware Universe poster with me? I couldn't find it in HWU. Regards, Karoly