ONTAP Hardware

FAS2520 Failed flash pool, bug 1335350

LBCena

Hello NA Community,

 

I had to power off our old FAS2520 for maintenance, and I believe all the SSDs in our Flash Pool hit bug 1335350, where drives fail after a power cycle once they exceed 70,000 power-on hours. I'm fairly sure ours had exceeded that threshold.

 

The bug details say to contact support; unfortunately, this is now an outdated system running an outdated ONTAP version (otherwise I wouldn't have hit this issue in the first place), so I can no longer open a support case for it. Has anyone here encountered this issue, and is there a way to make SSDs that exceeded 70k power-on hours work again? I assume updating the FAS2520 now wouldn't help, since the system would be unable to flash new firmware onto SSDs that are already marked as failed/broken.

 

Thank you in advance.

 
1 ACCEPTED SOLUTION

andris

If your drive model string matches the ones listed in Bug 1335350 and in this Support Bulletin:
SU448: [Impact: Critical] SSD (PHM2*) firmware to prevent data loss / unavailability

 

Then the drive is not recoverable in the field. You would need to have support entitlement to pursue recovery efforts that would involve NetApp resources.
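To check whether your drives are affected, you can compare the disk model and firmware strings against the bug and bulletin. A minimal sketch from the clustershell (the exact field names below are standard `storage disk show` fields; the aggregate name is a placeholder):

```
::> storage disk show -fields model,firmware-revision,serial-number
```

If the model string matches the PHM2* family called out in SU448 and the firmware predates the fixed release, the drives are in scope for this bug.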



LBCena

That's exactly what I was worried about. I can confirm the model string of the SSDs matches, the system is 8+ years old so the drives most likely ran for more than 70k hours, and ONTAP is at 9.5P9, so the firmware fix was never applied to the disks.

 

To add insult to injury, the impacted aggregate also hosted the root volume of our only SVM (serving NFS & CIFS), so client machines lost access to all data on the filer.

 

Fortunately, we had a SnapVault/mirror copy on another system, so the recovery went like this:

1. Created a new root volume for the SVM on a working aggregate and made the SVM use that volume.
2. Destroyed the broken aggregate (force delete with diag privileges).
3. Recreated the aggregate without the Flash Pool and recreated the volumes.
4. Reset/checked the junction paths and the various policies.
5. Restored the data from the vault.
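For anyone facing the same recovery, the steps above roughly correspond to the following clustershell commands. This is only a sketch: all vserver, volume, aggregate, and path names are placeholders, sizes and disk counts depend on your system, and `volume make-vsroot` requires advanced privilege:

```
::> volume create -vserver svm1 -volume svm1_root_new -aggregate aggr_good -size 1g

::> set -privilege advanced
::*> volume make-vsroot -vserver svm1 -volume svm1_root_new

::*> storage aggregate offline -aggregate aggr_broken
::*> storage aggregate delete -aggregate aggr_broken

::> storage aggregate create -aggregate aggr_new -diskcount 12

::> volume create -vserver svm1 -volume data1 -aggregate aggr_new -size 500g
::> volume mount -vserver svm1 -volume data1 -junction-path /data1

::> snapmirror restore -source-path vault_svm:data1_vault -destination-path svm1:data1
```

Export policies, share/export definitions, and any other SVM configuration tied to the old volumes still need to be re-checked by hand after the restore.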
