ONTAP Hardware
Hi NetApp Community,
I have a NetApp FAS2240 running Data ONTAP 8.2 in 7-Mode. We seem to have hit the following bug, which affects SSDs that have been powered on for 70,000 hours (about 8 years). The SSDs that were configured as the Flash Pool have failed because their firmware was not updated.
https://mysupport.netapp.com/site/bugs-online/product/ONTAP/BURT/1335350
The failed SSDs have been replaced with reconditioned SSDs from a third-party supplier, but we are getting plex errors.
Does anyone know how to resolve this issue?
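For anyone checking their own drives against this threshold, the hours-to-years arithmetic can be sketched as below. This is only an illustration: the 70,000-hour figure is taken from the post above rather than verified against the BURT, and the function names are hypothetical, not part of any NetApp tool.

```python
# Average hours in a year, counting leap days (365.25 days).
HOURS_PER_YEAR = 24 * 365.25


def power_on_years(hours: float) -> float:
    """Convert a power-on-hours counter into years of continuous uptime."""
    return hours / HOURS_PER_YEAR


def hours_remaining(current_hours: float, threshold_hours: float = 70_000) -> float:
    """Hours left before a drive reaches the threshold (0 if already past it).

    The 70,000-hour default is the figure quoted in this thread (assumption).
    """
    return max(threshold_hours - current_hours, 0)


print(round(power_on_years(70_000), 2))  # 7.99
print(hours_remaining(65_000))           # 5000
```

A drive that has been powered on continuously reaches the quoted threshold just short of its eighth year, which matches the "70,000 hours / 8 years" figure above.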
Have you tried logging a case? You may only get limited support, considering the ONTAP version on your end-of-life filer. I don't know the answer to this issue, but just a thought: if this is a Flash Pool, can you not destroy it and rebuild, considering it's a cache for the data aggregate?
Flash Pool doesn't work that way - the data lives on the Flash Pool RG.
"Data inserted into the cache by using the write caching policy exists only in cache; there is no copy in HDDs. Flash Pool cache is RAID protected."
Also note: you can't remove the Flash Pool RG from the aggr without destroying the aggr.
Thanks for the reply. I was thinking of this as an option myself, but cannot find the commands on how to do this.
This issue/KB/BURT came up the other day with a fellow A-Teamer, and I went looking through the BURT. Unfortunately, there didn't appear to be a way to reset or revive the drives in the dead RAID group.
There is most likely data on the SSDs that you replaced. If you do plan to send the system out for recovery, they will most likely want those too.
Hi SpindleNinja,
Thanks for the replies. Most of the documentation about Flash Pool states that it is not possible to disable Flash Pool without destroying the aggregate. However, the KB article below mentions that it is possible to disable Flash Pool, though it seems this would need to be done by NetApp Support.
We may need to log a one-off support call with NetApp to resolve this.
The only other thing that I can think of would be to reinitialise the FlashPool, but I have seen no information on how to do this from a maintenance mode boot.
Yes, log a case with NetApp. Do let us know what they suggest for your case.
Yup, per the KB, Support L2 is required for the procedure.
Re the reinitialize comment: are you trying to just get an aggr online to use at this point, or still trying to recover it? The "disabling" won't help with the recovery part, unfortunately.
If you just want a new aggr to work off of, you can try to delete the offline/dead one and start over after zeroing the drives.
Hi SpindleNinja,
We are trying to bring the aggregate back online as we need to recover data from it.
Is there any way to initialize the Flash Pool RAID group in maintenance mode?
Not that I'm aware of.
Typically, though, any initialize zeroes out the drives, and to attempt any recovery you'll need to keep the original drives intact as they are.
What's the aggr's status? I'm assuming it's showing offline.
If it's showing as online, restricted or degraded, you can try wafliron. (This is typically done under the recommendation of Support, and I think this system and ONTAP version are currently out of support in general.)
https://kb.netapp.com/Advice_and_Troubleshooting/Data_Storage_Software/ONTAP_OS/Overview_of_wafliron
Note - I've never heard of it being used in the context of fixing or recovering a hybrid aggr, though.
There might be some companies out there that specialize in data recovery of this sort. Let me see if I can get a name.
Give these folks a look - https://www.ontrack.com/en-gb
Thanks for the link for recovery.
The aggregate is currently offline as one of the plexes has an error.
Hi, have you resolved this problem?
Thanks a lot.
The only option is to engage a third party data recovery company such as Kroll OnTrack. If the flashpool SSDs fail, the whole aggregate fails.
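The reason a dead SSD RAID group takes the whole aggregate with it can be sketched with a toy model. This is a simplified illustration, not how ONTAP is implemented: the class, the RAID group names, the RAID types chosen for each group and the failure counts are all assumptions made for the example. The only facts relied on are that RAID4 tolerates one failed disk per RAID group, RAID-DP tolerates two, and that an aggregate needs every one of its RAID groups.

```python
from dataclasses import dataclass

# Number of simultaneous disk failures each RAID type can absorb per RAID group.
TOLERATED_FAILURES = {"raid4": 1, "raid_dp": 2}


@dataclass
class RaidGroup:
    name: str          # hypothetical label, e.g. "rg0"
    raid_type: str     # "raid4" or "raid_dp"
    failed_disks: int  # disks currently failed in this group

    def is_intact(self) -> bool:
        """A group survives only while failures stay within its parity budget."""
        return self.failed_disks <= TOLERATED_FAILURES[self.raid_type]


def aggregate_is_usable(raid_groups: list[RaidGroup]) -> bool:
    """An aggregate is usable only if every RAID group in it is intact."""
    return all(rg.is_intact() for rg in raid_groups)


# Assumed layout: healthy HDD group plus an SSD cache group that lost all its disks.
hdd_rg = RaidGroup("rg0", "raid_dp", failed_disks=0)
ssd_rg = RaidGroup("rg1", "raid4", failed_disks=4)

print(aggregate_is_usable([hdd_rg]))          # True
print(aggregate_is_usable([hdd_rg, ssd_rg]))  # False
```

Because the SSD RAID group holds write-cached data with no copy on the HDDs, it counts as a full member of the aggregate in this model, so the healthy HDD group alone is not enough to bring the aggregate online.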