AFF

EF570 (DE6000F): 2 drives failed in RAID 6

SoichiFromLenovo
5,910 Views

Under a heavy load with an access size of 4 KB (Queue=128, Random=100%, Write=100%), 2 drives in the RAID 6 fail.
This happens every time, and the failed drives cannot be reused.

Firmware: 08.62.00.02.001

NVSRAM: N5700-862870-D0

*8.53.x does not show this failure.

Does anyone know about this?
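
For anyone who wants to try a comparable load, here is a minimal sketch that writes out an fio job file with the same parameters. The post does not say which load generator was used, so fio, the libaio engine, the mapping of Queue=128 to `iodepth`, the 600 second runtime, and the target path are all assumptions. `build_fio_job` is a hypothetical helper name.

```python
# Sketch only: builds fio job-file text for the workload described in the post.
# fio itself is an assumption; the original poster did not name the tool used.

def build_fio_job(target: str, runtime_s: int = 600) -> str:
    """Return job-file text for a 4 KB, 100% random, 100% write load at queue depth 128."""
    options = {
        "ioengine": "libaio",      # assumption: Linux host with libaio available
        "direct": "1",             # bypass the host page cache
        "rw": "randwrite",         # Random=100%, Write=100%
        "bs": "4k",                # 4 KB access size
        "iodepth": "128",          # Queue=128, read here as queue depth
        "time_based": "1",
        "runtime": str(runtime_s), # assumption: the post gives no duration
        "filename": target,
    }
    lines = ["[raid6-4k-randwrite]"]
    lines += [f"{key}={value}" for key, value in options.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical placeholder path. Writing to a real device destroys its data.
    print(build_fio_job("/dev/mapper/example_volume"))
```

The script only prints the job text. Running it against a production volume is not something to do casually, given the drive failures reported here.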


FX0001.png
Drive
Location          Manufacturer  Product ID    Drive Type  Capacity       Drive Firmware  FPGA Version
Shelf 99, Bay 0   TOSHIBA       PX05SVB080    SAS         739.712 GB     LE01            Not Available
Shelf 99, Bay 1   TOSHIBA       PX05SVB080    SAS         739.712 GB     LE01            Not Available
Shelf 99, Bay 2   TOSHIBA       PX05SVB080    SAS         739.712 GB     LE01            Not Available
Shelf 99, Bay 3   TOSHIBA       PX05SVB080    SAS         739.712 GB     LE01            Not Available
Shelf 99, Bay 4   TOSHIBA       PX05SVB080    SAS         739.712 GB     LE01            Not Available
Shelf 99, Bay 5   TOSHIBA       PX05SVB080    SAS         739.712 GB     LE01            Not Available
Shelf 99, Bay 6   TOSHIBA       PX05SVB080    SAS         Not Available  LE01            Not Available
Shelf 99, Bay 7   TOSHIBA       PX05SVB080    SAS         739.712 GB     LE01            Not Available
Shelf 99, Bay 8   TOSHIBA       PX05SVB080    SAS         739.712 GB     LE01            Not Available
Shelf 99, Bay 9   TOSHIBA       PX05SVB080    SAS         739.712 GB     LE01            Not Available
Shelf 99, Bay 10  TOSHIBA       PX05SVB080    SAS         739.712 GB     LE01            Not Available
Shelf 99, Bay 11  TOSHIBA       PX05SVB080    SAS         739.712 GB     LE01            Not Available
Shelf 99, Bay 12  TOSHIBA       PX05SVB080    SAS         Not Available  LE01            Not Available
Shelf 99, Bay 13  TOSHIBA       PX05SVB080    SAS         739.712 GB     LE01            Not Available
Shelf 99, Bay 14  TOSHIBA       PX05SVB080    SAS         739.712 GB     LE01            Not Available
Shelf 99, Bay 15  TOSHIBA       PX05SVB080    SAS         739.712 GB     LE01            Not Available
Shelf 99, Bay 16  TOSHIBA       PX05SVB080    SAS         739.712 GB     LE01            Not Available
Shelf 99, Bay 17  TOSHIBA       PX05SVB080    SAS         739.712 GB     LE01            Not Available
Shelf 99, Bay 18  TOSHIBA       PX05SVB080    SAS         Not Available  LE01            Not Available
Shelf 99, Bay 19  TOSHIBA       PX05SVB080    SAS         Not Available  LE01            Not Available
Shelf 99, Bay 20  TOSHIBA       KPM51VUG800G  SAS         739.712 GB     LE02            Not Available
Shelf 99, Bay 21  TOSHIBA       KPM51VUG800G  SAS         739.712 GB     LE02            Not Available
Shelf 99, Bay 22  TOSHIBA       KPM51VUG800G  SAS         739.712 GB     LE02            Not Available
Shelf 99, Bay 23  TOSHIBA       KPM51VUG800G  SAS         739.712 GB     LE02            Not Available
9 REPLIES

SpindleNinja
5,857 Views

Can you clarify your question? Or are you reporting a bug that needs to be looked at?

 

andris
5,845 Views

It sounds like you should open a technical case with NetApp so that it can be investigated.

SoichiFromLenovo
5,661 Views

Thanks! Yes.

SoichiFromLenovo
5,658 Views

Yes, I have opened a support request now.
I posted here to check whether anyone already knew about this problem.
Thanks!

NetApp_RZ
5,671 Views

Hello Soichi,

Were you able to open a support case for this?
We will need to look at a full support bundle and trace buffers collection after the failure to understand why the drives are failing during your IO test.

SoichiFromLenovo
5,613 Views

Hi,

Thanks for your reply.
I have now opened a support case via IBM Support (IBM MCC).
This product is sold by Lenovo, so the case must go through IBM Support.
I will post an update later, within the limits of what can be disclosed.
Thanks!



NetApp_RZ
5,610 Views

Soichi,

 

Thanks for the update.
I was looking in our system for a possible case surrounding the issue, as I am also curious about what the drives are doing to hit the fail criteria during that IO test.
It could be timeouts or aborts, but it could be other things too.
Either way, IBM/Lenovo does have escalation paths into NetApp should a deeper look at the issue be needed, so I will keep an eye on this thread.

Thanks  🙂

SoichiFromLenovo
5,168 Views

I got an answer from support.
The cause of this failure seems to be the firmware of the SSD.
After updating the SSD firmware to LE03, the failure no longer occurred.

The failed drives were running firmware version LE01.
Drive vendors generally do not disclose the details of their fix lists.

If you are using the PX05SVB080 or PX05SVB160 (at least), I recommend updating the drive firmware right away.


Thanks!
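
A quick way to spot drives that still need the update is to scan a drive inventory listing like the one in the first post. The sketch below assumes that row layout (location, manufacturer, product ID, type, capacity, firmware), treats only the two models named above as affected, and takes LE03 as the fixed level as reported in this thread. `drives_needing_update` is a hypothetical helper, not part of any NetApp or Lenovo tool.

```python
import re

# Assumptions taken from this thread: affected models and the fixed level (LE03).
AFFECTED_MODELS = {"PX05SVB080", "PX05SVB160"}
FIXED_LEVEL = 3

# Matches rows such as:
#   Shelf 99, Bay 0 TOSHIBA PX05SVB080 SAS 739.712 GB LE01 Not Available
ROW = re.compile(
    r"Shelf\s+(?P<shelf>\d+),\s+Bay\s+(?P<bay>\d+)\s+\S+\s+(?P<model>\S+)\s+\S+\s+"
    r"(?:Not\s+Available|[\d.]+\s+\w+)\s+(?P<prefix>[A-Z]{2})(?P<level>\d{2})\b"
)


def drives_needing_update(listing: str) -> list[tuple[int, int, str]]:
    """Return (shelf, bay, firmware) for affected models below the fixed level."""
    flagged = []
    for line in listing.splitlines():
        m = ROW.search(line)
        if not m:
            continue  # header lines and anything in another format are skipped
        if m["model"] in AFFECTED_MODELS and int(m["level"]) < FIXED_LEVEL:
            flagged.append((int(m["shelf"]), int(m["bay"]), m["prefix"] + m["level"]))
    return flagged


if __name__ == "__main__":
    sample = """\
Shelf 99, Bay 0 TOSHIBA PX05SVB080 SAS 739.712 GB LE01 Not Available
Shelf 99, Bay 6 TOSHIBA PX05SVB080 SAS Not Available LE01 Not Available
Shelf 99, Bay 20 TOSHIBA KPM51VUG800G SAS 739.712 GB LE02 Not Available
"""
    print(drives_needing_update(sample))  # [(99, 0, 'LE01'), (99, 6, 'LE01')]
```

The KPM51VUG800G drives are not flagged because the thread does not say whether they are affected.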

NetApp_RZ
5,164 Views

Thank you so much for the update, Soichi.

Very glad to hear the issue is resolved now.

Now that I know which drive model it was, I checked whether NetApp also deploys the same drive, and we do.
For both of those drives we have also switched from MS01 to MS03, and the public information we have for MS03 also mentions slow performance and drive failure issues.

My understanding is that Lenovo's firmware versions are the same as ours, with just the first two letters changed from either MS or NE to LE.

https://mysupport.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=1249633
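
To compare firmware levels across the two naming schemes, something like the following sketch could help. It rests entirely on the understanding stated above, that only the two-letter prefix (MS, NE, or LE) differs between vendors. That has not been confirmed by the drive vendor, and both function names are hypothetical.

```python
# Assumption from this thread: MS/NE (NetApp) and LE (Lenovo) versions with the
# same numeric part refer to the same drive firmware release.
VENDOR_PREFIXES = {"MS", "NE", "LE"}


def firmware_level(version: str) -> str:
    """Return the numeric part of a version such as 'LE03' (here '03')."""
    prefix, level = version[:2], version[2:]
    if prefix not in VENDOR_PREFIXES or not level.isdigit():
        raise ValueError(f"unrecognised firmware version: {version!r}")
    return level


def same_release(a: str, b: str) -> bool:
    """True when two versions differ only in their vendor prefix."""
    return firmware_level(a) == firmware_level(b)


if __name__ == "__main__":
    print(same_release("LE03", "MS03"))  # True
    print(same_release("LE01", "MS03"))  # False
```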
