ONTAP Hardware

Error during disk scrubbing

saudkhanafridi

I got this message in the log messages:

"Block recommended for re-assignment on disk /aggr0/plex0/rg0/0c.27 shelf 1 bay 11 [NETAPP x 279 -......"

I tried to read up on the scrub process but did not find anything relating to the above message. Is my disk about to fail, or is this something else?

We have a FAS 3140 with FC disks, DS14 mk2 ESH4.

Kindly suggest some reading so I can understand what exactly is going on there.

Kindly guide.

Regards

columbus_admin

This is a simple "bad block" message.  These are pretty common and unless the number reaches a threshold, the disk will not fail.  There are many blocks that can be lost without any disruption to data.  I would not worry unless you see this occurring multiple times on the same disk.

I have had 10 in one scrub, but across multiple disks.  ONTAP is working like it should; I would not be worried.
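To see whether the message keeps coming back for the same disk, a small script can tally it per disk. This is only a rough sketch: the pattern is built from the message fragment quoted in the question, the real log line carries more fields that are ignored here, and the sample lines (including the second disk, 0c.19) are made up for illustration.

```python
import re
from collections import Counter

# Pattern taken from the message fragment quoted in this thread.
# Assumption: the disk path is the first whitespace-free token after "on disk".
PATTERN = re.compile(r"Block recommended for re-assignment on disk (\S+)")


def count_reassign_messages(lines):
    """Return a Counter mapping disk path -> number of matching messages."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines modeled on the quoted fragment, not real log output.
sample = [
    "Block recommended for re-assignment on disk /aggr0/plex0/rg0/0c.27 shelf 1 bay 11",
    "Block recommended for re-assignment on disk /aggr0/plex0/rg0/0c.27 shelf 1 bay 11",
    "Block recommended for re-assignment on disk /aggr0/plex0/rg0/0c.19 shelf 1 bay 3",
    "some unrelated message",
]

# Print disks with the most messages first.
for disk, n in count_reassign_messages(sample).most_common():
    print(disk, n)
```

A disk that dominates this tally over several scrubs is the one to keep an eye on.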

- Scott

Thank you for your reply, Scott. As you said, you would not worry unless it occurs multiple times on the same disk. That is exactly the case here: this error has been reported on 0c.27 more than 5 times.

And what is the threshold?

Looking forward to your reply.

Regards

Honestly, I cannot say... I have seen as many as 10 (if I recall correctly) without a disk failing.

I would guess that the algorithm takes disk geometry into account, so that multiple bad blocks in a localized area would be weighted as more serious than the same number, or maybe even more, spread all over the disk.

There is an option disk.recovery_needed.count that seems to be tied to it, but this option specifically addresses how the filer reacts BEFORE failing the disk.

http://now.netapp.com/NOW/knowledge/docs/ontap/rel727/pdfs/ontap/rnote.pdf

And according to the NOW site:
A disk is put into a recovery-needed state to complete internal sector reassignments.

So I would take that to mean that 5 bad blocks are required for the disk to reassign sectors to avoid those areas, but that is only a guess.  The best option might be to open a case with NetApp via the NOW site as a P4, low priority question, and see if they can give more exact information.

- Scott

Anyway, let me read what you referred to, and thank you very much for your time, sir.

Any more good reading on all of the above, if you can point me to it, would help even more.

Regards

Here is a 7.1 doc; it is the newest I can find.

Disk media error thresholds that trigger a disk failure request include:

  • More than 25 media errors (that are not related to disk scrub activity) occurring on a disk within a 10-minute period
  • Three or more media errors occurring on the same sector of a disk

http://now.netapp.com/NOW/knowledge/docs/ontap/rel713/html/ontap/mgmtsag/4raid22.htm
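To make the two quoted thresholds concrete, here is a minimal sketch of how they could be checked against a list of media errors. It is not how ONTAP implements this; the `MediaError` record, the function name, and treating a sector as a single integer address are all assumptions made for illustration. The doc only excludes scrub-related errors from the first rule, so the second rule counts every error here.

```python
from collections import Counter
from dataclasses import dataclass

WINDOW_SECONDS = 10 * 60      # the 10-minute period from the doc
MAX_ERRORS_IN_WINDOW = 25     # "more than 25" requests a failure
SAME_SECTOR_LIMIT = 3         # "three or more" on one sector requests a failure


@dataclass
class MediaError:             # hypothetical record, not an ONTAP structure
    timestamp: float          # seconds since some fixed reference point
    sector: int               # assumption: one integer identifies the sector
    during_scrub: bool        # True if the error was found by disk scrub


def should_request_failure(errors):
    """Return True if either quoted threshold is reached."""
    # Rule 2: three or more media errors on the same sector.
    per_sector = Counter(e.sector for e in errors)
    if any(n >= SAME_SECTOR_LIMIT for n in per_sector.values()):
        return True

    # Rule 1: more than 25 non-scrub errors inside any 10-minute window,
    # checked with a sliding window over the sorted timestamps.
    times = sorted(e.timestamp for e in errors if not e.during_scrub)
    start = 0
    for end, t in enumerate(times):
        while t - times[start] > WINDOW_SECONDS:
            start += 1
        if end - start + 1 > MAX_ERRORS_IN_WINDOW:
            return True
    return False


# Five scrub-time errors on different sectors reach neither threshold.
scrub_hits = [MediaError(i * 60.0, 1000 + i, True) for i in range(5)]
print(should_request_failure(scrub_hits))
```

Under these assumptions, a handful of scrub-time bad blocks on different sectors does not reach either threshold, which fits what was said earlier in the thread.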

- Scott

To_Bi

Hi

 

Does "sector" mean only the sector address, or sector + head + cyl?

 

For example:

 

Disk 0a.17 grown defect list

Defect   (cyl    head   sector)

  243        12319043843
  359        025319043843
  360        11819043843
  364        11119043843
  368        11019043843
  402        224919043843
  410        025219043843
  473        125219043843

 

Should this disk fail or not?

 

Regards

 

To_Bi

Thank you, sir, it really helped.

Regards
