ONTAP Hardware

Error during disk scrubbing

saudkhanafridi

I got this message in the log messages:

"Block recommended for re-assignment on disk /aggr0/plex0/rg0/0c.27 shelf 1 bay 11 [NETAPP x 279 -......

I tried to read up on the scrub process but did not find anything relating to the above message. Is my disk about to fail, or is it something else?

We have a FAS 3140 with FC disks (DS14 mk2, ESH4).

Kindly suggest some reading so I can understand what exactly is going on there.

Kindly guide.

Regards


columbus_admin

This is a simple "bad block" message.  These are pretty common and unless the number reaches a threshold, the disk will not fail.  There are many blocks that can be lost without any disruption to data.  I would not worry unless you see this occurring multiple times on the same disk.

I have had 10 in one scrub, but across multiple disks. ONTAP is working as it should, so I would not be worried.
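If you want to check whether the message keeps coming back for the same disk, a small script can tally it per disk. This is only a rough sketch: the regular expression is built from the message fragment quoted in this thread, the real log line may carry more fields, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Pattern based only on the message fragment quoted in this thread.
# Assumption: the disk path, shelf and bay appear in this order; the
# actual log line may contain extra fields before or after them.
REASSIGN_RE = re.compile(
    r"Block recommended for re-assignment on disk (\S+) shelf (\d+) bay (\d+)"
)


def count_reassign_messages(lines):
    """Return a Counter mapping disk path -> number of matching messages."""
    counts = Counter()
    for line in lines:
        match = REASSIGN_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def disks_with_repeats(counts, min_count=2):
    """List (disk, count) pairs seen at least min_count times, worst first."""
    return [(disk, n) for disk, n in counts.most_common() if n >= min_count]
```

Feed `count_reassign_messages` the lines of a saved copy of the messages log, then pass the result to `disks_with_repeats` to see which disks show up more than once.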

- Scott

saudkhanafridi

Thank you for your reply, Scott. As you said, I should not worry unless it occurs multiple times on the same disk. That is exactly the case here: this error has been reported on 0c.27 more than 5 times.

And what is the threshold?

Looking forward to your reply.

Regards

columbus_admin

Honestly, I cannot say... I have seen as many as 10 (if I recall correctly) without a disk failing.

I would guess that the algorithm takes disk geometry into account, so that multiple bad blocks in a localized area would be weighted as more serious than the same number, or maybe even more, spread all over the disk.

There is an option disk.recovery_needed.count that seems to be tied to it, but this option specifically addresses how the filer reacts BEFORE failing the disk.

http://now.netapp.com/NOW/knowledge/docs/ontap/rel727/pdfs/ontap/rnote.pdf

And according to the NOW site:
A disk is put into a recovery-needed state to complete internal sector reassignments.

So I would take that to mean that 5 bad blocks are required for the disk to reassign sectors to avoid those areas, but that is only a guess.  The best option might be to open a case with NetApp via the NOW site as a P4, low priority question, and see if they can give more exact information.

- Scott

saudkhanafridi

Anyway, let me read what you referred to, and thank you very much for your time, sir.

If you can point me to any more good reading on the above, that would help further.

Regards

columbus_admin

Here is a 7.1 doc; it is the newest I can find.

Disk media error thresholds that trigger a disk failure request include

  • More than 25 media errors (that are not related to disk scrub activity) occurring on a disk within a 10-minute period
  • Three or more media errors occurring on the same sector of a disk

http://now.netapp.com/NOW/knowledge/docs/ontap/rel713/html/ontap/mgmtsag/4raid22.htm
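To make the two thresholds above concrete, here is a minimal Python sketch of how they could be checked against a list of media error events. It is not how ONTAP implements it. The event format `(timestamp_seconds, sector, from_scrub)` and all function names are assumptions for illustration, and so is the choice to count scrub-related errors toward the same-sector rule, since the doc only excludes them from the 25-error rule.

```python
from collections import Counter

WINDOW_SECONDS = 10 * 60    # "within a 10-minute period"
MAX_ERRORS_IN_WINDOW = 25   # "more than 25" triggers a failure request
SAME_SECTOR_LIMIT = 3       # "three or more" on one sector triggers it


def exceeds_rate_threshold(events):
    """True if more than 25 non-scrub errors fall within any 10-minute span.

    events: list of (timestamp_seconds, sector, from_scrub) tuples
    (hypothetical format, not an ONTAP data structure).
    """
    times = sorted(t for t, _sector, from_scrub in events if not from_scrub)
    start = 0
    for end, t in enumerate(times):
        # Slide the window start forward until it spans at most 10 minutes.
        while t - times[start] > WINDOW_SECONDS:
            start += 1
        if end - start + 1 > MAX_ERRORS_IN_WINDOW:
            return True
    return False


def exceeds_same_sector_threshold(events):
    """True if any single sector has three or more errors.

    Assumption: scrub-related errors count here as well.
    """
    per_sector = Counter(sector for _t, sector, _from_scrub in events)
    return any(n >= SAME_SECTOR_LIMIT for n in per_sector.values())


def should_request_failure(events):
    """Combine both quoted thresholds; either one is enough."""
    return exceeds_rate_threshold(events) or exceeds_same_sector_threshold(events)
```

Under these assumptions, five scrub-reported errors on five different sectors would not trip either rule, while three errors on one sector would.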

- Scott

saudkhanafridi

Thank you, sir, it really helped.

Regards

To_Bi

Hi

Does "sector" mean only the sector address, or sector + head + cylinder?

For example:

Disk 0a.17 grown defect list

Defect   (cyl    head   sector)

  243        12319043843
  359        025319043843
  360        11819043843
  364        11119043843
  368        11019043843
  402        224919043843
  410        025219043843
  473        125219043843

 

Should this disk fail, or not?

Regards

To_Bi
