ONTAP Hardware

Bad Disk Label

albertocerpa
98,232 Views

Hi,

I have a FAS250 filer with 14 Seagate 144 GB disks (Drive Part Number: X274_SCHT6146F10). Recently two of the disks failed, so I replaced them with Hitachi 144 GB disks (Drive Part Number: X274_HPYTA146F10), which have the same specs as the X274_SCHT6146F10. But when the new disks were inserted, I got the following error messages:

Mon Nov 29 17:37:47 PST [ses.channel.rescanInitiated:info]: Initiating rescan on channel 0b.
Mon Nov 29 17:38:05 PST [raid.disk.inserted:info]: Disk 0b.18 Shelf 1 Bay 2 [NETAPP   X274_HPYTA146F10 NA03] S/N [V5Y692RA] has been inserted into the system
Mon Nov 29 17:38:05 PST [sfu.firmwareUpToDate:info]: Firmware is up-to-date on all disk shelves.
Mon Nov 29 17:38:05 PST [raid.assim.disk.nolabels:error]: Disk 0b.18 Shelf 1 Bay 2 [NETAPP   X274_HPYTA146F10 NA03] S/N [V5Y692RA] has no valid labels. It will be taken out of service to prevent possible data loss.
Mon Nov 29 17:38:05 PST [raid.config.disk.bad.label:error]: Disk 0b.18 Shelf 1 Bay 2 [NETAPP   X274_HPYTA146F10 NA03] S/N [V5Y692RA] has bad label.
Mon Nov 29 17:38:05 PST [disk.fw.downrevWarning:warning]: 2 disks have downrev firmware that you need to update.
Mon Nov 29 17:38:08 PST [asup.post.sent:notice]: System Notification message posted to NetApp: System Notification from fs1 (DISK BAD LABEL) ERROR

I checked the volume status with the command "vol status -r", which outputs the following:

fs1> vol status -r
Volume vol0 (online, raid4) (block checksums)
  Plex /vol0/plex0 (online, normal, active)
    RAID group /vol0/plex0/rg0 (normal)

      RAID Disk    Device    HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      ---------    ------    ------------- ---- ---- ---- ----- --------------    --------------
      parity      0b.28    0b    1   12  FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.17    0b    1   1   FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.22    0b    1   6   FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.19    0b    1   3   FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.20    0b    1   4   FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.21    0b    1   5   FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.29    0b    1   13  FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.23    0b    1   7   FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.24    0b    1   8   FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.25    0b    1   9   FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.26    0b    1   10  FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.27    0b    1   11  FC:B   -  FCAL 10000 136000/278528000  137104/280790184


Spare disks (empty)

Broken disks

RAID Disk    Device    HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------    ------    ------------- ---- ---- ---- ----- --------------    --------------
bad label    0b.16    0b    1   0   FC:B   -  FCAL 10000 136000/278528000  137422/281442144
bad label    0b.18    0b    1   2   FC:B   -  FCAL 10000 136000/278528000  137422/281442144

I tried to update the disk firmware with "disk_fw_update", but it gives this error message:

fs1> disk_fw_update 0b.16

    ***** 2 disks have been identified as having an incorrect
    ***** firmware revision level.
    ***** Please consult the man pages for disk_fw_update
    ***** to upgrade the firmware on these disks.

Disk Firmware: No disks in primary FC loop eligible for download

I read the man page as well as the online manual, but I can't find how to resolve the bad disk label error. Is there any way to fix the problem and use the new disks as spares? Or are these disks perhaps completely incompatible?

Thanks for any help you may provide.

Best regards,

-Al
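For anyone who wants to pull the affected disks out of a saved console log, here is a minimal Python sketch. It assumes the message layout shown in the post above (event name and severity in square brackets, followed by a "Disk ... Shelf ... Bay ..." description); the function name and regular expressions are illustrative, not part of any NetApp tool, and other ONTAP releases may format these lines differently.

```python
import re

# [event.name:severity]: message  (layout assumed from the log excerpt above)
EVENT_RE = re.compile(r"\[(?P<event>[\w.]+):(?P<severity>\w+)\]: (?P<message>.*)")
# Disk <device> Shelf <n> Bay <n> [<vendor> <model> <firmware>] S/N [<serial>]
DISK_RE = re.compile(
    r"Disk (?P<device>\S+) Shelf (?P<shelf>\d+) Bay (?P<bay>\d+) "
    r"\[(?P<vendor>\S+)\s+(?P<model>\S+)\s+(?P<firmware>\S+)\] "
    r"S/N \[(?P<serial>[^\]]+)\]"
)

def parse_disk_events(log_text):
    """Return one dict per log line that carries an event and names a disk."""
    events = []
    for line in log_text.splitlines():
        event = EVENT_RE.search(line)
        if not event:
            continue
        disk = DISK_RE.search(event.group("message"))
        if not disk:
            continue  # e.g. the shelf firmware message, which names no disk
        record = {"event": event.group("event"), "severity": event.group("severity")}
        record.update(disk.groupdict())
        events.append(record)
    return events

# Lines copied from the post above.
SAMPLE_LOG = """\
Mon Nov 29 17:38:05 PST [raid.disk.inserted:info]: Disk 0b.18 Shelf 1 Bay 2 [NETAPP   X274_HPYTA146F10 NA03] S/N [V5Y692RA] has been inserted into the system
Mon Nov 29 17:38:05 PST [sfu.firmwareUpToDate:info]: Firmware is up-to-date on all disk shelves.
Mon Nov 29 17:38:05 PST [raid.assim.disk.nolabels:error]: Disk 0b.18 Shelf 1 Bay 2 [NETAPP   X274_HPYTA146F10 NA03] S/N [V5Y692RA] has no valid labels. It will be taken out of service to prevent possible data loss.
Mon Nov 29 17:38:05 PST [raid.config.disk.bad.label:error]: Disk 0b.18 Shelf 1 Bay 2 [NETAPP   X274_HPYTA146F10 NA03] S/N [V5Y692RA] has bad label.
"""

if __name__ == "__main__":
    for item in parse_disk_events(SAMPLE_LOG):
        if item["event"] == "raid.config.disk.bad.label":
            print(item["device"], item["serial"], item["firmware"])
```

Filtering on the raid.config.disk.bad.label event gives the device name and serial number to check against "vol status -r".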

1 ACCEPTED SOLUTION

scottgelb
97,932 Views

Are these NetApp OEM drives, or did you source them elsewhere? "priv set advanced ; disk unfail -s 0b.18" should resolve the bad label. Then run "disk zero spares" to zero the drive so it can be used immediately if needed. If you add an unzeroed disk to an aggregate, ONTAP will zero it for you, but I prefer to pre-zero a disk if it is not zeroed already.
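With more than one affected disk, the same commands have to be repeated per device. The following Python sketch is one way to build that list from saved "vol status -r" output. It only assembles command strings for review and does not talk to the filer; the function names are hypothetical, and the row layout is assumed to match the output posted in the question.

```python
def bad_label_devices(vol_status_output):
    """Return device names from rows whose RAID Disk column reads 'bad label'."""
    devices = []
    for line in vol_status_output.splitlines():
        fields = line.split()
        # Assumed row layout: bad label  0b.16  0b  1  0  FC:B  -  FCAL ...
        if len(fields) > 2 and fields[:2] == ["bad", "label"]:
            devices.append(fields[2])
    return devices

def unfail_commands(devices):
    """Build the commands quoted in the reply above, one unfail per device."""
    if not devices:
        return []
    commands = ["priv set advanced"]
    commands += [f"disk unfail -s {device}" for device in devices]
    commands.append("disk zero spares")
    return commands

# "Broken disks" section copied from the question.
SAMPLE_VOL_STATUS = """\
Broken disks

RAID Disk    Device    HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------    ------    ------------- ---- ---- ---- ----- --------------    --------------
bad label    0b.16    0b    1   0   FC:B   -  FCAL 10000 136000/278528000  137422/281442144
bad label    0b.18    0b    1   2   FC:B   -  FCAL 10000 136000/278528000  137422/281442144
"""

if __name__ == "__main__":
    for command in unfail_commands(bad_label_devices(SAMPLE_VOL_STATUS)):
        print(command)
```

Review the generated list before typing anything at the advanced privilege level, since "disk unfail -s" acts on whichever device name it is given.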


14 REPLIES

albertocerpa
97,834 Views

Thanks very much for your help. Your suggestion works and this is the output of "vol status -r":

fs1> vol status -r
Volume vol0 (online, raid4) (block checksums)
  Plex /vol0/plex0 (online, normal, active)
    RAID group /vol0/plex0/rg0 (normal)

      RAID Disk    Device    HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      ---------    ------    ------------- ---- ---- ---- ----- --------------    --------------
      parity      0b.28    0b    1   12  FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.17    0b    1   1   FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.22    0b    1   6   FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.19    0b    1   3   FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.20    0b    1   4   FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.21    0b    1   5   FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.29    0b    1   13  FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.23    0b    1   7   FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.24    0b    1   8   FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.25    0b    1   9   FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.26    0b    1   10  FC:B   -  FCAL 10000 136000/278528000  137104/280790184
      data        0b.27    0b    1   11  FC:B   -  FCAL 10000 136000/278528000  137104/280790184


Spare disks

RAID Disk    Device    HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------    ------    ------------- ---- ---- ---- ----- --------------    --------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare       0b.16    0b    1   0   FC:B   -  FCAL 10000 136000/278528000  137422/281442144
spare       0b.18    0b    1   2   FC:B   -  FCAL 10000 136000/278528000  137422/281442144
fs1>

We got these disks from eBay after two disks failed in our 10-year-old filer. According to the description, they came from a decommissioned FAS system... Hopefully they will last for a while.

Thanks,

-Al

L5mgmt
56,943 Views

This worked for us too. We had to purchase one disk for an older unit, and the "disk unfail -s" command placed it into the spare pool. We then ran the zeroing command, and it is at 4% now, so we assume it is going to work.



aborzenkov
97,831 Views

Were these disks previously used in a NetApp system running a higher Data ONTAP version?

albertocerpa
97,830 Views

We got these disks from eBay, so it is not clear which Data ONTAP version they were used with before. We updated Data ONTAP to 7.3.3, the latest GD version for our filer, right before the disk replacement.

nmilosevic
97,830 Views

I know this is an old thread, but the following worked for me on ONTAP 7.3.x. I saw it in another thread once and am putting it here as a solution that worked for me too:

disk assign <diskid>

priv set diag

labelmaint isolate <diskid>

label wipe <diskid>

label wipev1 <diskid>

label makespare <diskid>

labelmaint unisolate

priv set

This lets you reset the disk label and get rid of the error. You can still zero the disk afterwards.
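To avoid mistyping the disk ID across that many steps, the sequence above can be generated for one disk at a time. This is a minimal Python sketch that only produces the command strings listed in the post, in the same order; the function name is hypothetical, and nothing here checks that the commands are accepted by a given ONTAP release.

```python
def label_reset_commands(disk_id):
    """Return the label maintenance sequence from the post above for one disk."""
    if not disk_id or any(ch.isspace() for ch in disk_id):
        raise ValueError("disk_id must be a single token such as '0b.18'")
    return [
        f"disk assign {disk_id}",
        "priv set diag",
        f"labelmaint isolate {disk_id}",
        f"label wipe {disk_id}",
        f"label wipev1 {disk_id}",
        f"label makespare {disk_id}",
        "labelmaint unisolate",  # always leave isolation before dropping privileges
        "priv set",              # return to the default privilege level
    ]

if __name__ == "__main__":
    for command in label_reset_commands("0b.18"):
        print(command)
```

Keeping "labelmaint unisolate" and "priv set" inside the generated list makes it harder to leave the filer isolated or at diag level by accident.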

ThiagoBeier
51,294 Views

Hi guys, I followed all the procedures here, and this is what I see after buying a new disk and installing it (the system has no warranty, so we bought a new disk after one failed):

 

sysconfig -a

 

00.8 : NETAPP X298_WVULC01TSSS NA00 847.5GB 512B/sect (WD-WCAW32695363)
00.9 : NETAPP X298_WVULC01TSSS NA00 847.5GB 512B/sect (WD-WCAW32695170)
00.10: WDC WD1005FBYZ-01YSS RR07 847.5GB 512B/sect (WD-WMC6M0J4FK0J)
00.11: NETAPP X298_WVULC01TSSS NA00 847.5GB 512B/sect (WD-WCAW32621026)

 

Tue Mar 13 15:52:09 EDT [NETAPP-T: asup.smtp.host:info]: AutoSupport cannot connect to host mx1.domain.com (specified host not found) for message: HA Group Notification from NETAPP-T (DISK BAD LABEL) ERROR
Tue Mar 13 15:52:09 EDT [NETAPP-T: asup.smtp.retry:info]: AutoSupport mail (HA Group Notification from NETAPP-T (DISK BAD LABEL) ERROR) was not sent for host (0). The system will retry later to send the message

 

NETAPP-B> disk show -v
DISK OWNER POOL SERIAL NUMBER HOME
------------ ------------- ----- ------------- -------------
0c.00.8 -NETAPP-T(142246268) Pool0 WD-WCAW32695363 NOW-NETAPP-T(142246268)
0c.00.11 -NETAPP-T(142246268) Pool0 WD-WCAW32621026 NOW-NETAPP-T(142246268)
0c.00.2 -NETAPP-T(142246268) Pool0 WD-WCAW32695288 NOW-NETAPP-T(142246268)
0c.00.6 NETAPP-T(142246268) Pool0 WD-WCAW32699686 NOW-NETAPP-T(142246268)
0c.00.1 NETAPP-T(142246268) Pool0 WD-WCAW32699000 NOW-NETAPP-T(142246268)
0c.00.0 NETAPP-T(142246268) Pool0 WD-WCAW32513203 NOW-NETAPP-T(142246268)
0c.00.3 NETAPP-B(142246502) Pool0 WD-WCAW32621097 NOW-NETAPP-B(142246502)
0c.00.4 NETAPP-T(142246268) Pool0 WD-WCAW32714093 NOW-NETAPP-T(142246268)
0c.00.5 NETAPP-B(142246502) Pool0 WD-WCAW32700301 NOW-NETAPP-B(142246502)
0c.00.7 NETAPP-B(142246502) Pool0 WD-WCAW32695704 NOW-NETAPP-B(142246502)
0c.00.9 NETAPP-T(142246268) Pool0 WD-WCAW32695170 NOW-NETAPP-T(142246268)
0c.00.10 NETAPP-T(142246268) Pool0 WD-WMC6M0J4FK0J NOW-NETAPP-T(142246268)

SEIMTEXTER
51,291 Views

Looks to me like it doesn't have valid NetApp firmware. NetApp revisions, as far as I know anyway, all start with "NA", whereas yours starts with "RR". I would return it to the vendor.

AlexDawson
51,268 Views

Yes - all of our shipped drives should include NAxx firmware. We do use some other variations during testing, but RR is not one of them. I've seen suggestions that it is an HP MSA firmware.
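The check described in the two replies above can be applied to saved "sysconfig -a" output. The Python sketch below assumes the disk line layout from the post (slot, colon, vendor, model, firmware revision) and uses the "NA" prefix as a rule of thumb taken from this thread, not as an official compatibility test; the function name is illustrative.

```python
import re

# <slot> : <vendor> <model> <firmware> ...  (layout assumed from the post above)
DISK_LINE_RE = re.compile(
    r"^\s*(?P<slot>[\w.]+)\s*:\s*(?P<vendor>\S+)\s+(?P<model>\S+)\s+(?P<firmware>\S+)\s"
)

def non_netapp_firmware(sysconfig_text):
    """Return (slot, vendor, firmware) for disks whose revision lacks the NA prefix."""
    suspects = []
    for line in sysconfig_text.splitlines():
        match = DISK_LINE_RE.match(line)
        if match and not match.group("firmware").startswith("NA"):
            suspects.append(
                (match.group("slot"), match.group("vendor"), match.group("firmware"))
            )
    return suspects

# Disk lines copied from the sysconfig -a output above.
SAMPLE_SYSCONFIG = """\
00.8 : NETAPP X298_WVULC01TSSS NA00 847.5GB 512B/sect (WD-WCAW32695363)
00.9 : NETAPP X298_WVULC01TSSS NA00 847.5GB 512B/sect (WD-WCAW32695170)
00.10: WDC WD1005FBYZ-01YSS RR07 847.5GB 512B/sect (WD-WMC6M0J4FK0J)
00.11: NETAPP X298_WVULC01TSSS NA00 847.5GB 512B/sect (WD-WCAW32621026)
"""

if __name__ == "__main__":
    for slot, vendor, firmware in non_netapp_firmware(SAMPLE_SYSCONFIG):
        print(f"{slot}: vendor {vendor}, firmware {firmware}")
```

On the sample lines this flags only slot 00.10, the WDC drive with revision RR07.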

Randy_Lin
46,687 Views

Thanks

ivissupport
97,812 Views

Hi,

I had the same issue with some disks moved from a NearStore VTL.

Press Ctrl-C repeatedly during boot to reach the special boot menu, then select maintenance mode.

You have to re-label the disks that have the bad label. You can do that by running "label makespare" followed by the name of the disk.

The disks should then appear as spares.

Alfs29
95,609 Views

I'm in the same boat now. 😕

I have a FAS2020 running 7.3.7p3 ... "learning to fly" (yes, I know that it is old, slow, doesn't come with all the bells and whistles, etc.)

I got a used DS14MK4 shelf with drives from the ghetto admin shop, a.k.a. eBay.

But when I try to add those disks to my system, I get a bad RAID label error.

Judging from their RAID label (v10), the disks have been used in a v8+ system.

My 7.3.7 supports RAID labels only up to v9.

I have tried, I guess, all the published solutions to wipe the v10 label, including maintenance mode, label makespare, unfail -s, etc.

All I get is " ..... not permitted as the disk has a bad RAID version."

No, a filer with ONTAP 8+ to make them spares and zero them is not available 😞

 

Is there anything else i can try?

 

Thanks

SEIMTEXTER
91,835 Views

I've never had luck with getting the raid label to change upwards.  If a disk has been on a filer running ONTAP 8.X and wasn't decommissioned properly, I've always had to attach it to a filer running 8.X, run label makespare on it if it has a bad label, then zero and remove ownership before it will show up properly in a 7.X filer.
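The decision described in the last two posts comes down to comparing the RAID label version on the disk with the highest version the filer understands. The Python sketch below encodes only that comparison and the steps mentioned in this thread. The version numbers (v10 on the disks, v9 as the limit for 7.3.7) are the poster's figures and not verified here, and the function name is hypothetical.

```python
def relabel_plan(disk_label_version, filer_max_label_version):
    """Return the steps suggested in this thread for a disk with a bad label."""
    if disk_label_version <= filer_max_label_version:
        # The filer can read the label, so the local fixes discussed above apply.
        return [
            "run 'disk unfail -s' or 'label makespare' on this filer",
            "zero the disk",
        ]
    # Label is newer than this filer supports: local relabeling is refused.
    return [
        "attach the disk to a filer running the newer ONTAP release",
        "run 'label makespare' there if the disk shows a bad label",
        "zero the disk",
        "remove ownership",
        "move the disk back to the older filer",
    ]

if __name__ == "__main__":
    # Figures reported above: v10 label on the disks, v9 maximum on 7.3.7.
    for step in relabel_plan(10, 9):
        print("-", step)
```

With the figures from the post, the plan takes the second branch, which matches the experience that a 7.x filer cannot wipe a label written by 8.x.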

nmilosevic
92,119 Views

Do a sysconfig -r

If it shows volume names on the ghetto drives you added (as you call them), then you may need to destroy the RAID volume data (from volumes that were set up on another filer) before deleting the labels.

Use the info from sysconfig -r to identify the volume name that the introduced drives used to be a part of, then run the following in diag mode:

*> vol offline <volumename>

*> vol destroy <volumename>

 

where <volumename> is the name of the volume the introduced drives were once a part of. Then do the label maintenance as above. Hope that helps.
