ONTAP Hardware

NetApp FAS250 filer: what is the procedure for replacing a failed disk?

YHCHEN168

Dear all,

    We have a FAS250 filer with seven 300 GB FC disks. The disk in bay 0 failed, and the system shows the following when we run the "vol status" command:

nas> vol status
         Volume State      Status            Options
           vol0 online     raid_dp, flex     nosnap=on, nosnapdir=on
                           degraded
       root_vol online     raid_dp, flex     root, nosnap=on, nosnapdir=on
                           degraded

   And here is the output of the "storage show disk -x" command:

DISK  SHELF BAY SERIAL       VENDOR   MODEL            REV
----- --------- ------------ -------- ---------------- ----
0b.17   1    1  DH07P870CVJL NETAPP   X276_FAL9E288F10 NA05
0b.18   1    2  DH07P870CW8S NETAPP   X276_FAL9E288F10 NA05
0b.19   1    3  DH07P870CWKM NETAPP   X276_FAL9E288F10 NA05
0b.20   1    4  DH07P870D0MK NETAPP   X276_FAL9E288F10 NA05
0b.21   1    5  DH07P870CYK1 NETAPP   X276_FAL9E288F10 NA05
0b.22   1    6  DH07P870CWHC NETAPP   X276_FAL9E288F10 NA05

   We removed the bay 0 disk by hand because the system would halt if the failed disk remained in its slot. We have a used 300 GB FC disk that was taken out of an older NetApp filer. Can someone explain how to replace the failed disk in bay 0 with this used disk? We tried inserting the used disk into bay 0, but nothing happened. "aggr status" displays:

Aggregate aggr0 (online, raid_dp, degraded) (block checksums)
  Plex /aggr0/plex0 (online, normal, active)
    RAID group /aggr0/plex0/rg0 (degraded)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      --------- ------  ------------- ---- ---- ---- ----- --------------    --------------
      dparity   FAILED          N/A                   272000/557056000
      parity    0b.17   0b    1   1   FC:B   -  FCAL 10000 272000/557056000  280104/573653840
      data      0b.18   0b    1   2   FC:B   -  FCAL 10000 272000/557056000  280104/573653840
      data      0b.19   0b    1   3   FC:B   -  FCAL 10000 272000/557056000  280104/573653840
      data      0b.20   0b    1   4   FC:B   -  FCAL 10000 272000/557056000  280104/573653840
      data      0b.22   0b    1   6   FC:B   -  FCAL 10000 272000/557056000  280104/573653840


Spare disks (empty)
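
The listing above has the layout of "aggr status -r" output, and the two facts that matter for the replacement are the FAILED dparity member and the empty spare list. A minimal sketch in Python (not an ONTAP tool; the function name summarize_raid is hypothetical, and it assumes the console output has been captured as plain text) that pulls out those two facts:

```python
# Rough sketch: summarize captured "aggr status -r" style output.
# Only the first two columns of each RAID row are inspected, so wrapped
# continuation lines are simply ignored.
RAID_ROLES = {"dparity", "parity", "data"}

def summarize_raid(text):
    """Return (failed_roles, present_devices, spares_empty)."""
    failed, present = [], []
    spares_empty = False
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) >= 2 and tokens[0] in RAID_ROLES:
            if tokens[1] == "FAILED":
                failed.append(tokens[0])      # RAID role with no working disk
            else:
                present.append(tokens[1])     # device name such as 0b.17
        elif line.strip().lower().startswith("spare disks") and "(empty)" in line:
            spares_empty = True
    return failed, present, spares_empty

sample = """\
    RAID group /aggr0/plex0/rg0 (degraded)
      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)
      dparity   FAILED          N/A                   272000/557056000
      parity    0b.17   0b    1   1   FC:B   -  FCAL 10000 272000/557056000
      data      0b.18   0b    1   2   FC:B   -  FCAL 10000 272000/557056000
      data      0b.19   0b    1   3   FC:B   -  FCAL 10000 272000/557056000
      data      0b.20   0b    1   4   FC:B   -  FCAL 10000 272000/557056000
      data      0b.22   0b    1   6   FC:B   -  FCAL 10000 272000/557056000

Spare disks (empty)
"""

failed, present, spares_empty = summarize_raid(sample)
print("failed roles:", failed)
print("no spares available:", spares_empty)
```

With a failed member and no spare, the RAID group has nothing to rebuild onto, which matches the "nothing happened" symptom described above.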

-------------------------------------

Any help is appreciated. Thanks.

Y.H.Chen


martin_fisher

Hi - when the replacement drive is inserted into bay 0, it will more than likely still have old metadata on it.

Run disk show -v once the drive is inserted; the system should show the disk as present but not assigned to the system.

Was this disk removed from an HA system?

The disk will need to be assigned to the system and then zeroed before it will appear as a spare. Once it is allocated as a spare, the system should rebuild the aggr.

disk assign 0b.XX

or

disk assign all (if you have no other unassigned disks present in the system).

then

disk zero spares

Martin

YHCHEN168

Dear Martin,

   Thanks for your reply.

   Here is the "sysconfig" command output:

        NetApp Release 7.2.4: Fri Nov 16 00:34:57 PST 2007
        System ID: 008425xxxx (nas)
        System Serial Number: 207xxxx (nas)
        System Rev: B0
        slot 0: System Board
                Processors:         2
                Processor revision: B2
                Processor type:     1250
                Memory Size:        510 MB
        slot 0: FC Host Adapter 0b
                6 Disks:             1632.0GB
                1 shelf with EFH
        slot 0: FC Host Adapter 0c
        slot 0: Dual SB1250-Gigabit Ethernet Controller
                e0a MAC Address:    00:a0:98:05:47:68 (auto-100tx-fd-up)
                e0b MAC Address:    00:a0:98:05:47:69 (auto-100tx-fd-up)
        slot 0: NetApp ATA/IDE Adapter 0a (0x00000000000001f0)
                0a.0                 249MB

   I tried the "disk assign" command, but it looks like there is no "disk assign" option. Maybe this is a limitation of version 7.2.4?

   Yes, this disk was removed from another NAS filer, so it should have old metadata on it. Any thoughts?

   Thanks!

  Y.H.Chen

martin_fisher

Have you tried the disk commands with the CLI in priv set advanced? I can't remember which commands are present in such an old version of ONTAP.

aborzenkov

Which version of Data ONTAP was running on the filer from which this disk was taken?

YHCHEN168

Dear aborzenkov,

    We have a broken NetApp filer and no plan to fix it, so we keep its disk drives as spares for the FAS250 filer. Because the broken filer has not powered on for a long time, we don't know exactly which version of Data ONTAP it ran. Does it matter for the FAS250 filer? Do we need to wipe the disk drive before we add it to the FAS250 filer?

    We put the disk back into the FAS250 filer. "storage show disk -x" now shows:

DISK  SHELF BAY SERIAL       VENDOR   MODEL            REV
----- --------- ------------ -------- ---------------- ----
0b.16   1    0  DH07P870CVJL NETAPP   X276_FAL9E288F10 NA04
0b.17   1    1  DH07P870CVJL NETAPP   X276_FAL9E288F10 NA05
0b.18   1    2  DH07P870CW8S NETAPP   X276_FAL9E288F10 NA05
0b.19   1    3  DH07P870CWKM NETAPP   X276_FAL9E288F10 NA05
0b.20   1    4  DH07P870D0MK NETAPP   X276_FAL9E288F10 NA05
0b.21   1    5  DH07P870CYK1 NETAPP   X276_FAL9E288F10 NA05
0b.22   1    6  DH07P870CWHC NETAPP   X276_FAL9E288F10 NA05

As you can see, we put the used disk drive in bay 0, but its firmware version is "NA04", which appears to be older than that of the online disk drives.
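
Spotting an odd firmware revision by eye works for seven disks, but the same check can be scripted. The following is a minimal sketch in Python (not an ONTAP tool; parse_disks and odd_firmware are hypothetical names) that assumes the "storage show disk -x" output has been captured as text with the seven columns shown in this post, and that "the expected revision" simply means the most common REV among disks of the same model:

```python
from collections import Counter, defaultdict

def parse_disks(text):
    """Parse data rows of captured "storage show disk -x" output."""
    disks = []
    for line in text.splitlines():
        tokens = line.split()
        # Data rows have 7 columns, a disk name like "0b.16", and a numeric shelf.
        if len(tokens) == 7 and "." in tokens[0] and tokens[1].isdigit():
            name, shelf, bay, serial, vendor, model, rev = tokens
            disks.append({"disk": name, "bay": int(bay), "model": model, "rev": rev})
    return disks

def odd_firmware(disks):
    """Return (disk, bay, rev, common_rev) for disks whose REV differs
    from the most common REV among disks of the same model."""
    revs_by_model = defaultdict(Counter)
    for d in disks:
        revs_by_model[d["model"]][d["rev"]] += 1
    flagged = []
    for d in disks:
        common_rev = revs_by_model[d["model"]].most_common(1)[0][0]
        if d["rev"] != common_rev:
            flagged.append((d["disk"], d["bay"], d["rev"], common_rev))
    return flagged

sample = """\
DISK  SHELF BAY SERIAL       VENDOR   MODEL            REV
----- --------- ------------ -------- ---------------- ----
0b.16   1    0  DH07P870CVJL NETAPP   X276_FAL9E288F10 NA04
0b.17   1    1  DH07P870CVJL NETAPP   X276_FAL9E288F10 NA05
0b.18   1    2  DH07P870CW8S NETAPP   X276_FAL9E288F10 NA05
0b.19   1    3  DH07P870CWKM NETAPP   X276_FAL9E288F10 NA05
0b.20   1    4  DH07P870D0MK NETAPP   X276_FAL9E288F10 NA05
0b.21   1    5  DH07P870CYK1 NETAPP   X276_FAL9E288F10 NA05
0b.22   1    6  DH07P870CWHC NETAPP   X276_FAL9E288F10 NA05
"""

disks = parse_disks(sample)
flagged = odd_firmware(disks)
for disk, bay, rev, common_rev in flagged:
    print(f"{disk} (bay {bay}) has REV {rev}; most disks of this model have {common_rev}")
```

On the table from this post it flags only 0b.16 in bay 0. Whether a differing revision is actually a problem is a separate question that the script does not answer.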

Another problem: when we run the "aggr status" command, it shows the following:

nas> aggr status
           Aggr State      Status            Options
          aggr0 online     raid_dp, aggr     root
                           degraded
       aggr0(1) failed     raid_dp, aggr     diskroot, raidsize=28,
                           foreign           lost_write_protect=off
                           partial

   Very strange: the system created aggr0(1) automatically. What happened? Is this normal?

Y.H.Chen

aborzenkov

Your replacement disk was part of an aggregate named aggr0 on its old filer. The filer has now imported this aggregate and renamed it to aggr0(1) to avoid a name clash. It is in the failed state because the other disks from that aggregate are missing. You need to destroy this foreign aggregate to turn the new disk into a spare.
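
To make the diagnosis above repeatable, here is a minimal sketch in Python (not an ONTAP tool; parse_aggr_status is a hypothetical name) that reads captured "aggr status" text and lists aggregates carrying the "foreign" status. It assumes each aggregate entry starts with a name followed by a state word, that the set of state words in AGGR_STATES is complete enough for this output, and that the indented lines that follow belong to the preceding aggregate. Status and option words are lumped together into one set of flags.

```python
# Assumed set of state words; extend it if your output shows others.
AGGR_STATES = {"online", "offline", "restricted", "failed"}

def parse_aggr_status(text):
    """Group captured "aggr status" lines into one record per aggregate."""
    aggrs = []
    for line in text.splitlines():
        tokens = [t.rstrip(",") for t in line.split()]
        if not tokens:
            continue
        if len(tokens) >= 2 and tokens[1] in AGGR_STATES:
            # First line of an entry: name, state, then status/option words.
            aggrs.append({"name": tokens[0], "state": tokens[1],
                          "flags": set(tokens[2:])})
        elif aggrs:
            # Continuation line: more status/option words for the last entry.
            aggrs[-1]["flags"].update(tokens)
    return aggrs

sample = """\
nas> aggr status
           Aggr State      Status            Options
          aggr0 online     raid_dp, aggr     root
                           degraded
       aggr0(1) failed     raid_dp, aggr     diskroot, raidsize=28,
                           foreign           lost_write_protect=off
                           partial
"""

aggrs = parse_aggr_status(sample)
foreign = [a["name"] for a in aggrs if "foreign" in a["flags"]]
print("foreign aggregates:", foreign)
```

For the output posted in this thread the only foreign aggregate found is aggr0(1), the one that needs to be destroyed; the live aggr0 is reported as online and degraded.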

martin_fisher

Yep - as I thought. The inserted disk has old metadata from the old filer. When you inserted the disk, ONTAP detected this, as aggr status shows and as aborzenkov stated. The foreign aggr needs destroying, and the disk in bay 0 then needs zeroing so that it becomes a spare disk.

Once this is completed, your correct aggr (aggr0) should rebuild.

YHCHEN168

Dear aborzenkov, Martin,

    Thanks for your help!

   We will try destroying the foreign aggr and zeroing the disk drive ASAP.

   One more thing: we noticed that the bay 0 disk drive's firmware version is NA04, which looks older than that of the other disk drives. Does it matter? We looked in the "/etc/disk_fw" directory and could not find NA05 firmware. Do we need to upgrade the replacement disk to NA05 for compatibility reasons? If so, where can I download disk drive firmware? I have searched the NetApp web site but could not find it. Maybe we do not have the rights to download it?

Y.H.Chen

YHCHEN168

Dear all,

    We followed your suggestions, and the filer started rebuilding onto the replacement disk. The rebuild finished successfully. Now everything looks fine, except that the replacement disk drive's firmware is older than the others'. We will keep looking for the disk firmware so we can update it later. Thanks, all. Your help is very much appreciated!

Y.H.Chen

martin_fisher

Hi - you should be able to see the disk firmware files currently available on your system by setting the CLI to advanced and running the following command:

ls /etc/disk_fw

The disk firmware files available on your system are kept in this folder.

You should be able to manually update the disk firmware on the affected disk, using the command:

disk_fw_update

If you run a sysconfig -v this will show you the current disk FW and the disk ID.

Updating the disk firmware is disruptive, so ideally you want to do this during a quiet period.
