Aggregate aggr0 (failed, raid_dp, partial) (block checksums)

eliebeskint · ‎2019-09-12

I'm new to NetApp and I found the error below. How can we fix it?

Your help is really appreciated.

Plex /aggr0/plex0 (offline, failed, inactive)

RAID group /aggr0/plex0/rg0 (partial)

RAID Disk Device  HA  SHELF BAY CHAN Pool Type RPM   Used (MB/blks)    Phys (MB/blks)
--------- ------  ------------- ---- ---- ---- ----- --------------    --------------
dparity   0a.16   0a  1     0   FC:A -    FCAL 15000 136000/278528000  137104/280790184
parity    0a.45   0a  2     13  FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      0a.32   0a  2     0   FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      0a.33   0a  2     1   FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      FAILED  N/A                                136000/278528000
data      0a.35   0a  2     3   FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      0a.29   0a  1     13  FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      0a.20   0a  1     4   FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      0a.36   0a  2     4   FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      FAILED  N/A                                136000/278528000
data      0a.37   0a  2     5   FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      0a.22   0a  1     6   FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      0a.38   0a  2     6   FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      0a.23   0a  1     7   FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      0a.39   0a  2     7   FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      0a.24   0a  1     8   FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      0a.40   0a  2     8   FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      0a.25   0a  1     9   FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      0a.41   0a  2     9   FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      0a.26   0a  1     10  FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      0a.17   0a  1     1   FC:A -    FCAL 10000 136000/278528000  280104/573653840
data      0a.27   0a  1     11  FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      0a.43   0a  2     11  FC:A -    FCAL 15000 136000/278528000  137104/280790184
data      FAILED  N/A                                136000/278528000
data      0a.44   0a  2     12  FC:A -    FCAL 15000 136000/278528000  137104/280790184

Raid group is missing 4 disks.

Spare disks

RAID Disk Device  HA  SHELF BAY CHAN Pool Type RPM   Used (MB/blks)    Phys (MB/blks)
--------- ------  ------------- ---- ---- ---- ----- --------------    --------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare     0a.18   0a  1     2   FC:A -    FCAL 15000 136000/278528000  137104/280790184 (not zeroed)
spare     0a.19   0a  1     3   FC:A -    FCAL 15000 136000/278528000  137104/280790184 (not zeroed)
spare     0a.21   0a  1     5   FC:A -    FCAL 15000 136000/278528000  137104/280790184 (not zeroed)
spare     0a.34   0a  2     2   FC:A -    FCAL 15000 136000/278528000  137104/280790184 (not zeroed)
spare     0a.42   0a  2     10  FC:A -    FCAL 15000 136000/278528000  137104/280790184 (not zeroed)
spare     0a.28   0a  1     12  FC:A -    FCAL 15000 136000/278528000  137422/281442144 (not zeroed)

aborzenkov · ‎2019-09-12

@eliebeskint wrote:

how can we fix it ?

If the disks have really failed, you cannot (without data loss). RAID-DP tolerates at most two failed disks per RAID group, so a RAID group with 4 failed disks cannot be recovered; the only option is to recreate the aggregate and restore the data from backup.

To determine whether the disks have really failed, you should open a support case so that your system can be analyzed.
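The reasoning above can be sketched in a few lines of Python. This is only an illustration, not a NetApp tool: the helper name `assess_raid_group` and the sample text are hypothetical, and the sketch assumes the status output has been saved as plain text in the layout shown in the first post. The two-disk limit of RAID-DP is coded as a constant.

```python
import re

# RAID-DP keeps two parity disks per RAID group, so it can survive
# at most two simultaneous disk failures in that group.
RAID_DP_MAX_FAILURES = 2


def assess_raid_group(status_text: str) -> dict:
    """Summarize missing disks in a RAID group (hypothetical helper)."""
    # Count rows whose device column reads FAILED.
    failed_rows = sum(
        1
        for line in status_text.splitlines()
        if re.match(r"\s*(data|parity|dparity)\s+FAILED\b", line)
    )
    # Prefer the summary line printed by the system, if it is present.
    m = re.search(r"Raid group is missing (\d+) disks?", status_text)
    missing = int(m.group(1)) if m else failed_rows
    return {
        "failed_rows": failed_rows,
        "missing": missing,
        "recoverable_by_raid": missing <= RAID_DP_MAX_FAILURES,
    }


# Shortened sample in the layout of the first post (illustrative only).
sample = """\
data 0a.33 0a 2 1 FC:A - FCAL 15000 136000/278528000 137104/280790184
data FAILED N/A 136000/278528000
data FAILED N/A 136000/278528000
Raid group is missing 4 disks.
"""

print(assess_raid_group(sample))
```

With 4 missing disks, `recoverable_by_raid` comes out `False`. Whether the disks have genuinely failed is still something only a support analysis can confirm.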

eliebeskint · ‎2019-09-12

Thanks for your reply. How can I check whether I have a backup? Is there a command I can run?

paul_stejskal · ‎2019-09-18

No, there is no command for that; it would mean a backup of your data.

You might be able to open a case with us, and we can try to help you with the system.

junwang · ‎2024-11-13

Hi eliebeskint,

More than 2 disks have failed in the aggregate, which has damaged the RAID group. There are currently 6 spare disks, but they are not zeroed. In this state, the spare disks will not be used. The solution is to zero the spare disks.

::> disk zerospares

After zeroing completes, the system will automatically reconstruct the data of the failed disks onto the spare disks. Once reconstruction onto the spares has finished, normal operation can be restored. Then replace the failed disks.
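The spare state this reply refers to can be read from the spare list in the first post. A minimal sketch, assuming that list has been saved as plain text; the helper name `count_spares` is hypothetical, and the code only counts rows. It does not zero anything or talk to the storage system.

```python
def count_spares(status_text: str) -> tuple:
    """Return (total spares, spares still marked '(not zeroed)')."""
    # Spare rows start with the lowercase word "spare"; section headings
    # such as "Spare disks" are capitalized and therefore skipped.
    spare_rows = [
        line
        for line in status_text.splitlines()
        if line.split()[:1] == ["spare"]
    ]
    not_zeroed = sum(1 for line in spare_rows if "(not zeroed)" in line)
    return len(spare_rows), not_zeroed


# Two rows copied from the spare list in the first post.
sample = """\
Spare disks
spare 0a.18 0a 1 2 FC:A - FCAL 15000 136000/278528000 137104/280790184 (not zeroed)
spare 0a.19 0a 1 3 FC:A - FCAL 15000 136000/278528000 137104/280790184 (not zeroed)
"""

total, not_zeroed = count_spares(sample)
print(f"{not_zeroed} of {total} spares are not zeroed")
```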