Aggregate has failed and cannot be brought online. Raid group is missing 1 disk.
2014-11-17 10:21 AM
Greetings, everyone. Any assistance, insight, or advice on bringing a failed aggregate online would be greatly appreciated. We recently had multiple disk failures, and all of the failed disks have been replaced. However, it looks as though a media scrub and/or zeroing process on the replacement spare disks must complete before the failed aggregate will consume a spare and reconstruct the partial RAID group. Is there any truth to this assumption? Some relevant details are outlined below. Thank you in advance...
NetApp Release 7.3.2P6
FilerView > Aggregates > Manage displays the following status for aggr3: (failed, raid_dp, partial)
Attempting to place the aggregate online displays the following error: Requested operation failed on aggregate 'aggr3': Aggregate 'aggr3' has failed and cannot be brought online.
From a command line, aggr status -r displays:
Aggregate aggr3 (failed, raid_dp, partial) (block checksums)
Plex /aggr3/plex0 (offline, failed, inactive)
RAID group /aggr3/plex0/rg0 (normal)
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM  Used (MB/blks)    Phys (MB/blks)
--------- ------ -- ----- --- ---- ---- ---- ---- ----------------- -----------------
dparity   0a.89  0a 5     9   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
parity    0a.75  0a 4     11  FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.58  0a 3     10  FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.86  0a 5     6   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.48  0a 3     0   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.54  0a 3     6   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.55  0a 3     7   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.81  0a 5     1   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.87  0a 5     7   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.93  0a 5     13  FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.90  0a 5     10  FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.82  0a 5     2   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.84  0a 5     4   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.92  0a 5     12  FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.64  0a 4     0   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.85  0a 5     5   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
RAID group /aggr3/plex0/rg1 (partial)
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM  Used (MB/blks)    Phys (MB/blks)
--------- ------ -- ----- --- ---- ---- ---- ---- ----------------- -----------------
dparity   FAILED N/A                              635555/1301618176
parity    0a.49  0a 3     1   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.18  0a 1     2   FC:A -    ATA  7200 274400/561971200  274540/562258784
data      0a.45  0a 2     13  FC:A -    ATA  7200 423111/866531584  423889/868126304
data      0a.80  0a 5     0   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.66  0a 4     2   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.88  0a 5     8   FC:A -    ATA  7200 635555/1301618176 635858/1302238304 (reconstruction 99% completed)
data      0a.83  0a 5     3   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.67  0a 4     3   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.51  0a 3     3   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
data      0a.91  0a 5     11  FC:A -    ATA  7200 635555/1301618176 635858/1302238304 (reconstruction 99% completed)
data      0a.52  0a 3     4   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
Raid group is missing 1 disk.
Spare disks
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM  Used (MB/blks)    Phys (MB/blks)
--------- ------ -- ----- --- ---- ---- ---- ---- ----------------- -----------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare     0a.65  0a 4     1   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
spare     0a.68  0a 4     4   FC:A -    ATA  7200 635555/1301618176 635858/1302238304
spare     0a.76  0a 4     12  FC:A -    ATA  7200 635555/1301618176 635858/1302238304
aggr media_scrub status displays:
aggr media_scrub /aggr1/plex0/rg0 is 20% complete
aggr media_scrub /aggr2/plex0/rg0 is 12% complete
aggr media_scrub /aggr0/plex0/rg0 is 31% complete
aggr media_scrub 0a.65 is 42% complete
aggr media_scrub 0a.76 is 42% complete
aggr media_scrub 0a.68 is 42% complete
Once the media_scrub completes, will the failed aggregate consume a spare disk and reconstruct the partial raid group?
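One thing that can be checked directly from the output above is whether the spares are big enough for the missing dparity slot. Below is a minimal Python sketch of that comparison. The function names (`used_mb`, `spares_large_enough`) are made up for illustration, and the rows are copied from the `aggr status -r` listing in this post. It only compares the Used (MB) figures; it says nothing about whether Data ONTAP will actually start a reconstruction.

```python
# Rows copied from the aggr status -r output in this post.
FAILED_SLOT = "dparity FAILED N/A 635555/1301618176"
SPARES = [
    "spare 0a.65 0a 4 1 FC:A - ATA 7200 635555/1301618176 635858/1302238304",
    "spare 0a.68 0a 4 4 FC:A - ATA 7200 635555/1301618176 635858/1302238304",
    "spare 0a.76 0a 4 12 FC:A - ATA 7200 635555/1301618176 635858/1302238304",
]


def used_mb(size_field):
    """Return the MB part of an 'MB/blks' field such as '635555/1301618176'."""
    return int(size_field.split("/")[0])


def spares_large_enough(failed_row, spare_rows):
    """Return the device names of spares whose Used (MB) covers the failed slot."""
    # On a FAILED row the Used (MB/blks) value is the last field.
    needed = used_mb(failed_row.split()[-1])
    matches = []
    for row in spare_rows:
        fields = row.split()
        # On a normal row Used is second to last and Phys is last.
        if used_mb(fields[-2]) >= needed:
            matches.append(fields[1])
    return matches


if __name__ == "__main__":
    print(spares_large_enough(FAILED_SLOT, SPARES))
```

With the figures posted here, all three spares report the same 635555 MB that the failed slot needs, so spare capacity does not look like the obstacle.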
8 REPLIES
Go to the My AutoSupport site and locate the aggregate and its associated disks. Find the missing drive and its ID, and add it back to the aggregate.
That should bring the aggregate back online. If not, call support.
thank you
aKG
Hi,
Hope this article helps https://kb.netapp.com/support/index?page=content&id=2015763&actp=LIST_RECENT&viewlocale=en_US&searchid=1416288732535
Thanks
If this post resolved your issue, help others by selecting ACCEPT AS SOLUTION or adding a KUDO.
You have a triple disk failure. Your only option is to try to bring the disk that failed last back online and hope that allows reconstruction to complete. Do not attempt to unfail the disk from Data ONTAP; that will make it a spare and unsuitable for reconstruction. Open a case with NetApp and let them guide you. It is very easy to lose data in this situation.
^^
What he said.
I've had a triple disk failure, and it doesn't end well. I hope you have a good SnapMirror copy. I would call support ASAP; they will probably have you run wafliron, but you need to call support for this issue.
Thank all of you for your responses, links, and suggestions.
Unfortunately, our Support Agreement/Warranty expired and was not renewed. Ultimately, we executed the steps listed below:
- SnapMirror > Manage: deleted all entries that pointed at the failed, offline aggregate and were in an unknown state.
- Destroyed the failed aggr3.
- Added a new aggregate (disks initialized and zeroed).
- Created and configured new volumes on the new aggregate.
- Created new SnapMirror entries for the new volumes.
- Restricted the new volumes and initialized the SnapMirrors.
All SnapMirrors are now either transferring or in a snapmirrored state.
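For readers who would rather do the same from the command line than through FilerView, here is a rough Python sketch that only builds the list of 7-mode commands in order; it does not run anything. The function name `rebuild_plan` and every value in the example call (volume name, size, filer names, disk count) are placeholders, not values from this system. It assumes the filer is the SnapMirror destination, as the restrict step suggests, and it leaves out the cleanup of stale SnapMirror entries, which was done in FilerView. Check each command against the man pages for your release before using it.

```python
def rebuild_plan(aggr, disk_count, volumes, source_filer, dest_filer):
    """Return, in order, the 7-mode commands for rebuilding a SnapMirror destination.

    volumes is a list of (volume_name, size) tuples, for example ("vol1", "500g").
    All names passed in are the caller's own; nothing here comes from the filer.
    """
    commands = [
        # Assumption: the failed aggregate is already offline. Data ONTAP may
        # refuse to destroy an aggregate that still contains volumes.
        f"aggr destroy {aggr}",
        f"aggr create {aggr} -t raid_dp {disk_count}",
    ]
    for name, size in volumes:
        commands.append(f"vol create {name} {aggr} {size}")
        # A destination volume has to be restricted before the baseline transfer.
        commands.append(f"vol restrict {name}")
        commands.append(
            f"snapmirror initialize -S {source_filer}:{name} {dest_filer}:{name}"
        )
    return commands


if __name__ == "__main__":
    # Placeholder values only.
    for cmd in rebuild_plan("aggr3", 16, [("vol1", "500g")], "srcfiler", "dstfiler"):
        print(cmd)
```

Keeping the plan as plain strings makes it easy to review the order before typing anything on the filer.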
^^^^
You have a triple disk failure. Your only option is to try to bring the disk that failed last back online and hope that allows reconstruction to complete. Do not attempt to unfail the disk from Data ONTAP; that will make it a spare and unsuitable for reconstruction. Open a case with NetApp and let them guide you. It is very easy to lose data in this situation.
===
Sorry, but I cannot see the triple failure, and that worries me. All I can see is a single failed disk in this aggregate:
RAID group /aggr3/plex0/rg1 (partial)
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
dparity FAILED N/A 635555/1301618176
parity 0a.49 0a 3 1 FC:A - ATA 7200 635555/1301618176 635858/1302238304
data 0a.18 0a 1 2 FC:A - ATA 7200 274400/561971200 274540/562258784
data 0a.45 0a 2 13 FC:A - ATA 7200 423111/866531584 423889/868126304
data 0a.80 0a 5 0 FC:A - ATA 7200 635555/1301618176 635858/1302238304
data 0a.66 0a 4 2 FC:A - ATA 7200 635555/1301618176 635858/1302238304
data 0a.88 0a 5 8 FC:A - ATA 7200 635555/1301618176 635858/1302238304 (reconstruction 99% completed)
data 0a.83 0a 5 3 FC:A - ATA 7200 635555/1301618176 635858/1302238304
data 0a.67 0a 4 3 FC:A - ATA 7200 635555/1301618176 635858/1302238304
data 0a.51 0a 3 3 FC:A - ATA 7200 635555/1301618176 635858/1302238304
data 0a.91 0a 5 11 FC:A - ATA 7200 635555/1301618176 635858/1302238304 (reconstruction 99% completed)
data 0a.52 0a 3 4 FC:A - ATA 7200 635555/1301618176 635858/1302238304
Raid group is missing 1 disk.
Where can you see the other 2 failed disks???
Much appreciated!
Where can you see the other 2 failed disks???
The two that are currently reconstructing: 0a.88 and 0a.91 in rg1.
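To make the counting explicit, here is a small Python sketch that tallies failed and still-reconstructing disks in one RAID group from `aggr status -r` rows. A disk that is still reconstructing does not yet hold a complete copy of its data, so it counts against redundancy just like a failed one, and RAID-DP tolerates at most two such disks per RAID group. The function name `count_impaired` is made up for illustration, and the rows are an excerpt of the rg1 listing quoted in this thread.

```python
RAID_DP_TOLERANCE = 2  # RAID-DP survives at most two failed disks per RAID group

# Excerpt of the rg1 rows quoted in this thread (most healthy data disks omitted).
RG1_EXCERPT = """\
dparity FAILED N/A 635555/1301618176
parity 0a.49 0a 3 1 FC:A - ATA 7200 635555/1301618176 635858/1302238304
data 0a.88 0a 5 8 FC:A - ATA 7200 635555/1301618176 635858/1302238304 (reconstruction 99% completed)
data 0a.91 0a 5 11 FC:A - ATA 7200 635555/1301618176 635858/1302238304 (reconstruction 99% completed)
data 0a.52 0a 3 4 FC:A - ATA 7200 635555/1301618176 635858/1302238304
"""


def count_impaired(lines):
    """Return (failed, reconstructing) disk counts for one RAID group listing."""
    failed = 0
    reconstructing = 0
    for line in lines:
        fields = line.split()
        # Only rows that describe a disk in the group are of interest.
        if len(fields) < 2 or fields[0] not in ("dparity", "parity", "data"):
            continue
        if fields[1] == "FAILED":
            failed += 1
        elif "(reconstruction" in line:
            reconstructing += 1
    return failed, reconstructing


if __name__ == "__main__":
    failed, rebuilding = count_impaired(RG1_EXCERPT.splitlines())
    print(f"failed={failed} reconstructing={rebuilding}")
    if failed + rebuilding > RAID_DP_TOLERANCE:
        print("more disks are missing or incomplete than RAID-DP can tolerate")
```

For rg1 this gives one failed disk plus two reconstructing, three in total, which is one more than RAID-DP can absorb.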
Has this issue been solved?
