ONTAP Discussions
Just replaced a drive, but one of our aggregates is still showing failed disks. How can we get the status back to normal? We have plenty of spares.
RAID Group /aggr2_sas_clp_lcl_fas8020b/plex0/rg1 (double degraded, block checksums, raid_dp)
Usable Physical
Position Disk Pool Type RPM Size Size Status
-------- --------------------------- ---- ----- ------ -------- -------- ----------
dparity 3.33.7 0 SAS 15000 546.9GB 547.7GB (normal)
parity 3.32.8 0 SAS 15000 546.9GB 547.1GB (normal)
data 3.33.8 0 SAS 15000 546.9GB 547.7GB (normal)
data 3.32.9 0 SAS 15000 546.9GB 547.1GB (normal)
data 3.33.9 0 SAS 15000 546.9GB 547.7GB (normal)
data FAILED - - - 546.9GB - (failed)
data 3.33.10 0 SAS 15000 546.9GB 547.7GB (normal)
data 3.32.11 0 SAS 15000 546.9GB 547.1GB (normal)
data 3.33.11 0 SAS 15000 546.9GB 547.7GB (normal)
data FAILED - - - 546.9GB - (failed)
data 3.33.12 0 SAS 15000 546.9GB 547.7GB (normal)
data 3.32.13 0 SAS 15000 546.9GB 547.1GB (normal)
data 3.33.13 0 SAS 15000 546.9GB 547.7GB (normal)
data 3.32.14 0 SAS 15000 546.9GB 547.1GB (normal)
data 3.33.14 0 SAS 15000 546.9GB 547.7GB (normal)
Pool0
Spare Pool
Usable Physical
Disk Type Class RPM Checksum Size Size Status
---------------- ------ ----------- ------ -------------- -------- -------- --------
2.22.17 SAS performance 10000 block 836.9GB 838.4GB zeroed
2.22.19 SAS performance 10000 block 836.9GB 838.4GB zeroed
2.23.9 SAS performance 10000 block 836.9GB 838.4GB zeroed
3.30.22 SAS performance 15000 block 546.9GB 547.1GB zeroed
3.31.2 SAS performance 15000 block 546.9GB 547.7GB zeroed
3.32.12 SAS performance 15000 block 546.9GB 547.7GB zeroed
Original Owner: clp-lcl-fas8020b
Pool0
Spare Pool
Usable Physical
Disk Type Class RPM Checksum Size Size Status
---------------- ------ ----------- ------ -------------- -------- -------- --------
2.20.18 SAS performance 10000 block 836.9GB 838.4GB zeroed
2.20.23 SAS performance 10000 block 836.9GB 838.4GB zeroed
2.21.17 SAS performance 10000 block 836.9GB 838.4GB zeroed
3.32.5 SAS performance 15000 block 546.9GB 547.1GB zeroed
3.32.7 SAS performance 15000 block 546.9GB 547.1GB zeroed
3.32.10 SAS performance 15000 block 546.9GB 547.7GB zeroed
3.33.23 SAS performance 15000 block 546.9GB 547.7GB zeroed
1.10.9 SSD solid-state - block 186.1GB 186.3GB zeroed
14 entries were displayed.
Have you tried manually unfailing the disk? The storage disk unfail command can be used to unfail it.
The following command (at the advanced privilege level) should unfail the disk and return it to the spare pool.
cluster1::*> storage disk unfail -disk <disk name> -s true
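A rough end-to-end sketch, assuming the failed disk still appears under storage disk show -broken (the disk name 3.32.6 is a placeholder; the aggregate name is taken from your output above):

cluster1::> set -privilege advanced
cluster1::*> storage disk show -broken
cluster1::*> storage disk unfail -disk 3.32.6 -s true
cluster1::*> storage aggregate show-status -aggregate aggr2_sas_clp_lcl_fas8020b
cluster1::*> set -privilege admin

If the disk is physically bad, unfailing it will not help; normally ONTAP reconstructs onto an available spare automatically, which is why a healthy system with spares should not stay double degraded.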
After the case was escalated with NetApp, this was resolved.
It ended up being locks preventing giveback.
storage failover show-giveback
Partner
Node Aggregate Giveback Status
-------------- ----------------- ---------------------------------------------
<node>
CFO Aggregates Done
aggr2_sas_fas8020b
Failed: Operation was vetoed by
lock_manager. Giveback vetoed: Giveback
cannot proceed because non-continuously
available (non-CA) CIFS locks are present on
the volume. Gracefully close the CIFS
sessions over which non-CA locks are
established. Use the "vserver cifs session
file show -hosting-aggregate <aggregate
list> -continuously-available No" command to
view the open files that have CIFS sessions
with non-CA locks established. <aggregate
list> is the list of aggregates sent home as
a result of the giveback operation. If lock
state disruption for all existing non-CA
locks is acceptable, retry the giveback
operation by specifying "-override-vetoes
true". Warning: Overriding vetoes to
perform a giveback can be disruptive.
Once I overrode vetoes, the aggregate started rebuilding.
storage failover giveback -ofnode <node> -override-vetoes true
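For anyone hitting the same veto, the rough sequence (using the aggregate name from the output above; the node name is a placeholder) is:

cluster1::> storage failover show-giveback
cluster1::> vserver cifs session file show -hosting-aggregate aggr2_sas_fas8020b -continuously-available No
cluster1::> storage failover giveback -ofnode <node> -override-vetoes true
cluster1::> storage aggregate show-status -aggregate aggr2_sas_fas8020b

As the veto message warns, overriding vetoes drops the non-CA CIFS locks, so clients holding those locks may be disrupted.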
Hi,
You need to replace the faulty disk with a good disk. Run the disk show -n command to find a disk that does not have a home location, assign that disk to the node where the faulty disk resides, and then check whether the failed disk is still reported.
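Note that disk show -n is the nodeshell/7-Mode form; in the clustershell, a rough equivalent of that check (disk and node names are placeholders) might look like:

cluster1::> storage disk show -container-type unassigned
cluster1::> storage disk assign -disk <disk name> -owner <node name>
cluster1::> storage disk show -broken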