ONTAP Hardware
There are 3 disk failures, please advise.
Please find the errors below:
[filer02: monitor.shutdown.brokenDisk.pending:warning]: the parity disk and a data disk in RAID group /aggr1/plex0/rg0 are broken. Halting system in 15 hours.
[filer02: raid.config.filesystem.disk.failed:error]: File system Disk /aggr1/plex0/rg0/0a.32 Shelf 2 Bay 0 [NETAPP X269_HJUPI01TSSX NA01] S/N [N00V103L] failed.
[filer02: disk.failmsg:error]: Disk 0a.32 (N00V103L): message received, maintenance center recommended.
Fri Jul 7 20:42:36 BNT [filer02: raid.disk.unload.done:info]: Unload of Disk 0a.32 Shelf 2 Bay 0 [NETAPP X269_HJUPI01TSSX NA01] S/N [N00V103L] has completed successfully
Fri Jul 7 20:42:46 BNT [filer02/filer01: iscsi.service.startup:info]: iSCSI service startup
Fri Jul 7 20:42:47 BNT [filer02/filer01: raid.fdr.failed.ok:info]: Disk 0a.35 Shelf 2 Bay 3 [NETAPP X269_HJUPI01TSSX NA01] S/N [N00WLAUL] successfully deleted from spare pool
Fri Jul 7 20:42:47 BNT [filer02/filer01: raid.fdr.failed.ok:info]: Disk 0a.32 Shelf 2 Bay 0 [NETAPP X269_HJUPI01TSSX NA01] S/N [N00V103L] successfully deleted from spare pool
Fri Jul 7 20:42:47 BNT [filer02/filer01: raid.fdr.failed.ok:info]: Disk 0a.33 Shelf 2 Bay 1 [NETAPP X269_HJUPI01TSSX NA01] S/N [N00W75PL] successfully deleted from spare pool
Fri Jul 7 20:42:47 BNT [filer02/filer01: raid.fdr.failed.ok:info]: Disk 0a.40 Shelf 2 Bay 8 [NETAPP X269_HJUPI01TSSX NA01] S/N [N00V0W8L] successfully deleted from spare pool
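For anyone who wants to tally which disks the messages above refer to, here is a minimal sketch in Python. It is not a NetApp tool: the regular expression and the function name disks_in_log are assumptions made for illustration, and the sample lines are copied from the log above with the timestamps trimmed. It only lists the disks that appear in these messages; it does not decide which of them have actually failed.

```python
import re

# Sample lines copied from the log above (timestamps trimmed).
LOG = """\
[filer02: raid.config.filesystem.disk.failed:error]: File system Disk /aggr1/plex0/rg0/0a.32 Shelf 2 Bay 0 [NETAPP X269_HJUPI01TSSX NA01] S/N [N00V103L] failed.
[filer02/filer01: raid.fdr.failed.ok:info]: Disk 0a.35 Shelf 2 Bay 3 [NETAPP X269_HJUPI01TSSX NA01] S/N [N00WLAUL] successfully deleted from spare pool
[filer02/filer01: raid.fdr.failed.ok:info]: Disk 0a.32 Shelf 2 Bay 0 [NETAPP X269_HJUPI01TSSX NA01] S/N [N00V103L] successfully deleted from spare pool
[filer02/filer01: raid.fdr.failed.ok:info]: Disk 0a.33 Shelf 2 Bay 1 [NETAPP X269_HJUPI01TSSX NA01] S/N [N00W75PL] successfully deleted from spare pool
[filer02/filer01: raid.fdr.failed.ok:info]: Disk 0a.40 Shelf 2 Bay 8 [NETAPP X269_HJUPI01TSSX NA01] S/N [N00V0W8L] successfully deleted from spare pool
"""

# Assumed pattern: "Disk <id> Shelf <n> Bay <n> [...] S/N [<serial>]".
# The disk id may be preceded by a RAID path such as /aggr1/plex0/rg0/.
DISK_RE = re.compile(
    r"Disk\s+(?:\S*/)?(\w+\.\d+)\s+Shelf\s+(\d+)\s+Bay\s+(\d+).*?S/N\s+\[(\w+)\]"
)

def disks_in_log(text):
    """Return {disk_id: (shelf, bay, serial)} for every disk named in the log text."""
    found = {}
    for line in text.splitlines():
        match = DISK_RE.search(line)
        if match:
            disk, shelf, bay, serial = match.groups()
            found[disk] = (int(shelf), int(bay), serial)
    return found

disks = disks_in_log(LOG)
for disk_id in sorted(disks):
    shelf, bay, serial = disks[disk_id]
    print(f"{disk_id}  shelf {shelf}  bay {bay}  S/N {serial}")
```

Run against these lines, it lists four distinct disks (0a.32, 0a.33, 0a.35 and 0a.40), all in shelf 2.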
Hi Ashwin,
At this point you need to get the disks replaced to avoid a controller shutdown, as I see your controller is already running in degraded mode (a RAID group runs degraded once it has failed disks that have not been rebuilt onto spares).
Could you post the output of the following commands?
vol status -f and vol status -s
If this is a cluster-mode (clustered Data ONTAP) controller, use these commands instead:
storage disk show -broken
storage disk show -spare
Thanks,
Nayab
Hi Ashwin,
I see from the logs that the takeover failed because there were not enough spares. I suggest replacing the failed disks and powering on the system again; it should come up.
Thanks,
Nayab
Hi Nayab,
Should we replace all 4 disks and then power up?
Will any data be lost?
Hi Ashwin,
I can see one spare disk available in the spare pool, but I would suggest replacing all the failed disks and then powering on the controllers.
Thanks,
Nayab
Hi Nayab,
Thanks for the advice.
However, we are unable to access Filer 02, which has 4 failed disks.
Will the data on Filer 02 be lost? Please advise.
Hi Ashwin,
At this point I cannot confirm that. If all the failed disks are data disks, you can still rebuild your data from parity. But if two parity disks have failed, there will definitely be data loss. That is why I asked you to replace the disks ASAP, power on the controller, and allow ONTAP to rebuild all the failed disks. If they all rebuild, you will still have your data.
Thanks,
Nayab
Hi Nayab,
Thanks for the response.
I will replace the disks and update you.
Regards,
Ashwin