
Hot Spares?

SATHMP4

Hi,

Would someone be able to help me understand how the system behaves with 1 hot spare versus 2 hot spares for a raid group?

For example, if you have 1 hot spare and a disk fails, what happens now that you have no more hot spares? What happens if you then have a second disk failure?

If you have two hot spares, can you have both parity disks in the raid group fail and still be able to rebuild the raid group with the 2 hot spares?

Many Thanks

Simon

1 ACCEPTED SOLUTION

peter_lehmann

Hot spares are not assigned per raid group; they are per controller.

Having 2 hot spares (per disk type) enables the Disk Maintenance Center, which allows ONTAP to check and repair minor issues with a disk internally, without NetApp sending a replacement disk and you having to send the "broken" disk back. Having just one hot spare disables this feature. Having just one hot spare also disables the NDU firmware upgrade for disks...
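
As a rough illustration of the thresholds described above, here is a minimal Python sketch. It is not ONTAP code: the function name, the labels, and the input dictionary are made up for the example, and the threshold of two spares per disk type is taken from this post, not from ONTAP itself.

```python
def spare_summary(spares_by_type):
    """Classify each disk type by its hot spare count (hypothetical helper)."""
    summary = {}
    for disk_type, count in spares_by_type.items():
        if count >= 2:
            # Two or more spares of this type: Maintenance Center can be used.
            summary[disk_type] = "maintenance center available"
        elif count == 1:
            # One spare: rebuilds still work, but the controller warns.
            summary[disk_type] = "spares low, maintenance center unavailable"
        else:
            summary[disk_type] = "no spares"
    return summary


# Assumed example inventory; the disk types and counts are illustrative only.
example = spare_summary({"SAS": 2, "SATA": 1, "SSD": 0})
for disk_type, status in example.items():
    print(disk_type, "->", status)
```

The count is kept per disk type because a spare of one type cannot stand in for a failed disk of another type.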

If you have 1 hot spare and one disk fails, the RAID manager rebuilds the failed disk onto the hot spare. The controller then alerts you repeatedly that it is running "SPARES_LOW" until you add the replacement disk, which automatically becomes a spare. If another disk fails during the rebuild you are still protected from losing data; that's exactly what RAID-DP is for. With RAID-DP you can lose 2 disks per raid group and still keep running.

For RAID-DP it does not matter whether you lose both parity disks, two data disks, or any combination thereof. If you lose both parity disks, it will just recompute the parity onto two hot spares. If you lose two data disks, it will recompute the data from the parity, and so on...
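
The scenario from the question (one spare, then repeated failures) can be walked through with a toy model. This is only a sketch in Python under simplifying assumptions: the class and method names are hypothetical, rebuilds are treated as instantaneous, and a raid group is reduced to a count of failed disks that have not been rebuilt yet.

```python
class Controller:
    """Toy model: one shared spare pool, several RAID-DP raid groups."""

    MAX_FAILURES = 2  # RAID-DP tolerates two failed disks per raid group

    def __init__(self, spares, raid_groups):
        self.spares = spares
        # Failed disks per raid group that have not been rebuilt yet.
        self.unrebuilt = {name: 0 for name in raid_groups}

    def fail_disk(self, group):
        """Record a failure; data or parity disk makes no difference."""
        self.unrebuilt[group] += 1
        return self.state(group)

    def rebuild(self, group):
        """Rebuild one failed disk onto a spare, if the pool has one left."""
        if self.spares > 0 and self.unrebuilt[group] > 0:
            self.spares -= 1
            self.unrebuilt[group] -= 1
            return True
        return False

    def add_replacement(self):
        """A replacement disk joins the shared spare pool."""
        self.spares += 1

    def state(self, group):
        failed = self.unrebuilt[group]
        if failed == 0:
            return "normal"
        if failed <= self.MAX_FAILURES:
            return "degraded, data still available"
        return "data loss"


ctrl = Controller(spares=1, raid_groups=["rg0"])
log = []
log.append(ctrl.fail_disk("rg0"))  # first failure
log.append(ctrl.rebuild("rg0"))    # uses the only spare
log.append(ctrl.fail_disk("rg0"))  # second failure, no spare left
log.append(ctrl.rebuild("rg0"))    # nothing to rebuild onto
log.append(ctrl.fail_disk("rg0"))  # two unrebuilt failures
log.append(ctrl.fail_disk("rg0"))  # a third one is too many
print(log)
```

Keeping the spare pool on the controller and not on the raid group mirrors the point made at the top of this answer: any raid group on that controller draws from the same spares.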

Hope this makes sense.

Peter

SATHMP4

Thank you Peter - what a fantastic response!

One last question, just to clarify my understanding of RAID-DP: if you have no hot spares and you have 2 disk failures, will the raid group continue to run?

Thanks

HARI_KRISHNA981

Hi Simon,

Have a look at this URL, which explains RAID-DP. You may find it useful:

https://communities.netapp.com/docs/DOC-12850

Regards,

Hari

peter_lehmann

Yes, it keeps running, but just ONE more error or failure and you lose data!

aborzenkov

"having just one hot spare, also disables the NDU Firmware Upgrade for disks..."

Are you sure? This is the first time I have heard of it. Could you please provide a link to this information?

peter_lehmann

It was the first time for me too. It is in the student slides of the clustered ONTAP admin course 8.2... I thought that if this is not the case, someone would step in.

The best filter is the community.

aborzenkov

Well, C-Mode may behave differently. I could not find any description of background disk firmware update in either the manuals or the KB, except a short note about the bkg-firmware-update option, and nothing anywhere about its requirements.

Was it WBT or ILT?

It would be nice if someone from NetApp chimed in.

P.S. today one really should qualify every statement with Data ONTAP mode to which it applies … ☺

peter_lehmann

It was ILT... waiting for NetApp to chime in too...

radek_kubka

I'm on the same page as Andrey: I've never heard that the hot spare count may affect disk FW upgrades.

However, the RAID level in use can make a difference, but only *if* the ONTAP version is lower than 8.0.2:

https://library.netapp.com/ecmdocs/ECMM1253884/html/upgrade/GUID-1A70BD32-D54D-443F-9E5E-C97D8E420189.html

Regards,
Radek
