Simulator Discussions

Way to simulate disk failure and recovery using the simulator

speck1987

Is there a way to simulate disk failure and recovery operations using the simulator?

I am new to the simulator, but I want to use it to train other admins on the proper procedures for disk recovery, HA takeover operations, etc.

Thanks


SeanHatfield

There are commands in the nodeshell to simulate removal and insertion of disks which can be used for some of that.

 

disk simpush and disk simpull are available in the nodeshell in ONTAP or in the 7-Mode CLI. For example:

 

disk simpull v4.16

 

 

disk simpull simulates the hot removal of a drive, which forces a RAID rebuild and the consumption of a spare disk. Under the covers, it moves the simulated disk file from

 

/sim/dev/,disks

to

 

/sim/dev/,disks/,pulled

 

When you are ready to put it back, you need its name.  You can list all the pulled disks with:

 

disk simpush -l

 

It will look something like:

 

demo1> disk simpush -l


The following pulled disks are available for pushing:
         v1.32:NETAPP__:VD-1000MB-FZ-520:14370813:2104448
         v0.16:NETAPP__:VD-1000MB-FZ-520:11980700:2104448
demo1> 

 

When you are ready to put it back, use its full name on the simpush command:

 

demo1> disk simpush v0.16:NETAPP__:VD-1000MB-FZ-520:11980700:2104448

 

The simpull and simpush commands only work on simulators that use file-based simulated disks.
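
For training scripts, looking up the full name by hand gets tedious. Here is a minimal Python sketch that pulls the full names out of saved `disk simpush -l` output. The helper names (`pulled_disks`, `simpush_command`) are made up for illustration, and the line format is assumed to match the sample listing above; other simulator versions may print it differently.

```python
import re

# Sample output copied from the listing shown earlier in this post.
SAMPLE = """\
demo1> disk simpush -l

The following pulled disks are available for pushing:
         v1.32:NETAPP__:VD-1000MB-FZ-520:14370813:2104448
         v0.16:NETAPP__:VD-1000MB-FZ-520:11980700:2104448
demo1>
"""

# Assumption: a pulled-disk line is v<adapter>.<id> followed by
# colon-separated fields with no embedded whitespace, as in the sample.
DISK_LINE = re.compile(r"^\s*(v\d+\.\d+):\S+\s*$")


def pulled_disks(listing):
    """Map each short disk ID (e.g. 'v0.16') to its full pulled-disk name."""
    result = {}
    for line in listing.splitlines():
        match = DISK_LINE.match(line)
        if match:
            result[match.group(1)] = line.strip()
    return result


def simpush_command(listing, disk_id):
    """Build the simpush command line for one pulled disk (hypothetical helper)."""
    return "disk simpush " + pulled_disks(listing)[disk_id]


if __name__ == "__main__":
    print(simpush_command(SAMPLE, "v0.16"))
```

The script only builds the command string; you would still paste it into the nodeshell or 7-Mode prompt yourself.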

 

If this post resolved your issue, help others by selecting ACCEPT AS SOLUTION or adding a KUDO.

speck1987

Thanks Sean,

 

I tried the commands you suggested and they worked great. I can use this to help train folks on what to do if/when the real situation appears; I know it's just a matter of time. It is interesting to note that after doing a simpull on a disk, say ID v4.16, and then running simpush -l, that particular ID, v4.16, is not available to push. I'm not sure why, but I guess it reassigns a new disk ID once it is pulled. Thanks for the info; it will help me design something for training using the simulator.

 

Cheers..

SeanHatfield

Great. What you noticed when you pulled a disk on v4 is that it is actually the second path to the disk on v0. There are 8 virtual host adapters, v0-v7, which look like FC initiators. The disks are created on v0-v3, and the second paths to those virtual shelves are on v4-v7, giving a simulated multipath storage layer.
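
If you want to predict which name a pulled disk will show under in simpush -l, here is a rough Python sketch of that mapping. It assumes the pairing is v4 to v0, v5 to v1, v6 to v2 and v7 to v3; only the v4/v0 pair was actually observed in this thread, so treat the rest as a guess. The function name `primary_path` is made up for illustration.

```python
def primary_path(disk_id, adapters_per_side=4):
    """Translate a disk ID on a second-path adapter (v4-v7) to v0-v3.

    Assumption: adapter vN (N >= 4) is the second path for adapter v(N-4).
    IDs already on v0-v3 are returned unchanged.
    """
    adapter, _, device = disk_id.partition(".")
    if not (adapter.startswith("v") and adapter[1:].isdigit() and device):
        raise ValueError(f"unexpected disk ID: {disk_id!r}")
    number = int(adapter[1:])
    if not 0 <= number < 2 * adapters_per_side:
        raise ValueError(f"adapter out of range: {adapter}")
    if number >= adapters_per_side:
        number -= adapters_per_side
    return f"v{number}.{device}"


if __name__ == "__main__":
    print(primary_path("v4.16"))  # the disk pulled earlier in the thread
```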


andyberry

When we commission new filers, we use the following commands to fail a disk.

Note: we are running ONTAP 8.3.1 or newer.

 

Test – Disk Failure

Run the command

 

storage disk fail -disk {disk_name} -immediate

 

The disk fails and the storage system will operate in degraded mode until the RAID system reconstructs from a hot spare.

 

You can see that the disk is shown as failed using the command

                storage disk show -broken

 

To unfail the disk and return it to the pool of spares:

                priv set advanced

                storage disk unfail -s {disk name}

                priv set admin

                storage disk show -spare

speck1987

Thanks for the reply Andy,

 

I tested the commands and they work great. I did have to make some changes: the storage disk fail -disk {disk_name} -immediate command didn't work on my version, which is an earlier one (8.0.1 7-Mode). Instead I used disk fail -i {disk name} and disk unfail -s {disk name} to accomplish the same result (BTW, disk unfail is only available in priv advanced mode).
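
For a training handout, the two command variants from this thread could be generated with a small Python sketch like the one below. The command strings are copied from the posts here and have not been checked against any other ONTAP release. The mode labels ("7-mode", "clustered") and the function name `drill_commands` are just illustrative.

```python
# Command templates copied from the posts in this thread (assumption:
# they behave the same on your release; verify before using in a class).
TEMPLATES = {
    "7-mode": {
        "fail": "disk fail -i {disk}",
        "unfail": "disk unfail -s {disk}",
    },
    "clustered": {
        "fail": "storage disk fail -disk {disk} -immediate",
        "unfail": "storage disk unfail -s {disk}",
    },
}


def drill_commands(mode, disk):
    """Return the fail/unfail sequence for one disk as a list of strings."""
    template = TEMPLATES[mode]
    return [
        template["fail"].format(disk=disk),
        "priv set advanced",  # unfail is only available in advanced mode
        template["unfail"].format(disk=disk),
        "priv set admin",
    ]


if __name__ == "__main__":
    for command in drill_commands("7-mode", "v4.16"):
        print(command)
```

This only prints the commands in order; it does not connect to a filer or run anything.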

 

I can use your info combined with Sean's post to give trainees a good idea of what an actual disk failure looks like and how to go about fixing it.

 

Thanks Andy.

 
