Simulator Discussions
Is there a way to simulate disk failure and recovery operations using the simulator?
I am new to the simulator, but I want to use it to train other admins on the proper procedures for disk recovery, HA takeover operations, and so on.
Thanks
There are commands in the nodeshell to simulate removal and insertion of disks which can be used for some of that.
disk simpush and disk simpull are available in the nodeshell in ONTAP or in the 7-Mode CLI. For example:
disk simpull v4.16
disk simpull simulates the hot removal of a drive, which forces a RAID rebuild and the consumption of a spare disk. Under the covers, it moves the simulated disk from
/sim/dev/,disks
to
/sim/dev/,disks/,pulled
When you are ready to put it back, you need its name. You can list all the pulled disks with:
disk simpush -l
It will look something like:
demo1> disk simpush -l
The following pulled disks are available for pushing:
v1.32:NETAPP__:VD-1000MB-FZ-520:14370813:2104448
v0.16:NETAPP__:VD-1000MB-FZ-520:11980700:2104448
demo1>
Then pass the disk's full name to the simpush command:
demo1> disk simpush v0.16:NETAPP__:VD-1000MB-FZ-520:11980700:2104448
The simpull and simpush commands only work on simulators that use file-based simulated disks.
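For a training script, it can help to look up the full name automatically instead of copying it by hand. The following is a minimal Python sketch, assuming the listing has already been captured as text and that full names have the colon-separated shape shown in the example output above. The function name `parse_pulled_disks` is hypothetical and not part of ONTAP.

```python
def parse_pulled_disks(listing):
    """Map each short disk id (e.g. 'v0.16') to the full name simpush expects.

    Assumes full names look like the ones in the example listing:
    v0.16:NETAPP__:VD-1000MB-FZ-520:11980700:2104448
    """
    pulled = {}
    for token in listing.split():
        # A full disk name starts with 'v' and has five colon-separated fields.
        if token.startswith("v") and token.count(":") == 4:
            short_id = token.split(":", 1)[0]
            pulled[short_id] = token
    return pulled


sample = """demo1> disk simpush -l
The following pulled disks are available for pushing:
v1.32:NETAPP__:VD-1000MB-FZ-520:14370813:2104448
v0.16:NETAPP__:VD-1000MB-FZ-520:11980700:2104448
demo1>"""

disks = parse_pulled_disks(sample)
# Build the command line to paste back into the console.
print("disk simpush " + disks["v0.16"])
```

The sketch only builds the command string. How the listing is captured from the console and how the command is sent back are left open.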
Thanks Sean,
I tried the commands you suggested and they worked great. I can use this to help train folks on what to do when the real situation appears; I know it's just a matter of time. It is interesting to note that after doing a simpull on a disk, say v4.16, running simpush -l does not list v4.16 as available to push. I'm not sure why, but I guess it assigns a new disk ID once the disk is pulled. Thanks for the info, it will help me design some training around the simulator.
Cheers..
Great. What you noticed when you pulled a disk on v4 is that it is actually the second path to the disk on v0, so the pulled disk is listed under its v0 name. There are 8 virtual host adapters, v0-v7, which look like FC initiators. The disks are created on v0-v3, and the second paths to those virtual shelves are on v4-v7, giving a simulated multipath storage layer.
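To predict which name a pulled disk will appear under, the path layout above can be sketched in Python. This assumes the adapters pair up in order (v4 with v0, v5 with v1, v6 with v2, v7 with v3). Only the v4/v0 pairing is confirmed in this thread; the rest is an assumption, and `primary_path` is a hypothetical helper name.

```python
def primary_path(disk_id):
    """Return the assumed primary-path name for a simulator disk id.

    Assumption: adapters v4-v7 are second paths to v0-v3 in order,
    so the primary adapter number is the adapter number modulo 4.
    """
    adapter, _, device = disk_id.partition(".")
    if not (adapter.startswith("v") and adapter[1:].isdigit() and device):
        raise ValueError("expected an id like 'v4.16', got %r" % disk_id)
    number = int(adapter[1:])
    if not 0 <= number <= 7:
        raise ValueError("simulator adapters are v0-v7, got %r" % adapter)
    return "v%d.%s" % (number % 4, device)


print(primary_path("v4.16"))  # the disk pulled earlier in this thread
```

With this mapping, a disk pulled as v4.16 would be looked up as v0.16 in the simpush -l listing, which matches the example output earlier in the thread.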
When we commission new filers, we use the following commands to fail a disk.
Note: we are running ONTAP 8.3.1 or newer.
Test – Disk Failure
Run the command
storage disk fail -disk {disk_name} -immediate
The disk fails, and the storage system operates in degraded mode until RAID finishes reconstructing onto a hot spare.
You can confirm that the disk is shown as failed using the command
storage disk show -broken
To unfail the disk and return it to the pool of spares:
priv set advanced
storage disk unfail -s {disk name}
priv set admin
storage disk show -spare
Thanks for the reply Andy,
I tested the commands and they will work great. I did have to make some changes: the storage disk fail -disk {disk_name} -immediate command didn't work on my version, which is an earlier one (8.0.1 7-Mode). Instead I used disk fail -i {disk name} and disk unfail -s {disk name} to accomplish the same result (BTW, disk unfail is only available in priv advanced mode).
I can use your info combined with Sean's post to give trainees a good idea of what an actual disk failure looks like and how to go about fixing it.
Thanks Andy.