2011-02-23 07:01 PM
I'm planning on having some disk trays removed on some upcoming filer maintenance and I want to make sure the disks I'm removing are properly removed from the filer. I have a few aggregates on these trays that I need to delete.
Once I delete the aggregates it looks like the disks will end up in a big pool with other spare disks (based on testing I did in a netapp sim), but in a non-zeroed state (output from aggr status -r).
Looking in the ONTAP command reference, it looks like I can run disk assign -t ATA -s unowned to remove ownership from all SATA disks.
My question is... is it really that easy? The below is some disk output from an aggr status -r (before destroying the aggregates). There are some aggregates and spares that are fiber channel, but I'm looking to remove 5 shelves of sata disks. I assume the type ATA specified below is the type that I would specify in my disk assign command? What is the difference between ATA and SATA in terms of netapp disks? As far as I understood the disks we have in the filer now are sata, which is why I was surprised to only see ATA here.
Am I correct in assuming that I should specify ATA? Are there any gotchas to using a disk assign unowned command with specifying a whole type of disk?
I already checked and auto-assign is turned off by default so the disks should stay unowned once I do this.
I'm a bit trigger shy, so I'm just looking for some extra confirmation of my plan. I want to make sure that I won't inadvertently cause grief for the disks I want to remain online (all the FCAL disks).
RAID Disk  Device  HA  SHELF  BAY  CHAN  Pool  Type  RPM    Used (MB/blks)     Phys (MB/blks)
---------  ------  --  -----  ---  ----  ----  ----  -----  -----------------  -----------------
dparity    0c.59   0c  3      11   FC:A  -     ATA   7200   635555/1301618176  635858/1302238304
parity     4a.90   4a  5      10   FC:B  -     ATA   7200   635555/1301618176  635858/1302238304
data       4a.75   4a  4      11   FC:B  -     ATA   7200   635555/1301618176  635858/1302238304
spare      0b.109  0b  6      13   FC:A  -     FCAL  15000  136000/278528000   137422/281442144
spare      4c.53   4c  3      5    FC:B  -     FCAL  15000  136000/278528000   137422/281442144
spare      0c.29   0c  1      13   FC:A  -     ATA   7200   635555/1301618176  635858/1302238304
spare      0c.39   0c  2      7    FC:A  -     ATA   7200   635555/1301618176  635858/1302238304
spare      0c.54   0c  3      6    FC:A  -     ATA   7200   635555/1301618176  635858/1302238304
2011-02-23 10:13 PM
I would run "disk zero spares" after destroying the aggregate, then wait for the disks to zero (1-7 hours depending on drive type). Then I usually run "priv set advanced ; disk remove_ownership x.xx x.xx x.xx ...". You could also assign the disks to unowned as you described, which gives the same result, but I prefer entering the disk IDs (instead of the drive type) so I know exactly which disks I am removing ownership from. Disk assign also supports disk IDs. As you mentioned, you have to be very careful to avoid typos. You could run "aggr status -r aggrname" before destroying the aggregate, then use that disk list for the ownership removal.
Also, there is no hot removal of disk shelves, so you will need a short downtime to remove the shelves holding the zeroed, unassigned disks from the system.
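As a rough illustration of building that disk list from saved "aggr status -r" output rather than typing IDs by hand, here is a minimal sketch. The function name extract_ata_disks is made up for this example, and it assumes the column layout shown in the output earlier in this thread (RAID role in field 1, device in field 2, type in field 8); other ONTAP versions may lay the columns out differently, so check the result by eye before using it.

```shell
# Hypothetical helper: read "aggr status -r" output on stdin and print
# the Device column for every disk row whose Type column is ATA.
# Assumption: field 1 = RAID role, field 2 = device, field 8 = type,
# as in the output pasted in this thread.
extract_ata_disks() {
    awk '$1 ~ /^(dparity|parity|data|spare)$/ && $8 == "ATA" { print $2 }'
}
```

It could be fed from a saved copy of the output, for example "extract_ata_disks < aggr_status.txt > list", where both file names are placeholders.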
2011-02-24 05:04 AM
My first instinct is not to use disk type either... something just seems scary about removing all of a "type".
I built a list of IDs from the spares and the disks in the 2 aggregates. I was thinking about doing something like:
for diskname in $(cat list); do
    sudo rsh filername disk assign "$diskname" -s unowned
done
Since we have rsh set up to run commands against the filers, this seems like it could be a nice way to do it. That way I know the list I'm feeding the command is exactly the 70 disks I want to remove, and I don't have to type them out or worry about copy and paste. I would be reluctant to let it just tear through all 70 disks as quickly as it can execute, though. Does the above method sound sane enough?
Regarding running disk zero spares: does this cause a lot of CPU overhead on the filers that could affect other operations? Is there any harm in running it on a new filer that the disks might be put into? I just want to understand what kind of resource issues I might run into after running that command.
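To avoid tearing through the whole list at full speed, one option is a paced loop with a dry-run mode. This is only a sketch: release_disks, FILER, DELAY and DRY_RUN are names invented for the example, the 5-second default pause is arbitrary, and it defaults to printing the commands instead of running them.

```shell
# Hypothetical helper: release ownership of each disk named in a list file,
# one disk at a time, pausing between disks.
# FILER, DELAY and DRY_RUN are illustrative variables, not ONTAP settings.
release_disks() {
    list_file=$1
    filer=${FILER:-filername}
    delay=${DELAY:-5}
    while read -r diskname; do
        # Skip blank lines and lines starting with '#'.
        case $diskname in
            ''|'#'*) continue ;;
        esac
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run (the default): only show what would be executed.
            echo "would run: rsh $filer disk assign $diskname -s unowned"
        else
            # Redirect stdin so rsh does not swallow the rest of the list.
            sudo rsh "$filer" disk assign "$diskname" -s unowned </dev/null
            sleep "$delay"
        fi
    done < "$list_file"
}
```

Running "release_disks list" first prints the 70 commands for review; "DRY_RUN=0 release_disks list" would then execute them with a pause after each disk.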
2011-02-24 07:05 AM
Looks like a good plan: specify the disks and there is no mystery about which disks are affected. Instead of waiting for the zeroing to finish, it is probably easier to zero on the new system... I haven't seen performance issues from zeroing, but zeroing later gives the same result and saves waiting before removal.
Typos Sent on Blackberry Wireless
2011-02-24 03:08 PM
Both remove the ownership. It is a matter of preference. I have seen people make typos with -s unowned. A typo with remove_ownership would give a syntax error instead of a wrong reassignment (but if they type unowned wrong, who knows if they type the right disk ID either). Sometimes (depending on the ONTAP version, and on whether the disks had ownership removed before being connected to the new system) one method works and the other doesn't, so it is good to have both methods to try before resorting to maintenance mode.