2009-01-23 11:29 AM
In an upcoming maintenance window I'd like to move the data off a shelf of 144GB drives and pull the shelf from the loop and utilize it elsewhere. A couple sanity checks I'd like to put out for comment:
1.) The data I'm moving are LUNs. As far as I can tell, LUNs are referenced via /vol/<volume name>/<lun name>, so it shouldn't matter what aggregate they live on, right? If I offline the LUN, move it to another aggregate under the same volume name, and then online the LUN again, it should come up, right?
2.) The shelf I want to retire is shelf #4 in a loop of 6 shelves. Can I pull it out of the loop and decrement the shelf IDs of shelves 5 and 6 by one, making them shelves 4 and 5, without impact to the data or confusing the head?
2009-01-25 08:13 AM
I am going to attempt to answer this without being too wordy. Good Luck!
1. As long as the igroups, and the zoning if you use FCP, are correct on the new storage system, the LUN should come back online with no problem. I would without a doubt shut the host down while the migration is being done.
2. As long as the disks in that shelf make up one complete aggregate (and nothing else), you can destroy that aggregate before moving the shelf and everything should be OK. Where you will have an issue is if some other aggregate uses one or more of the disks in this shelf. In that situation you can use the disk replace command to replace individual drives until none of the drives on the shelf to be moved are part of any aggregate. Naturally I would halt the head before unplugging shelves. You can use sysconfig -r to see which disks make up an aggregate. Hope this helps!
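If it helps, the containment check can be sketched in a few lines of Python. This is only a rough illustration: the disk names, shelf numbers, and aggregate names are made up, and the (disk, shelf, aggregate) tuples are assumed to be copied by hand from what sysconfig -r shows, not parsed from its real output.

```python
from collections import defaultdict

def classify_aggregates(disks, shelf):
    """Split the aggregates that touch `shelf` into two groups.

    disks: iterable of (disk_name, shelf_id, aggregate_name) tuples,
           with aggregate_name set to None for spares (hypothetical format).
    Returns (contained, spanning):
      contained - aggregates that live only on `shelf`
      spanning  - {aggregate: [disks on `shelf`]} for aggregates that also
                  use other shelves; these disks are disk replace candidates
    """
    shelves_by_aggr = defaultdict(set)
    on_shelf = defaultdict(list)
    for name, shelf_id, aggr in disks:
        if aggr is None:  # spares do not block removal
            continue
        shelves_by_aggr[aggr].add(shelf_id)
        if shelf_id == shelf:
            on_shelf[aggr].append(name)
    contained = sorted(a for a in on_shelf if shelves_by_aggr[a] == {shelf})
    spanning = {a: sorted(d) for a, d in on_shelf.items()
                if shelves_by_aggr[a] != {shelf}}
    return contained, spanning

# Made-up inventory for illustration only.
example_disks = [
    ("0a.64", 4, "aggr_old"),
    ("0a.65", 4, "aggr_old"),
    ("0a.66", 4, "aggr1"),   # aggr1 also has a disk on shelf 3
    ("0a.48", 3, "aggr1"),
    ("0a.67", 4, None),      # spare
]

contained, spanning = classify_aggregates(example_disks, 4)
print("safe to destroy after migrating data:", contained)
print("needs disk replace first:", spanning)
```

Anything that shows up in the second group has to be dealt with before the shelf is pulled.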
2009-01-26 12:58 AM
As long as you're talking about an OFFLINE task, everything will work fine. Your assumption that LUNs are independent of the aggregate they live in is correct. And you can safely change the shelf IDs, as the filer doesn't care about shelf IDs at all (except that they must be different for each shelf on the same loop, obviously).
However, please make sure that your aggregate is completely contained in the shelf you want to remove, and doesn't have any disks on any other shelves. If it has, you should swap the disks around first.
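To make the shelf ID renumbering from the original question concrete, here is a tiny Python sketch. The function name is made up for illustration; it just works out the new IDs and refuses to run if two shelves on the loop already share an ID.

```python
def renumber_after_removal(shelf_ids, removed):
    """Map old shelf ID -> new shelf ID after pulling `removed` from the loop.

    Shelves with a higher ID than the removed one move down by one.
    """
    if len(set(shelf_ids)) != len(shelf_ids):
        raise ValueError("shelf IDs on one loop must be unique")
    if removed not in shelf_ids:
        raise ValueError("shelf to remove is not on this loop")
    return {old: (old - 1 if old > removed else old)
            for old in shelf_ids if old != removed}

new_ids = renumber_after_removal([1, 2, 3, 4, 5, 6], 4)
print(new_ids)
```

For the loop in question this leaves shelves 1 to 3 alone and turns 5 and 6 into 4 and 5, with no duplicate IDs.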
If you're talking about an ONLINE task, i.e. removing the shelf while the machine is running, well then I wish you good luck. Such a task is not supported by NetApp and you might crash your filer by trying that. That being said, however, I already did that once (on a non-production machine with little I/O going on at the time) by using the following mechanism:
- Dual-path connect the shelf loop, i.e. connect from the last shelf in your loop back to the filer. The filer then detects a second path to each disk.
- Wait until I/O has settled somewhat
- offline all volumes/aggregates on the shelf you're about to remove. If you can, destroy the aggregate and zero the disks so that they are spares again (that's what I did, however, simply offlining the aggregate *should* also work)
- now that you have a shelf full of spares, simply unplug it from the loop. All shelves behind it in the loop can still be accessed through the second path. This simply simulates a shelf failure and *shouldn't* crash the filer. However, as I already said, it's unsupported, use at your own risk.
- during the next downtime, change the cabling again back to a single path configuration
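The two things I would verify before actually unplugging anything can be written down as a small Python pre-flight check. This is only a sketch under my own assumptions: the Disk record and the sample values are invented, you would have to fill them in by hand from what the filer reports, and passing the check does not make the procedure any more supported.

```python
from typing import List, NamedTuple, Optional

class Disk(NamedTuple):
    name: str
    shelf: int
    aggregate: Optional[str]  # None means the disk is a spare
    paths: int                # number of paths the filer sees to the disk

def online_removal_problems(disks: List[Disk], shelf: int) -> List[str]:
    """List reasons not to unplug `shelf` yet (empty list = checks passed)."""
    problems = []
    for d in disks:
        if d.shelf == shelf and d.aggregate is not None:
            # The shelf should hold nothing but spares.
            problems.append(f"{d.name} is still in aggregate {d.aggregate}")
        elif d.shelf != shelf and d.paths < 2:
            # Every remaining disk should be visible over both paths.
            problems.append(
                f"{d.name} on shelf {d.shelf} has only {d.paths} path(s)")
    return problems

# Made-up sample data for illustration only.
sample = [
    Disk("0a.16", 1, "aggr0", 2),
    Disk("0a.64", 4, None, 2),
    Disk("0a.65", 4, "aggr_old", 2),
    Disk("0a.80", 5, "aggr2", 1),
]
for problem in online_removal_problems(sample, 4):
    print(problem)
```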
2009-01-26 04:09 AM
Thanks for all the help and suggestions. In answer to your biggest concern, this is happening during a data center shutdown. All hosts connecting to this appliance will be down and I will be physically moving this appliance to new racks inside a refrigerated cabinet (anyone have experience with these?) so it will be down as well.