2013-05-13 01:19 PM
Ok so both filers need to be down for the duration of the move.
We thought about splitting up the mirror at first, but then found another reason to keep it.
Would this be as simple as stopping services on both, powering them down, and then bringing them up together, or one at a time?
2013-05-13 01:25 PM
cf disable (just to keep them from trying to take over)
halt each head
power everything off
power on disk shelves, wait for initialization
power on heads (can be done at the same time, or one at a time)
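The ordering in the steps above can be written down as a small checklist, for example in Python. This is only a sketch: the step descriptions are taken from the list, while the names `FULL_SHUTDOWN_STEPS` and `check_order` are made up for illustration, and nothing here talks to a filer.

```python
# Full-shutdown runbook from the steps above, in execution order.
# The strings are descriptions for a human operator, not commands sent anywhere.
FULL_SHUTDOWN_STEPS = [
    "cf disable (so neither head attempts a takeover)",
    "halt head 1",
    "halt head 2",
    "power off heads and disk shelves",
    "power on disk shelves and wait for initialization",
    "power on heads (together or one at a time)",
]


def check_order(steps):
    """Return True if the ordering constraints from the list hold."""
    def index_of(prefix):
        # Position of the first step starting with the given text.
        return next(i for i, s in enumerate(steps) if s.startswith(prefix))

    cf_disabled_first = index_of("cf disable") < index_of("halt head 1")
    shelves_before_heads = (
        index_of("power on disk shelves") < index_of("power on heads")
    )
    return cf_disabled_first and shelves_before_heads


if __name__ == "__main__":
    for number, step in enumerate(FULL_SHUTDOWN_STEPS, start=1):
        print(f"{number}. {step}")
    print("order ok:", check_order(FULL_SHUTDOWN_STEPS))
```

The two constraints checked are the ones the list implies: failover is disabled before any head is halted, and the shelves are powered on before the heads.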
2013-05-13 01:28 PM
The funny thing in all of this is that we have 2 vendors giving 2 different opinions on how to do this. One says the cf takeover should be fine, and the other tells us to simply break the plex mirrors.
2013-05-13 01:33 PM
Vendors can sometimes be scary...
The guy saying just do a cf takeover is up in the night, or doesn't understand what you're trying to do.
Breaking the plexes IS an option, but (as I've said) it still incurs downtime and consistency issues, and you must be able to split the aggrs onto discrete shelves.
2013-05-13 09:11 PM
If you are adventurous, you can let your vendors do it. Make sure that you have a backup, and that your vendors will be responsible for compensating any loss in case of downtime. Maybe they know some secret tricks.
2013-05-14 10:35 PM
If you have a 100% mirrored Stretch MetroCluster, you can take over to node 1, then move node 2 together with the node 2 pool 0 and node 1 pool 1 shelves to the new datacenter. Hot-unplugging shelves is supported on the assumption that the same type of shelf reappears on the exact same HBA; you just need to offline the plex and offline the adapter (see the shelf guide under hot shelf replacement).
When node 2 is up again, you need to resync the aggregates, and then you can give back (be sure the resync is 100% complete before the giveback!). Then you can take over to node 2 and move node 1, together with the node 1 pool 0 and node 2 pool 1 shelves, to the new location.
I (storage consultant at a NetApp partner) have done this procedure on a few occasions. As I said, it must be a fully mirrored MetroCluster, and you need to be 100% sure to label everything properly so the right cables go back on the right adapters. After moving both nodes, I suggest doing yet another takeover/giveback to be on the safe side and to be sure everything works fine.
If you deem this procedure too risky, your best option would be to prepare power, Fibre Channel and network cables as well as rack mount kits at the new location beforehand to save time. Then shut down the complete system, move it over, recable and boot up again.
As you have never done either procedure before, I suggest you have an experienced NetApp consultant onsite to support you if anything goes wrong.
2013-05-24 12:44 PM
Our NetApp rep has sent me TR-3548 and said page 50 documents my scenario.
They advised us to do a cf forcetakeover -d, as per the TR:
13.2 Split-Brain Scenario
The cf forcetakeover -d command previously described allows the surviving site to take over the failed site’s responsibilities without a quorum of disks available at the failed site (normally required).
Once the problem at the failed site is resolved, the administrator must restrict booting of the previously failed node. If access is not restricted, a split-brain scenario might occur. This is the result of the controller at the failed site coming back up and not knowing that there is a takeover situation. It begins servicing data requests while the remote site also continues to serve requests. The result is the possibility of data corruption.
We don't want to shut down both nodes, but if we must, we will.
2013-05-24 12:53 PM
I would NOT recommend going for a cf forcetakeover -d, as it is not needed at all. Forcetakeover -d is only used in case of a real disaster, when you more or less expect the partner not to come back anytime soon.
cf takeover on node1 so it takes over node2
power off node 2
offline pool 0 from node 2
power off pool 0 from node 2
offline pool 1 from node 1
power off pool 1 from node 1
power on pool 0 from node 2
resync pool 0 from node 2
power on pool 1 from node 1
resync pool 1 from node 1
power up node 2
cf giveback on node 1 to give back node 2
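The key safety rule in this procedure, stated earlier in the thread, is that every plex must be fully resynced before the giveback. A minimal Python sketch of that rule follows. The `Plex` class, its state names, and `safe_to_giveback` are hypothetical and only model the checklist; they do not query a real system.

```python
from dataclasses import dataclass


@dataclass
class Plex:
    """Hypothetical model of one mirrored plex during the move."""
    name: str
    state: str = "online"  # one of: online, offline, resyncing

    def offline(self):
        self.state = "offline"

    def start_resync(self):
        self.state = "resyncing"

    def finish_resync(self):
        self.state = "online"


def safe_to_giveback(plexes):
    """Giveback is allowed only when every plex is back online (fully resynced)."""
    return all(p.state == "online" for p in plexes)


if __name__ == "__main__":
    moved = [Plex("node 2 pool 0"), Plex("node 1 pool 1")]
    for plex in moved:
        plex.offline()        # taken offline before the shelves are powered off
    for plex in moved:
        plex.start_resync()   # shelves powered on again at the new site
    print(safe_to_giveback(moved))  # False: resync still running
    for plex in moved:
        plex.finish_resync()
    print(safe_to_giveback(moved))  # True: all plexes resynced
```

On a real system the resync state would have to be read from the filers themselves before each giveback; the sketch only shows where that gate sits in the sequence.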