MetroCluster - adding existing shelves to ATTO stack

Hello, all!

We have a Stretch MetroCluster with both mirrored and non-mirrored aggregates at the moment (in transition). All aggregates are production. During the next downtime the plan is to reconfigure the system so that all aggregates are mirrored, by introducing the non-mirrored SAS shelves into the ATTO stacks on both nodes.

Since this is not a hot-add scenario, this should be relatively easy to accomplish. But my experience in this matter is limited and it's worth running a couple of points by the community for confirmation/comments:

----

1. When adding an existing non-mirrored SAS shelf to an existing ATTO stack residing on the same node, with plans for that shelf to remain in pool0 (local disks), will anything change other than the automatic change of disk paths? Aggregate properties, data volumes, disk ownership and pool assignment should remain identical, so the aggregate will be readily accessible, correct?

2. Same game plan as 1, but this time with plans for the SAS shelf to become a part of pool1 (remote disks for partner node). Other than going into Maintenance mode and reassigning disks to pool1, is there anything else that needs attention?

----

3. When adding an existing non-mirrored SAS shelf to an existing ATTO stack, but this time one residing on the other node, with plans for that shelf to remain in pool0 (local disks): other than changing disk ownership, is there anything else to keep in mind?

4. Similarly, same game plan as 3, with plans for the SAS shelf to become a part of pool1 (remote disks for partner node). That would imply having to change both disk ownership and pool assignment, right?

----

Please share your thoughts!

Thanks!


Re: MetroCluster - adding existing shelves to ATTO stack

Why is it not a hot-add scenario? It was always possible to hot-add shelves.

Make sure that the option disk.auto_assign (and disk.auto_assign_shelf, if applicable to your version) is disabled, and manually assign the disks to the required controller and pool. Then re-enable the options if required.

Re: MetroCluster - adding existing shelves to ATTO stack

Hiya!

Hot-adding is never a problem, I agree. But this cannot qualify as a hot-add scenario because, on each node, the non-mirrored shelves are already attached the traditional way, through SAS ports, independent of the ATTO stack, which is connected through FC adapters. And DOT 8.1.x doesn't support hot-remove (prior to hot-add).

Re: MetroCluster - adding existing shelves to ATTO stack

1. so this is just detaching the shelf from a local SAS port and attaching it to the ATTO bridge? this should even work online, by reconnecting one IOM after the other.

it's like hot recabling a shelf: as long as one path is available, it's still up and running. the ATTO shouldn't have an issue with that. it will send a few ASUPs about path redundancy and miswired shelves, but should stay online. but you need cables which are long enough

2. as you want to assign it to pool 1, I assume you don't want to keep the data/aggregate on this shelf. so destroy the aggr on the current node, zero the disks and remove the ownership (priv set diag, disk remove_ownership), recable the shelf and do "disk assign -p 1 -shelf x" from the partner. no need to go to maintenance mode, you can do this online.

3. and 4. the pool assignment is done when assigning the disks to the node. so 3. and 4. are basically the same, just with another pool (disk assign -p 0 / disk assign -p 1).

except if you want to keep the data on it. but I doubt it, as enabling a mirror will destroy the destination pool anyways.

two things to remember:

before beginning, do "options disk.auto_assign off" and don't forget to enable it afterwards.

after finishing, do a takeover/giveback to both sides. this will clean up some stuff internally after the whole recabling mess.
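
For reference, the pool 1 sequence described in this reply can be written down as an ordered checklist. This is only a rough sketch of the advice given here, not a NetApp-documented procedure: the function name `pool1_checklist` is made up for illustration, the quoted commands are copied from the post as written, the shelf ID stays a placeholder, and steps for which no command was given are left as plain descriptions.

```python
# Sketch only: ordered checklist for moving a shelf into pool 1, as
# described in the reply above. Command strings are copied from the post;
# everything in parentheses is a free-text step for which the post gave
# no command. The function name and structure are hypothetical.

def pool1_checklist(shelf_id="x"):
    """Return the steps in the order suggested in the post."""
    return [
        "options disk.auto_assign off",
        "(current owner) destroy the aggregate on the shelf and zero its disks",
        "priv set diag",
        "disk remove_ownership",
        "(hardware) recable the shelf from the SAS port to the ATTO stack",
        f"(partner) disk assign -p 1 -shelf {shelf_id}",
        "(re-enable disk.auto_assign if required)",
        "(takeover/giveback to both sides)",
    ]


if __name__ == "__main__":
    # Print the checklist as a numbered list for a change ticket.
    for number, step in enumerate(pool1_checklist(), start=1):
        print(f"{number}. {step}")
```

The argument form for `disk remove_ownership` is not spelled out in the post, so it is left bare here; check the syntax for your Data ONTAP version before running anything.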

Re: MetroCluster - adding existing shelves to ATTO stack

1. so this is just detaching the shelf from a local SAS port and attaching it to the ATTO bridge? this should even work online,

Could you provide a link to a NetApp document that confirms it? Hot shelf relocation was never supported. From the Storage Subsystem Technical FAQ:

Nondisruptive relocation or removal of storage shelves is a process referred to as single-shelf removal (SSR) and is not currently supported by NetApp.

Re: MetroCluster - adding existing shelves to ATTO stack

I don't know if there is a document or if it is even officially supported. but I've done it and it works.

a MetroCluster is designed to lose disks/shelves in a disaster and find them again later to resume its operation.

once I even changed the shelf ID and powercycled it without disruption (I've got the ok from NetApp to do it).

same as hot removal of a shelf: it is not officially supported, but it's done anyways (not only by me, even by NetApp).

last week I physically moved a site with one node and all shelves from one datacenter to another, nondisruptively.

that's basically a hot-removal of half of the shelves and reattaching them again.

just make sure to do takeover/giveback at the end to both sides to cleanup the shelf/disk registration.

Re: MetroCluster - adding existing shelves to ATTO stack

The downtime we have is planned for the physical relocation of more resources than just storage, so going offline is OK. But I find your input very interesting; we just may stay online, if nothing else, then for some business-critical processes, if this is really feasible...

1. Correct, it's hot-shelf relocation from SAS stack to ATTO stack - on the same node. I can't say I've heard of this procedure before though it does make sense. The shelf will never actually disappear from the node's inventory... hence, there would be no reason for panic?

2. Yup, pool1 is for mirrors so all residing data can be removed. Good point, could be done online.

3. In this case, I do want to keep data on the shelf. I just want to change the owner (node) and stack. However, since hot-removal of a shelf is not supported because the system would panic, this would definitely have to be done offline.

4. Same as 3 (has to be done offline), except that, this being the destination aggregate, I can zero the disks and remove ownership to make it easier.

Re: MetroCluster - adding existing shelves to ATTO stack

Yes, but this applies only to mirrored aggregates (shelves). Whether an aggregate is offlined for whatever reason, the underlying shelf is powered off, power-cycled or hot-removed, or the ATTO connections are broken, there's always that other copy... And in case of site failure, forced failover saves the day.

But the SAS shelves I'm talking about are non-mirrored. There is no other copy. And in case of a takeover, they're not accessible.

Thanks for the takeover/giveback tip, that's part of the resiliency test we plan to run anyway to make sure it's really redundant in all aspects.

Re: MetroCluster - adding existing shelves to ATTO stack

1.  hence, there would be no reason for panic?

yes, it shouldn't panic. recently I hot recabled some stacks on an MC to other FC ports (port after port).

so it found the same shelves on another path. imo it doesn't matter if the path to the shelf goes through an ATTO or directly over SAS.

3. In this case, I do want to keep data on the shelf. I just want to change the owner (node) and stack. However, since hot-removal of a shelf is not supported because the system would panic, this would definitely have to be done offline.

yes, if you want to keep the data, then you have to change the ownership in maintenance mode.

4. Same as 3 (has to be done offline) except being destination aggregate, I can zero the disks and remove ownership to make it easier.

if you zero the disks, then the system does not panic on a hot-removal, because it is not a multi disk failure then.

on a standard HA system it would panic when re-adding the same shelf again without doing a reboot first. but a MC does not panic, it sees its old shelf again just like after a site failure.
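
The answers to the four scenarios so far boil down to two questions: is the data kept, and does the owning node change? A tiny sketch of that decision, purely as a summary of what has been said in this thread (the function name `recabling_mode` and the wording of the results are made up here, and the "online" paths are the unofficial ones discussed above, not a supported NetApp procedure):

```python
# Sketch only: summary of the thread's advice for scenarios 1-4.
# Function name and return strings are hypothetical.

def recabling_mode(keep_data, change_owner):
    """Return how the thread suggests handling the shelf move."""
    if keep_data and change_owner:
        # Scenario 3: data stays, owner changes.
        return "maintenance mode"
    if keep_data:
        # Scenario 1: same node, same pool, recable one IOM after the other.
        return "online, one path at a time"
    # Scenarios 2 and 4: the shelf becomes a mirror destination (pool 1).
    return "online, after zeroing the disks and removing ownership"


if __name__ == "__main__":
    for keep, owner in [(True, False), (False, False), (True, True), (False, True)]:
        print(f"keep_data={keep}, change_owner={owner}: {recabling_mode(keep, owner)}")
```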

Re: MetroCluster - adding existing shelves to ATTO stack

if you zero the disks, then the system does not panic on a hot-removal, because it is not a multi disk failure then.

on a standard HA system it would panic when re-adding the same shelf again without doing a reboot first. but a MC does not panic, it sees its old shelf again just like after a site failure.

Thanks!

One more thing, you wrote: "before beginning, do 'options disk.auto_assign off' and don't forget to enable it afterwards." Why is enabling auto_assign necessary once all the disks have been assigned to the proper pools and all the shelves are mirrored?