FAS and V-Series Storage Systems Discussions

MetroCluster - adding existing shelves to ATTO stack

Hello, all!

We have a Stretch MetroCluster with both mirrored and non-mirrored aggregates at the moment (in transition). All aggregates are production. During the next downtime the plan is to reconfigure the system so that all aggregates are mirrored, by introducing the non-mirrored SAS shelves into the ATTO stacks on both nodes.

Since this is not a hot-add scenario, this should be relatively easy to accomplish. But my experience in this matter is limited and it's worth running a couple of points by the community for confirmation/comments:

----

1. When adding an existing non-mirrored SAS shelf to an existing ATTO stack on the same node, with plans for that shelf to remain in pool0 (local disks): other than the automatic change of disk paths, will anything else change? Aggregate properties, data volumes, disk ownership and pool assignment should remain identical, so the aggregate will be readily accessible, correct?

2. Same game plan as 1, but this time with plans for the SAS shelf to become a part of pool1 (remote disks for partner node). Other than going into Maintenance mode and reassigning disks to pool1, is there anything else that needs attention?

----

3. When adding an existing non-mirrored SAS shelf to an existing ATTO stack, but this time one residing on the other node, with plans for that shelf to remain in pool0 (local disks): other than changing disk ownership, is there anything else to keep in mind?

4. Similarly, same game plan as 3, with plans for the SAS shelf to become a part of pool1 (remote disks for partner node). That would imply having to change both disk ownership and pool assignment, right?

----
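
To make the four cases easier to compare, here is a minimal Python sketch of what I expect to change in each one besides the disk paths. It only encodes my own assumptions from the questions above; the names `Move`, `expected_changes` and `SCENARIOS` are made up for illustration and are not part of any NetApp tool.

```python
from typing import NamedTuple


class Move(NamedTuple):
    same_node: bool   # True if the target ATTO stack belongs to the shelf's current owner
    target_pool: int  # 0 = local disks, 1 = remote disks (mirror side)


def expected_changes(move: Move) -> set[str]:
    """Return what is assumed to change besides the disk paths."""
    changes = set()
    if not move.same_node:
        changes.add("disk ownership")
    if move.target_pool == 1:
        changes.add("pool assignment")
    return changes


# The four numbered questions, expressed as (same_node, target_pool).
SCENARIOS = {
    1: Move(True, 0),
    2: Move(True, 1),
    3: Move(False, 0),
    4: Move(False, 1),
}

for number, move in SCENARIOS.items():
    print(number, sorted(expected_changes(move)) or ["disk paths only"])
```

If these assumptions are wrong for any of the cases, that is exactly the kind of correction I am looking for.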

Please share your thoughts!

Thanks!


23 REPLIES

Re: MetroCluster - adding existing shelves to ATTO stack

Why is this not a hot-add scenario? It was always possible to hot-add shelves.

Make sure that the option disk.auto_assign (and disk.auto_assign_shelf, if applicable to your version) is disabled, and manually assign the disks to the required controller and pool. Then re-enable the options if required.

Re: MetroCluster - adding existing shelves to ATTO stack

Hiya!

Hot-adding is never a problem, I agree. But this cannot qualify as a hot-add scenario because on each node the non-mirrored shelves are already attached the traditional way, through SAS ports, independent of the ATTO stack, which is connected through FC adapters. And Data ONTAP (DOT) 8.1.x doesn't support the hot-remove that would have to come before the hot-add.

Re: MetroCluster - adding existing shelves to ATTO stack

1. so this is just detaching the shelf from a local SAS port and attaching it to the ATTO bridge? this should even work online, by reconnecting one IOM after the other.

it's like hot-recabling a shelf: as long as one path is available, it's still up and running. the ATTO shouldn't have an issue with that. it will send a few ASUPs about path redundancy and miswired shelves, but should stay online. but you need cables which are long enough.

2. as you want to assign it to pool 1, I assume you don't want to keep the data/aggregate on this shelf. so destroy the aggr on the current node, zero the disks and remove the ownership (priv set diag, disk remove_ownership), recable the shelf and do "disk assign -p 1 -shelf x" from the partner. no need to go to maintenance mode, you can do this online.

3. and 4. the pool assignment is done when assigning the disks to the node. so 3. and 4. are basically the same, just with another pool (disk assign -p 0 / disk assign -p 1).

except if you want to keep the data on it. but I doubt it, as enabling a mirror will destroy the destination pool anyways.

two things to remember:

before beginning, do "options disk.auto_assign off" and don't forget to enable it afterwards.

after finishing, do a takeover/giveback to both sides. this will clean up some stuff internally after the whole recabling mess.
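
the order of the steps above can be written down as a small Python sketch. this is only an illustration, not a tested procedure: the function name `recabling_checklist` and its parameters are made up, the quoted commands are the ones mentioned in this reply, and steps without a quoted command are left as plain descriptions on purpose.

```python
def recabling_checklist(wipe_shelf: bool, target_pool: int = 0) -> list[str]:
    """Build the ordered step list described in the reply above.

    wipe_shelf=False models case 1 (data kept, same node, same pool).
    wipe_shelf=True models the cases where the data on the shelf is
    discarded and the disks are reassigned with an explicit pool.
    """
    # Always start by turning automatic disk assignment off.
    steps = ["options disk.auto_assign off"]
    if wipe_shelf:
        steps += [
            "destroy the aggregate on the current node",
            "zero the disks",
            "remove the ownership (priv set diag, disk remove_ownership)",
            "recable the shelf to the ATTO stack",
            f"disk assign -p {target_pool} -shelf x",
        ]
    else:
        steps.append("recable the shelf to the ATTO bridge, one IOM after the other")
    # Always finish by re-enabling auto-assign and cycling both nodes.
    steps += [
        "options disk.auto_assign on",
        "takeover/giveback to both sides",
    ]
    return steps


for step in recabling_checklist(wipe_shelf=True, target_pool=1):
    print(step)
```

the point of the sketch is just the bracketing: auto-assign goes off first and on again last, and the takeover/giveback comes at the very end.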

Re: MetroCluster - adding existing shelves to ATTO stack

1. so this is just detaching the shelf from a local SAS port and attaching it to the ATTO bridge? this should even work online,

Could you provide a link to a NetApp document that confirms it? Hot shelf relocation was never supported. From the Storage Subsystem Technical FAQ:

Nondisruptive relocation or removal of storage shelves is a process referred to as single-shelf removal (SSR) and is not currently supported by NetApp.

Re: MetroCluster - adding existing shelves to ATTO stack

I don't know if there is a document or if it is even officially supported. but I've done it and it works.

a MetroCluster is designed to lose disks/shelves in a disaster and find them again later to resume its operation.

once I even changed the shelf ID and power-cycled it without disruption (I got the ok from NetApp to do it).

same as hot removal of a shelf: it is not officially supported, but it's done anyways (not only by me, even by NetApp).

last week I physically moved a site with one node and all its shelves from one datacenter to another, nondisruptively.

that's basically a hot-removal of half of the shelves and reattaching them again.

just make sure to do takeover/giveback at the end to both sides to cleanup the shelf/disk registration.

Re: MetroCluster - adding existing shelves to ATTO stack

The downtime we have is planned for the physical relocation of more resources than just storage, so going offline is OK. But I find your input very interesting. We may just stay online, if only for some business-critical processes, if this is really feasible...

1. Correct, it's hot-shelf relocation from SAS stack to ATTO stack - on the same node. I can't say I've heard of this procedure before though it does make sense. The shelf will never actually disappear from the node's inventory... hence, there would be no reason for panic?

2. Yup, pool1 is for mirrors so all residing data can be removed. Good point, could be done online.

3. In this case, I do want to keep the data on the shelf. I just want to change the owner (node) and the stack. However, since hot-removal of a shelf is not supported because the system would panic, this would definitely have to be done offline.

4. Same as 3 (has to be done offline), except that, this being the destination aggregate, I can zero the disks and remove ownership to make it easier.

Re: MetroCluster - adding existing shelves to ATTO stack

Yes, but this applies to mirrored aggregates (shelves) only. Whether an aggregate is offlined for whatever reason, the underlying shelf is powered off, power-cycled or hot-removed, or the ATTO connections are broken, there's always that other copy... And in case of site failure, forced failover saves the day.

But the SAS shelves I'm talking about are non-mirrored. There is no other copy. And in case of a takeover, they're not accessible.

Thanks for the takeover/giveback tip, that's part of the resiliency test we plan to run anyway to make sure the setup is really redundant in all aspects.

Re: MetroCluster - adding existing shelves to ATTO stack

1.  hence, there would be no reason for panic?

yes, it shouldn't panic. recently I hot recabled some stacks on an MC to other FC ports (port after port).

so it found the same shelves on another path. imo it doesn't matter if the path to the shelf goes through an ATTO or directly over SAS.

3. In this case, I do want to keep the data on the shelf. I just want to change the owner (node) and the stack. However, since hot-removal of a shelf is not supported because the system would panic, this would definitely have to be done offline.

yes, if you want to keep the data, then you have to change the ownership in maintenance mode.

4. Same as 3 (has to be done offline) except being destination aggregate, I can zero the disks and remove ownership to make it easier.

if you zero the disks, then the system does not panic on a hot-removal, because it is not a multi disk failure then.

on a standard HA system it would panic when re-adding the same shelf again without doing a reboot first. but a MC does not panic, it sees its old shelf again just like after a site failure.

Re: MetroCluster - adding existing shelves to ATTO stack

if you zero the disks, then the system does not panic on a hot-removal, because it is not a multi disk failure then.

on a standard HA system it would panic when re-adding the same shelf again without doing a reboot first. but a MC does not panic, it sees its old shelf again just like after a site failure.

Thanks!

One more thing, you wrote: "before beginning, do 'options disk.auto_assign off' and don't forget to enable it afterwards." Why is enabling auto_assign necessary once all the disks have been assigned to the proper pools and all the shelves are mirrored?

Re: MetroCluster - adding existing shelves to ATTO stack

Why is enabling auto_assign necessary once all the disks have been assigned to proper pools and all the shelves are mirrored?

for future disk failures. otherwise you need to assign them manually each time after replacing a failed disk.

you may also set disk.auto_assign_shelf to on because of the ATTOs.

Re: MetroCluster - adding existing shelves to ATTO stack

We have a Stretch MetroCluster, so there's not far to walk even if we have to do everything by console access if push comes to shove, but I'll keep it in mind: disk assignment on shelf level seems the way to go. As for disk.auto_assign_shelf, I am not seeing it when I list out "options disk"...

netapp> options disk.
disk.asup_on_mp_loss                   on    (value might be overwritten in takeover)
disk.auto_assign                       on    (value might be overwritten in takeover)
disk.inject.verbose.enable             off   (value might be overwritten in takeover)
disk.latency_check.enable              on    (value might be overwritten in takeover)
disk.maint_center.allowed_entries      1     (value might be overwritten in takeover)
disk.maint_center.enable               on    (value might be overwritten in takeover)
disk.maint_center.max_disks            84    (value might be overwritten in takeover)
disk.maint_center.rec_allowed_entries  5     (value might be overwritten in takeover)
disk.maint_center.spares_check         on    (value might be overwritten in takeover)
disk.powercycle.enable                 on    (value might be overwritten in takeover)
disk.recovery_needed.count             5     (value might be overwritten in takeover)
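
For anyone who wants to run the same check across several nodes, here is a rough Python sketch that parses pasted "options disk." output and tests whether a given option is listed. The helper name `parse_options` and the shortened `SAMPLE` text are my own; it assumes each option line has the form "name value (comment)" as in the listing above.

```python
def parse_options(output: str) -> dict[str, str]:
    """Map option names to values from pasted 'options disk.' output."""
    options = {}
    for line in output.splitlines():
        parts = line.split()
        # Option lines look like: <name> <value> (value might be ...)
        # The prompt line and blank lines do not start with "disk." and are skipped.
        if len(parts) >= 2 and parts[0].startswith("disk."):
            options[parts[0]] = parts[1]
    return options


# Shortened copy of the listing above, used as sample input.
SAMPLE = """\
netapp> options disk.
disk.asup_on_mp_loss         on         (value might be overwritten in takeover)
disk.auto_assign             on         (value might be overwritten in takeover)
disk.maint_center.max_disks  84         (value might be overwritten in takeover)
"""

opts = parse_options(SAMPLE)
print("disk.auto_assign" in opts)        # True
print("disk.auto_assign_shelf" in opts)  # False
```

On my system this confirms what the listing shows: disk.auto_assign is there, disk.auto_assign_shelf is not.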


Re: MetroCluster - adding existing shelves to ATTO stack

disk.auto_assign_shelf is pretty new. I think it came with Data ONTAP 8.2.

Re: MetroCluster - adding existing shelves to ATTO stack

Ah. Never mind then...

Re: MetroCluster - adding existing shelves to ATTO stack

One more thing guys...!

What would happen if the system is online and I add some shelves, one of which used to hold the root aggregate (and root volume) for its previous owner? Would there be any issues or would the system manage the situation?

Re: MetroCluster - adding existing shelves to ATTO stack

if you do it online, then it will zero the disks anyway before adding to/creating a new aggregate.

when changing disk ownership offline (in maintenance mode), I don't know what will happen. it may find two volumes/aggregates with the root flag and maybe won't boot anymore, or may even boot from the wrong root.

if that happens, then you would need to manually pick the root volume on the boot menu (vol_pick_root <volname> <aggr-name>).
