ONTAP Hardware

MetroCluster - adding existing shelves to ATTO stack

IGORSTOJNOV

Hello, all!

We have a Stretch MetroCluster with both mirrored and non-mirrored aggregates at the moment (in transition). All aggregates are production. During the next downtime the plan is to reconfigure the system so that all aggregates are mirrored, by introducing the non-mirrored SAS shelves into the ATTO stacks on both nodes.

Since this is not a hot-add scenario, this should be relatively easy to accomplish. But my experience in this matter is limited and it's worth running a couple of points by the community for confirmation/comments:

----

1. When adding an existing non-mirrored SAS shelf to an existing ATTO stack residing on the same node, with plans for that shelf to remain in pool0 (local disks) - other than the automatic change of disk paths, will anything else change? Aggregate properties, data volumes, disk ownership and pool assignment should remain identical, meaning the aggregate will be readily accessible, correct?

2. Same game plan as 1, but this time with plans for the SAS shelf to become a part of pool1 (remote disks for partner node). Other than going into Maintenance mode and reassigning disks to pool1, is there anything else that needs attention?

----

3. When adding an existing non-mirrored SAS shelf to an existing ATTO stack, but this time one residing on a different node, with plans for that shelf to remain in pool0 (local disks) - other than changing disk ownership, is there anything else to keep in mind?

4. Similarly, same game plan as 3, with plans for the SAS shelf to become a part of pool1 (remote disks for partner node). That would imply having to change both disk ownership and pool assignment, right?

----

Please share your thoughts!

Thanks!



aborzenkov

Why is it not a hot-add scenario? It was always possible to hot-add shelves.

Make sure that the option disk.auto_assign (and disk.auto_assign_shelf, if applicable to your version) is disabled, then manually assign the disks to the required controller and pool. Re-enable the options afterwards if required.
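For example, roughly like this in 7-Mode syntax (the disk list, pool ID and node name are placeholders):

     options disk.auto_assign off
     disk assign <disk_list> -p <pool_id> -o <nodename>
     options disk.auto_assign on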

IGORSTOJNOV

Hiya!

Hot-adding is never a problem, I agree. But this cannot qualify as a hot-add scenario because, for each node, the non-mirrored shelves are already attached traditionally via SAS ports, independent of the ATTO stack, which is connected through FC adapters. And Data ONTAP 8.1.x doesn't support hot-remove (prior to the hot-add).

DOMINIC_WYSS

1. so this is just detaching the shelf from a local SAS port and attaching it to the ATTO bridge? this should even work online, by reconnecting one IOM after the other.

it's like hot recabling a shelf: as long as one path is available, it's still up and running. the ATTO shouldn't have an issue with that. it will send a few ASUPs about path redundancy and miswired shelves, but should stay online. but you need cables which are long enough.

2. as you want to assign it to pool 1, I assume you don't want to keep the data/aggregate on this shelf. so destroy the aggr on the current node, zero the disks and remove the ownership (priv set diag, disk remove_ownership), recable the shelf and do "disk assign -p 1 -shelf x" from the partner. no need to go to maintenance mode, you can do this online.

3. and 4. the pool assignment is done when assigning the disks to the node. so 3. and 4. are basically the same, just with another pool (disk assign -p 0 / disk assign -p 1).

except if you want to keep the data on it. but I doubt it, as enabling a mirror will destroy the destination pool anyways.
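for point 2, the rough command sequence would be something like this (aggregate name, disk list and shelf name are placeholders, and the aggregate's volumes have to be offlined/destroyed first - double-check on your system):

     aggr offline <old_aggr>
     aggr destroy <old_aggr>
     disk zero spares
     priv set diag
     disk remove_ownership <disk_list>
     priv set admin

then recable the shelf into the ATTO stack and, from the partner node:

     disk assign -p 1 -shelf <shelf_name>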

two things to remember:

before beginning, do "options disk.auto_assign off" and don't forget to enable it afterwards.

after finishing, do a takeover/giveback to both sides. this will clean up some stuff internally after the whole recabling mess.
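in commands, that last step is just the standard pair, once from each node (wait for the giveback to settle in between):

     cf takeover
     cf giveback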

aborzenkov

1. so this is just detaching the shelf from a local SAS port and attaching it to the ATTO bridge? this should even work online,

Could you provide a link to a NetApp document that confirms this? Hot shelf relocation was never supported. From the Storage Subsystem Technical FAQ:

Nondisruptive relocation or removal of storage shelves is a process referred to as single-shelf removal (SSR) and is not currently supported by NetApp.

DOMINIC_WYSS

I don't know if there is a document or if it is even officially supported. but I've done it and it works.

a MetroCluster is designed to lose disks/shelves in a disaster and find them again later to resume its operation.

once I even changed the shelf ID and power-cycled it without disruption (I got the OK from NetApp to do it).

same as hot removal of a shelf: it is not officially supported, but it's done anyways (not only by me, even by NetApp).

last week I physically moved a site with one node and all shelves from one datacenter to another, nondisruptively.

that's basically a hot-removal of half of the shelves and reattaching them again.

just make sure to do a takeover/giveback at the end to both sides to clean up the shelf/disk registration.

IGORSTOJNOV

Yes, but this only applies to mirrored aggregates (shelves). Whether an aggregate is offlined for whatever reason, the underlying shelf powered off, power-cycled or hot-removed, or the ATTO connections broken - there's always that other copy... And in case of a site failure, forced failover saves the day.

But the SAS shelves I'm talking about are non-mirrored. There is no other copy. And in case of a takeover, they're not accessible.

Thanks for the takeover/giveback tip - that's part of the resiliency testing we plan to do anyway, to make sure it's really redundant in all aspects.

IGORSTOJNOV

The downtime we have is planned for the physical relocation of more resources than just storage, so going offline is OK. But I find your input very interesting - we just may stay online, if nothing else than for some business-critical processes, if this is really feasible...

1. Correct, it's hot-shelf relocation from SAS stack to ATTO stack - on the same node. I can't say I've heard of this procedure before though it does make sense. The shelf will never actually disappear from the node's inventory... hence, there would be no reason for panic?

2. Yup, pool1 is for mirrors so all residing data can be removed. Good point, could be done online.

3. In this case, I do want to keep the data on the shelf. I just want to change the owner (node) and stack. However, since hot-removal of a shelf is not supported because the system would panic, this would definitely have to be done offline.

4. Same as 3 (has to be done offline), except that, since this shelf will hold a destination aggregate, I can zero the disks and remove ownership to make it easier. Either way, I'd verify the result afterwards, as sketched below.
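In any case, after each move I'd double-check what the node sees with something along these lines (7-Mode):

     storage show disk -p
     disk show -v
     aggr status -r

i.e. disk paths, ownership/pool assignment and aggregate/RAID state.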

DOMINIC_WYSS

1.  hence, there would be no reason for panic?

yes, it shouldn't panic. recently I hot recabled some stacks on an MC to other FC ports (port after port).

so it found the same shelves on another path. imo it doesn't matter if the path to the shelf goes through an ATTO or directly over SAS.

3. In this case, I do want to keep the data on the shelf. I just want to change the owner (node) and stack. However, since hot-removal of a shelf is not supported because the system would panic, this would definitely have to be done offline.

yes, if you want to keep the data, then you have to change the ownership in maintenance mode.
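from maintenance mode that would be roughly (system IDs are placeholders):

     disk reassign -s <old_sysid> -d <new_sysid>

disk assign also works there if you only want to move a few disks.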

4. Same as 3 (has to be done offline), except that, since this shelf will hold a destination aggregate, I can zero the disks and remove ownership to make it easier.

if you zero the disks, then the system does not panic on a hot-removal, because it is not a multi-disk failure then.

on a standard HA system it would panic when re-adding the same shelf again without doing a reboot first. but a MC does not panic, it sees its old shelf again just like after a site failure.

IGORSTOJNOV

if you zero the disks, then the system does not panic on a hot-removal, because it is not a multi-disk failure then.

on a standard HA system it would panic when re-adding the same shelf again without doing a reboot first. but a MC does not panic, it sees its old shelf again just like after a site failure.

Thanks!

One more thing, you wrote: "before beginning, do "options disk.auto_assign off" and don't forget to enable it afterwards." Why is enabling auto_assign necessary once all the disks have been assigned to proper pools and all the shelves are mirrored?

DOMINIC_WYSS

Why is enabling auto_assign necessary once all the disks have been assigned to proper pools and all the shelves are mirrored?

for future disk failures. otherwise you need to assign them manually each time after replacing a failed disk.

you may also set disk.auto_assign_shelf to on because of the ATTOs.
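i.e. once everything is in the right pool:

     options disk.auto_assign on
     options disk.auto_assign_shelf on

(the second one only if your ONTAP version has it.)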

IGORSTOJNOV

We have a Stretch MetroCluster so there's not far to walk, even if we have to do everything via console access if push comes to shove, but I'll keep it in mind - disk assignment at shelf level seems the way to go. But as for disk.auto_assign_shelf, I am not seeing it when I list out "options disk"...

netapp> options disk.

disk.asup_on_mp_loss         on         (value might be overwritten in takeover)

disk.auto_assign             on         (value might be overwritten in takeover)

disk.inject.verbose.enable   off        (value might be overwritten in takeover)

disk.latency_check.enable    on         (value might be overwritten in takeover)

disk.maint_center.allowed_entries 1          (value might be overwritten in takeover)

disk.maint_center.enable     on         (value might be overwritten in takeover)

disk.maint_center.max_disks  84         (value might be overwritten in takeover)

disk.maint_center.rec_allowed_entries 5          (value might be overwritten in takeover)

disk.maint_center.spares_check on         (value might be overwritten in takeover)

disk.powercycle.enable       on         (value might be overwritten in takeover)

disk.recovery_needed.count   5          (value might be overwritten in takeover)


DOMINIC_WYSS

disk.auto_assign_shelf is pretty new. I think it came with Data ONTAP 8.2.

IGORSTOJNOV

Ah. Never mind then...

IGORSTOJNOV

One more thing guys...!

What would happen if the system is online and I add some shelves, one of which used to hold the root aggregate (and root volume) for its previous owner? Would there be any issues, or would the system manage the situation?

DOMINIC_WYSS

if you do it online, then it will zero the disks anyway before adding to/creating a new aggregate.

when changing disk ownership offline (in maintenance mode), I don't know what will happen. it may find two volumes/aggregates with the root flag and maybe won't boot anymore, or will boot from the wrong root.

if that happens, then you would need to manually pick the root volume on the boot menu (vol_pick_root <volname> <aggr-name>).
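for example, if the intended root is vol0 on aggr0 (names are just an example):

     vol_pick_root vol0 aggr0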

IGORSTOJNOV

I was actually thinking about shelves with data I'd like to keep. I added them online, changed ownership and the system automatically offlined the aggregate - as a precaution, I suppose. I just onlined it and got rid of the root volume. The aggregate status (diskroot) was changed to simply online as soon as I did a takeover/giveback routine.

Thanks for the vol_pick_root tip...!

DOMINIC_WYSS

how did you change the ownership online? the disk reassign command only works in maintenance mode...

IGORSTOJNOV

- For shelves coming in from secondary HA pair:

     disk assign <disk list> -s unowned -f

     disk assign <same disk list> -p <pool_id> -s <system_id> -o <nodename>

- For shelf reshuffling within the primary HA pair (being stretched out into the MC configuration), switching ownership between the nodes online is indeed not possible. So before shutting down, I removed ownership for the shelves to be migrated, rewired the nodes/shelves, and then added ownership as needed.

Some notes from the reconfiguration process, in case someone finds them useful:

A) make sure to have all the VM settings (Datastores, disk (scsi) layout: VMDK settings, RDM settings; network settings) noted down

B) when adding a shelf carrying data (pool0), the system will recognize the underlying aggregate as "foreign" and will keep it offline for starters. As soon as you online it, all the LUNs on that aggregate will be preemptively offlined to avoid any LUN ID collisions. In effect, they will lose their LUN IDs and mappings. Hence, step A.

C) for aggregate mirroring, the node that owns the aggregate (in pool0) must also own the disks for the aggregate mirror (partner side, pool1) - see the sketch after these notes

D) aggregate level-0 syncing takes some time - in this case, for 6 mirrored aggregates (24x600GB shelves) it took 36h on 4Gbit connections to complete
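For C and D, the mirroring itself is just the standard per-aggregate command, roughly as follows (aggregate name is a placeholder; ONTAP picks matching pool1 spares, or you can pass an explicit disk list with -d):

     aggr mirror <aggr_name>
     aggr status <aggr_name> -r

The second command shows both plexes and their state (mirrored, resyncing, etc.).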

DOMINIC_WYSS

ah, changing to unowned first. thanks for that.

IGORSTOJNOV

Hello guys,

One last question, post-MC transition...

We have the typical pool0/pool1 disk assignment in place, all plexes are fine and doing well... One node has an additional non-mirrored shelf which somehow ended up in that node's pool1. By the default rules it should be in pool0, but I'm wondering whether I should correct this or not.

I mean, if that node takes over the partner node, the shelf is still locally attached, so it should be accessible to it. And if it is taken over by the partner node, this shelf will, of course, be inaccessible, which is OK for us. Does this sound about right?

On the other hand, if I go for pool reassignment, how should I go about this?

