ONTAP Discussions

8.1RC2 7-Mode to CDOT 8.3rc1, one step?

Chaos
9,343 Views

I have the KB article that discusses moving from 8.1.x 7-Mode to 8.2.y CDOT. But now, 8.3RC1 is available.

 

Can I go directly from 8.1RC2 7-Mode to CDOT 8.3rc1?

 

We have lots and lots of small filers. Disk slicing ("root-data partitioning") is a killer feature for us. (And yes, I know that both of these transitions are disk-wipe events.)


15 REPLIES

JSHACHER11
9,290 Views

8.3 isn't included in the matrix yet, but the datasheet does mention 8.1.1 to cDOT 8.3 in one step (not from your version, though).

 

https://fieldportal.netapp.com/7mtt.aspx#276797

 

The latest version of the tool (1.4) doesn't support 8.3, but I'm sure 7MTT 2.0 will be (or was) announced at Insight in Berlin this week.

aborzenkov
9,273 Views
Are your 7-Mode volumes 32-bit or 64-bit?

Chaos
9,261 Views

I don't need to preserve any of the data. (Though to answer the question: all 64-bit.)

shatfield
9,298 Views

You'll have to hop: 8.1 -> 8.1.4, then 8.1.4 -> 8.2.1+. Then do the mode conversion/wipe, and finally 8.2.1+ -> 8.3RC1.
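If you do the 7-Mode hops in place, a rough sketch with the standard software update command would look something like this (the hostname, web server, and image file names are made up, and the exact patch levels may differ):

## hop 1: 8.1RC2 -> 8.1.4 (downloads, installs, and reboots by default)
filer> software update http://webserver/814_q_image.tgz
filer> version

## hop 2: 8.1.4 -> 8.2.1+ (again, check the version after the reboot)
filer> software update http://webserver/821_q_image.tgz
filer> version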

 

Just make sure you get your 8.2+ feature keys and your cluster base key before you start.

 

 

aborzenkov
9,225 Views

I don't need to preserve any of the data.


Then I do not understand the question. You will be wiping the filers clean anyway, so how does it matter which version was there originally?

shatfield
9,208 Views
There are version checks in the install scripts, so even if you go the destructive route and run it all from option 7, some of those hops are probably going to be enforced. I put the mode switch after 8.2.1 because cDOT 8.1.x uses a whole other set of license keys you don't need. You will still have to wipe after getting to 8.3 for partitioning to kick in.
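On the key side: the cluster base key is asked for by the cluster setup wizard, and the 8.2-style feature keys go in with the standard license commands once the cluster is up. A quick sketch (the key string below is just a placeholder):

## add each feature key (8.2 and later use 28-character keys)
MYCLUSTER::> system license add -license-code AAAAAAAAAAAAAAAAAAAAAAAAAAAA

## confirm what is installed
MYCLUSTER::> system license show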

aborzenkov
9,204 Views

There are version checks in the install scripts, so even if you go the destructive route and run it all from option 7, some of those hops are probably going to be enforced.

Now I'm really curious. What could be checked here? The only dependency could be on firmware (LOADER); I had a case where a FAS3170 could not netboot 8.2.1. But firmware can be updated from practically any version of Data ONTAP, and it is applied before Data ONTAP loads anyway.

 

When you netboot and go into option 7, you are running exactly the same version that you are about to install, so there can be no incompatibility there. The next step is option 4, where you wipe the disks clean, so there is nothing left to check. Could you explain what you mean? Not arguing, just trying to learn something new 🙂

Chaos
9,201 Views

Aha, thanks for explaining the rationale behind the hops. I was wondering, because I couldn't derive those requirements from the release notes alone.

 

shatfield
9,186 Views
Netboot should bypass it. The install script in the tgz can read the existing version from the boot device.
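For reference, the netboot sequence from the LOADER prompt is roughly the following (addresses and URL are placeholders; check the 8.3 netboot instructions for your platform and management interface):

LOADER> ifconfig e0M -addr=192.168.1.50 -mask=255.255.255.0 -gw=192.168.1.1
LOADER> netboot http://webserver/83RC1/netboot/kernel
## at the boot menu, option 7 installs new software to the boot device,
## and option 4 wipes the disks and initializes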

dankirkwood
13,902 Views

If you don't know the procedure yet for getting Root Disk Partitioning up on a filer that you are *wiping*, here it is:

 

(This assumes a new install and no data/config that you want to keep. A rough maintenance-mode command sketch for steps 4-7 follows the list.)

 

1. Make sure that disk auto-assign is enabled. (I don't know how to do this from maint mode, but it should hopefully be on by default.)

2. Halt both controllers

3. Boot into Maintenance Mode on one controller

4. Remove all the aggregates from all of the disks on the internal shelf.

5. If you had previously tried disk partitioning on these disks, remove the partitions from maint mode too (there is a new "disk unpartition" command).

6. Remove ownership from ALL of the disks in the internal shelf.

7. Halt and reboot the first node.

8. Access the boot menu (Ctrl+C) and select Option 4.

9. When the node reboots and starts zeroing disks, it will create partitions on the internal shelves and zero them. Half of the disks will be assigned to each of the nodes.

10. As soon as the first node has started zeroing its disks, you can boot the second node and select Option 4.
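Here is the rough maintenance-mode sketch for steps 4-7 mentioned above (the aggregate and disk names are just examples; double-check the commands against your release before running them):

## step 4: list and destroy the existing aggregates on the internal shelf
*> aggr status
*> aggr offline aggr0
*> aggr destroy aggr0

## step 5: remove any leftover partitions (new command in 8.3)
*> disk unpartition 0a.00.1

## step 6: release ownership of every disk in the internal shelf
*> disk remove_ownership 0a.00.1
## ...repeat for each internal disk...

## step 7: halt, then reboot and press Ctrl+C for the boot menu
*> halt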

 

After zeroing, you should get, on each node, something like:

 

Nov 25 13:20:45 [localhost:raid.autoPart.start:notice]: System has started auto-partitioning 6 disks.
....Nov 25 13:20:46 [localhost:raid.partition.disk:notice]: Disk partition successful on Disk 0a.00.1 Shelf 0 Bay 1 [NETAPP X487_SLTNG600A10 NA00] S/N [S0M3FK9P0000M507GA26], partitions created 2, partition sizes specified 1, partition spec summary [2]=37660227.
....Nov 25 13:20:47 [localhost:raid.partition.disk:notice]: Disk partition successful on Disk 0a.00.3 Shelf 0 Bay 3 [NETAPP X487_SLTNG600A10 NA00] S/N [S0M3FNFH0000M5075L03], partitions created 2, partition sizes specified 1, partition spec summary [2]=37660227.
....Nov 25 13:20:49 [localhost:raid.partition.disk:notice]: Disk partition successful on Disk 0a.00.5 Shelf 0 Bay 5 [NETAPP X487_SLTNG600A10 NA00] S/N [S0M3FKAS0000M507GA0U], partitions created 2, partition sizes specified 1, partition spec summary [2]=37660227.
....Nov 25 13:20:50 [localhost:raid.partition.disk:notice]: Disk partition successful on Disk 0a.00.7 Shelf 0 Bay 7 [NETAPP X487_SLTNG600A10 NA00] S/N [S0M3FMZ30000M507CL7X], partitions created 2, partition sizes specified 1, partition spec summary [2]=37660227.
....Nov 25 13:20:52 [localhost:raid.partition.disk:notice]: Disk partition successful on Disk 0a.00.9 Shelf 0 Bay 9 [NETAPP X487_SLTNG600A10 NA00] S/N [S0M3FLQW0000M507G9TN], partitions created 2, partition sizes specified 1, partition spec summary [2]=37660227.
....Nov 25 13:20:53 [localhost:raid.partition.disk:notice]: Disk partition successful on Disk 0a.00.11 Shelf 0 Bay 11 [NETAPP X487_SLTNG600A10 NA00] S/N [S0M3FN6D0000M507CJ3J], partitions created 2, partition sizes specified 1, partition spec summary [2]=37660227.
Nov 25 13:20:55 [localhost:raid.autoPart.done:notice]: Successfully auto-partitioned 6 of 6 disks.

 

Note: this was a system with 12 disks on the internal shelves.

 

When the disks are zeroed and ONTAP has booted, you'll get the node setup prompts. You will find an active-active configuration with half of the data partitions assigned to one node, and the other half assigned to the other.

 

If you want to re-assign all of the data partitions to the first node, wait until both nodes are booted, and then you can do something like:

 

 

## show the current assignment of root and data partitions:

## (note that the data-owner alternates between nodes)


MYCLUSTER::> disk show -shelf 00 -fields root-owner,data-owner
disk    data-owner   root-owner
------  ------------ ------------
1.0.0   MYCLUSTER-02 MYCLUSTER-02
1.0.1   MYCLUSTER-01 MYCLUSTER-01
1.0.2   MYCLUSTER-02 MYCLUSTER-02
1.0.3   MYCLUSTER-01 MYCLUSTER-01
1.0.4   MYCLUSTER-02 MYCLUSTER-02
1.0.5   MYCLUSTER-01 MYCLUSTER-01
1.0.6   MYCLUSTER-02 MYCLUSTER-02
1.0.7   MYCLUSTER-01 MYCLUSTER-01
1.0.8   MYCLUSTER-02 MYCLUSTER-02
1.0.9   MYCLUSTER-01 MYCLUSTER-01
1.0.10  MYCLUSTER-02 MYCLUSTER-02
1.0.11  MYCLUSTER-01 MYCLUSTER-01
12 entries were displayed.

 

## re-assign the data partitions from MYCLUSTER-02 to MYCLUSTER-01:

## LEAVE AT LEAST ONE DISK PER NODE where the DATA partition and the ROOT partition are owned by the same node

## This is required for the system to be able to write core dumps during a panic. It must own the whole disk.

 

MYCLUSTER::*> disk assign -data -owner MYCLUSTER-01 -force -disk 1.0.0
MYCLUSTER::*> disk assign -data -owner MYCLUSTER-01 -force -disk 1.0.2
MYCLUSTER::*> disk assign -data -owner MYCLUSTER-01 -force -disk 1.0.4
MYCLUSTER::*> disk assign -data -owner MYCLUSTER-01 -force -disk 1.0.6
MYCLUSTER::*> disk assign -data -owner MYCLUSTER-01 -force -disk 1.0.8

 

## show assignments again

 

MYCLUSTER::*> disk show -shelf 00 -fields root-owner,data-owner
disk    data-owner   root-owner
------  ------------ ------------
1.0.0   MYCLUSTER-01 MYCLUSTER-02
1.0.1   MYCLUSTER-01 MYCLUSTER-01
1.0.2   MYCLUSTER-01 MYCLUSTER-02
1.0.3   MYCLUSTER-01 MYCLUSTER-01
1.0.4   MYCLUSTER-01 MYCLUSTER-02
1.0.5   MYCLUSTER-01 MYCLUSTER-01
1.0.6   MYCLUSTER-01 MYCLUSTER-02
1.0.7   MYCLUSTER-01 MYCLUSTER-01
1.0.8   MYCLUSTER-01 MYCLUSTER-02
1.0.9   MYCLUSTER-01 MYCLUSTER-01
1.0.10  MYCLUSTER-02 MYCLUSTER-02  ## Note: leave one disk as a spare for node 2, with data and root partitions both owned by it
1.0.11  MYCLUSTER-01 MYCLUSTER-01  ## Note: leave one disk as a spare for node 1, with data and root partitions both owned by it
12 entries were displayed.
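A quick way to sanity-check the spares afterwards (my addition, not part of the procedure above) is the 8.3 spare listing, which breaks spares out into root and data partitions per node:

## each node should still show at least one spare root partition, one spare
## data partition, and one whole spare disk for core dumps
MYCLUSTER::*> storage aggregate show-spare-disks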

 

 

My thanks to Jawwad Memon at NetApp for explaining this procedure to me.

 

I am also trying to work out how to do it for existing systems without re-building the cluster (it should be possible to partition the internal disks and write a new root aggregate to them one by one) - I'll post this when I have it, and NetApp should also be producing a TR or support article on it.

 

aborzenkov
13,885 Views

8. Access the boot menu (Ctrl+C) and select Option 4.

9. When the node reboots and starts zeroing disks, it will create partitions on the internal shelves


Can you choose - with or without partitions? Or will it always create partitions on internal disks? The point is, if you want to extend an aggregate with external disks, you'd better have disks of the same size; or are all disks right-sized to the data partition?


## LEAVE AT LEAST ONE DISK PER NODE where the DATA partition and the ROOT partition are owned by the same node

## This is required for the system to be able to write core dumps during a panic. It must own the whole disk.


Is it documented somewhere? I'd expect cDOT to refuse the operation in this case, at least without an explicit -force.

dankirkwood
13,868 Views
@aborzenkov wrote:

Can you choose - with or without partitions? Or will it always create partitions on internal disks? The point is, if you want to extend an aggregate with external disks, you'd better have disks of the same size; or are all disks right-sized to the data partition?

You don't get a prompt for it during initialization. It just does it, as long as it's a FAS2xxx and it's looking at the disks on the internal shelf. There may be a way to override it, but I don't know what it is.

 

You are correct that the external disks would be right-sized to be as big as the data partitions on the internal shelf, if you add them to an aggregate that is composed of data partitions on disks in the internal shelf. Depending on how big your aggregate is, it may still be more efficient - remember, you're "saving" 6 disks by not having to dedicate them as root.

 

If you want to set up with dedicated root aggregates, you could probably do something like (my guess):

 

1. Create new root aggrs somewhere as per https://kb.netapp.com/support/index?page=content&id=3013873

2. Delete existing aggrs on the internal shelf

3. Unpartition the disks on the internal shelf from the nodeshell or maint mode (see "disk unpartition")

4. Create new root aggrs on the internal shelf per the above KB article.

 

This is only my guess, so use extreme caution; a rough sketch of steps 2 and 3 follows.
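For what it's worth, steps 2 and 3 would presumably boil down to something like this (aggregate and disk names are placeholders, and this is just as unverified as the list above):

## step 2: delete the existing data aggregate(s) built on the internal shelf
MYCLUSTER::> storage aggregate delete -aggregate aggr1_node01

## step 3: unpartition the internal disks from the nodeshell
## (may require advanced privilege)
MYCLUSTER::> system node run -node MYCLUSTER-01
MYCLUSTER-01> disk unpartition 0a.00.1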

 

 


> ## LEAVE AT LEAST ONE DISK PER NODE where the DATA partition and the ROOT partition are owned by the same node

> ## This is required for the system to be able to write core dumps during a panic. It must own the whole disk.


> Is it documented somewhere? I'd expect cDOT to refuse the operation in this case, at least without an explicit -force.

 

It's documented in the cDOT 8.3 Physical Storage Guide. It doesn't seem to be enforced. From a sparing perspective it's not a big problem, as long as each node has at least one spare data partition and one spare root partition. But if it doesn't have a full spare disk (both partitions), it won't be able to write a core dump if there is a panic, according to the Physical Storage Guide. This seems to be the reason for having this requirement.

 

aborzenkov
13,799 Views

It's documented in the cDOT 8.3 Physical Storage Guide.


The documentation is vague. It says "The size of the partitions ... depends on the number of disks used to compose the root aggregate", but so far the root aggregate has always consisted of three disks (all disks are zeroed, but only three of them are used), so this is quite a big deviation from established practice. How many disks are used now? Also, what happens with replacement disks? Is it necessary to partition them manually?

dankirkwood
13,768 Views

I do wish there was more info in the documentation. Hopefully NetApp will make the documentation more detailed and/or publish a TR.

 

At Insight we were told "everything is done automatically" in terms of handling replacement disks. In practice I *think* that means a replacement disk inserted into the internal shelf will be automatically assigned to a node according to the autoassign policy, and it will be partitioned into root and data partitions owned by that node. The empty partitions would then be targets for rebuilds.
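Checking and, if needed, enabling auto-assignment from the cluster shell looks like this (8.3 syntax as I understand it; verify against the docs):

## see whether disk auto-assignment is on for each node
MYCLUSTER::> storage disk option show -fields autoassign

## turn it on across all nodes if needed
MYCLUSTER::> storage disk option modify -node * -autoassign on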

dankirkwood
13,766 Views

@aborzenkov wrote:

It says "The size of the partitions ... depends on the number of disks used to compose the root aggregate" but so far root aggregate always consisted of three disks (all disks are zeroed but only three of them are used), so it is quite a big deviation from established practice. How many disks are used now? 


If you have 24 disks on the internal shelf, it's my understanding that the system sets itself up as follows:

 

* 1 root aggr per node, spread across root partitions of 10 disks per node (8D+2P)

* 2x spare disks per node, already partitioned into root & data slices.

 

On my 12-disk FAS2520, I got 5x root partitions per root aggr, and 1x spare disk per node.

 

MYCLUSTER::*> aggr show -aggregate aggr0_root_node0 -disk
Aggregate #disks Disks
--------- ------ ---------------------------
aggr0_root_node0
               5 1.0.1, 1.0.3, 1.0.5, 1.0.7, 1.0.9

MYCLUSTER::*> aggr show -aggregate aggr0_root_node1 -disk
Aggregate #disks Disks
--------- ------ ---------------------------
aggr0_root_node1
               5 1.0.0, 1.0.2, 1.0.4, 1.0.6, 1.0.8

 

My root aggrs are 368.4GB usable (per the aggr show output), so I imagine that when the system creates the partitions, it makes them big enough that it ends up with a root aggr of the desired size.

 
