Hi,
I've been testing some root-data-data partitioning setups with ONTAP 9.1 and wanted to share my results, along with a few comments, and to get some feedback from the community.
First test scenario - Full system initialization with 1 disk shelf (ID: 0)
The system splits the ownership of the drives and partitions evenly amongst the 2 nodes in the HA pair.
On disks 0 - 11, partitions 1 and 3 are assigned to node 1 and partition 2 is assigned to node 2.
On disks 12 - 23, partitions 2 and 3 are assigned to node 2 and partition 1 is assigned to node 1.
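You can verify how auto-assignment distributed the partitions from the cluster shell with:

storage disk show -partition-ownership

which lists the container, root, data1 and data2 owner for every shared disk.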
Creating a new aggregate on Node 1 with a RAID group size of 23 gives me:
- RG 0 - 21 x Data and 2 x Parity
- 1 x data spare
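For reference, this is roughly the command behind that aggregate (the aggregate and node names are just examples). Running it with -simulate true first previews the RAID group layout without committing anything:

storage aggregate create -aggregate aggr1_node1 -node cluster1-01 -diskcount 23 -maxraidsize 23 -simulate true

Once the preview matches the layout above, re-run it without -simulate true.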
One root partition is roughly 55GB on a 3.8TB SSD
Benefits of this setup:
- Root and data aggregates spread their load across both nodes
Cons:
- Single disk failure affects both node data aggregates.
Would it be better to reassign partition ownership so that all partitions on disks 0 - 11 are owned by node 1 and all partitions on disks 12 - 23 are owned by node 2?
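If anyone wants to try that layout, here's a rough sketch of the reassignment, assuming partitions 1 and 2 map to data1 and data2, disk names of the 1.0.x style, and nodes called cluster1-01/-02. Reassigning an already-owned partition needs -force true, and it's safest with auto-assignment switched off first:

storage disk option modify -node * -autoassign off
storage disk assign -disk 1.0.0 -data2 true -owner cluster1-01 -force true
storage disk assign -disk 1.0.12 -data1 true -owner cluster1-02 -force true

Repeat for the remaining disks: data2 partitions on disks 0 - 11 go to node 1, data1 partitions on disks 12 - 23 go to node 2.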
Second test scenario - Full system initialization with 2 disk shelves (ID: 0 and ID: 10) - Example 1
The system splits the ownership of the drives between both shelves with the following assignments:
Node 1 owns all disks and partitions (0 - 23) in shelf 1
Node 2 owns all disks and partitions (0 - 23) in shelf 2
Creating a new aggregate on Node 1 with a RAID group size of 23 gives me:
- RG 0 - 21 x Data and 2 x Parity
- RG 1 - 21 x Data and 2 x Parity
- 2 x data spares
One root partition is roughly 22GB on a 3.8TB SSD
The maximum number of partitioned disks you can have in a system is 48, so with these 2 shelves we are already at that limit. Any additional shelf will have to contribute whole, unpartitioned disks to new aggregates.
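You can keep an eye on how many disks are partitioned and which spare partitions remain with:

storage disk show -container-type shared
storage aggregate show-spare-disks

Partitioned disks show up with the container type "shared", so counting those tells you how close you are to the 48-disk limit.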
Benefits of this setup I see:
- In the case of a single disk failure or a shelf failure, only one node/aggregate would be affected.
Cons of this setup:
- Each node's root and data aggregate workload is pinned to a single shelf
It's possible to reassign disks so that one data partition on each disk is owned by the partner node, which would let you split the aggregate workload between shelves; however, a disk or shelf failure would then affect both aggregates.
Third test scenario - Full system initialization with 2 disk shelves (ID: 0 and ID: 10) - Example 2
In this example, I re-initialized the system with only 1 disk shelf connected.
Disk auto assignment was as follows:
- On shelf 1 disks 0 - 11, partitions 1 and 3 are assigned to Node 1 and partition 2 is assigned to Node 2
- On shelf 1 disks 12 - 23, partitions 2 and 3 are assigned to Node 2 and partition 1 is assigned to Node 1
I then completed the cluster setup wizard and connected the 2nd disk shelf.
The system split the disk ownership up for shelf 2 in the following way:
- Disks 0 - 11 owned by node 1
- Disks 12 - 23 owned by node 2
Next, I added shelf 2 disks 0 - 11 to the node 1 root aggregate and disks 12 - 23 to the node 2 root aggregate (see the sketch below). This partitioned the new disks and assigned partition ownership in the same way as shelf 1.
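A sketch of that add-disks step, assuming the second shelf enumerates as 1.10.x (stack 1, shelf ID 10) and the root aggregates have default-style names; I used -disklist rather than -diskcount so that exactly the intended disks get pulled in and partitioned:

storage aggregate add-disks -aggregate aggr0_cluster1_01 -disklist 1.10.0,1.10.1,1.10.2,1.10.3,1.10.4,1.10.5,1.10.6,1.10.7,1.10.8,1.10.9,1.10.10,1.10.11
storage aggregate add-disks -aggregate aggr0_cluster1_02 -disklist 1.10.12,1.10.13,1.10.14,1.10.15,1.10.16,1.10.17,1.10.18,1.10.19,1.10.20,1.10.21,1.10.22,1.10.23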
Because the system was initialized with only 1 shelf connected, it created 55GB root partitions instead of the 22GB in my second test scenario above. That 55GB root partition size then applies across both shelves. How much space does the smaller root partition actually save with 3.8TB SSDs:
55GB x 42 (data disks) = 2,310GB
22GB x 42 (data disks) = 924GB
Difference = 1,386GB, or a 60% saving (1,386 / 2,310)
Benefits of this setup:
- Load distribution across shelves 1 and 2
Cons:
- Larger root partition
- A single disk or shelf failure affects both aggregates
Fourth test scenario - Full system initialization with 2 disk shelves (ID: 0 and ID: 10) - Example 3
Following on from my third test scenario, I reassigned the partitions so that partitions on disks 0 - 11 are owned by node 1 and partitions on disks 12 - 23 are owned by node 2.
Benefits of this setup:
- A single disk failure only affects one node's root and data aggregate
- Equal load distribution across both shelves.
Cons:
- Larger root partition
- A single shelf failure will affect both nodes
I'm interested to hear feedback on the above setups: which ones do you prefer, and why?
Also, feel free to add comments or describe setups that aren't listed above.