ONTAP Hardware
Hi,
I'm newish to NetApp and have done loads of the online training videos. None of them go into the whole root aggregates topic, etc.
So I have a FAS2552 with 2 controllers and 24 drives: 4 SSD and 20 SAS. The aggregates set up (by an external contractor) are the following:
Aggregate     Size     Available  Used%  State    #Vols  Nodes    RAID Status
------------- -------- ---------- ------ -------- ------ -------- --------------------------------
aggr0_node_01 904.9GB  552.9GB    39%    online   1      Node-01  raid_dp, hybrid_enabled, normal
aggr0_node_02 122.8GB  5.95GB     95%    online   1      Node-02  raid_dp, normal
aggr1         11.43TB  11.43TB    0%     online   0      Node-01  mixed_raid_type, hybrid, normal
So we have one aggregate with a root volume on node 1, 904.9GB in size, 39% used, built from 16 drives, and another root aggregate on node 2 which is 122.8GB, 95% used, on 4 drives. Then there is aggr1, which is for the SVM, providing 11.43TB with a Flash Pool. Is this the correct way of doing this? I simply want one big LUN which will be presented to an ESXi cluster of servers.
I can't find anything so far that explains the principles behind this. I'm reading things like "the aggregates should be split evenly over the disks", etc. Why are the root aggregates so uneven in size across the controllers? I'm assuming these were set up by the software when the system was initialised?
Many thanks
Ed
From the above output it's hard to say how the disks are used to create your existing node root and data aggregates.
One thing is for sure: there is no reason to have a hybrid-enabled root aggregate for node-1.
You could use that SSD for your data aggregate.
Can you please run the following command and post the output?
::> node run -node * aggr status -r
thanks,
Robin.
Hi,
Thanks for getting back. Here is the output:
Node: ExploreLane-01
Aggregate aggr0_ExploreLane_01 (online, raid_dp, hybrid_enabled) (block checksums)
Plex /aggr0_ExploreLane_01/plex0 (online, normal, active, pool0)
RAID group /aggr0_ExploreLane_01/plex0/rg0 (normal, block checksums)
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
dparity 0b.00.4P2 0b 0 4 SA:B 0 SAS 10000 73561/150653440 73569/150669824
parity 0b.00.6P2 0b 0 6 SA:B 0 SAS 10000 73561/150653440 73569/150669824
data 0b.00.8P2 0b 0 8 SA:B 0 SAS 10000 73561/150653440 73569/150669824
data 0b.00.10P2 0b 0 10 SA:B 0 SAS 10000 73561/150653440 73569/150669824
data 0b.00.12P2 0b 0 12 SA:B 0 SAS 10000 73561/150653440 73569/150669824
data 0b.00.14P2 0b 0 14 SA:B 0 SAS 10000 73561/150653440 73569/150669824
data 0a.00.5P2 0a 0 5 SA:A 0 SAS 10000 73561/150653440 73569/150669824
data 0a.00.7P2 0a 0 7 SA:A 0 SAS 10000 73561/150653440 73569/150669824
data 0a.00.9P2 0a 0 9 SA:A 0 SAS 10000 73561/150653440 73569/150669824
data 0a.00.11P2 0a 0 11 SA:A 0 SAS 10000 73561/150653440 73569/150669824
data 0a.00.13P2 0a 0 13 SA:A 0 SAS 10000 73561/150653440 73569/150669824
data 0a.00.15P2 0a 0 15 SA:A 0 SAS 10000 73561/150653440 73569/150669824
data 0a.00.17P2 0a 0 17 SA:A 0 SAS 10000 73561/150653440 73569/150669824
data 0a.00.19P2 0a 0 19 SA:A 0 SAS 10000 73561/150653440 73569/150669824
data 0a.00.21P2 0a 0 21 SA:A 0 SAS 10000 73561/150653440 73569/150669824
data 0a.00.23P2 0a 0 23 SA:A 0 SAS 10000 73561/150653440 73569/150669824
Aggregate aggr1 (online, mixed_raid_type, hybrid) (block checksums)
Plex /aggr1/plex0 (online, normal, active, pool0)
RAID group /aggr1/plex0/rg0 (normal, block checksums, raid_dp)
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
dparity 0b.00.16P1 0b 0 16 SA:B 0 SAS 10000 783402/1604407808 783410/1604424192
parity 0b.00.20P1 0b 0 20 SA:B 0 SAS 10000 783402/1604407808 783410/1604424192
data 0b.00.22P1 0b 0 22 SA:B 0 SAS 10000 783402/1604407808 783410/1604424192
data 0a.00.5P1 0a 0 5 SA:A 0 SAS 10000 783402/1604407808 783410/1604424192
data 0a.00.7P1 0a 0 7 SA:A 0 SAS 10000 783402/1604407808 783410/1604424192
data 0b.00.8P1 0b 0 8 SA:B 0 SAS 10000 783402/1604407808 783410/1604424192
data 0a.00.9P1 0a 0 9 SA:A 0 SAS 10000 783402/1604407808 783410/1604424192
data 0b.00.10P1 0b 0 10 SA:B 0 SAS 10000 783402/1604407808 783410/1604424192
data 0a.00.11P1 0a 0 11 SA:A 0 SAS 10000 783402/1604407808 783410/1604424192
data 0b.00.12P1 0b 0 12 SA:B 0 SAS 10000 783402/1604407808 783410/1604424192
data 0a.00.13P1 0a 0 13 SA:A 0 SAS 10000 783402/1604407808 783410/1604424192
data 0b.00.14P1 0b 0 14 SA:B 0 SAS 10000 783402/1604407808 783410/1604424192
data 0a.00.15P1 0a 0 15 SA:A 0 SAS 10000 783402/1604407808 783410/1604424192
data 0a.00.17P1 0a 0 17 SA:A 0 SAS 10000 783402/1604407808 783410/1604424192
data 0a.00.19P1 0a 0 19 SA:A 0 SAS 10000 783402/1604407808 783410/1604424192
data 0a.00.21P1 0a 0 21 SA:A 0 SAS 10000 783402/1604407808 783410/1604424192
data 0a.00.23P1 0a 0 23 SA:A 0 SAS 10000 783402/1604407808 783410/1604424192
data 0b.00.6P1 0b 0 6 SA:B 0 SAS 10000 783402/1604407808 783410/1604424192
data 0b.00.4P1 0b 0 4 SA:B 0 SAS 10000 783402/1604407808 783410/1604424192
RAID group /aggr1/plex0/rg1 (normal, block checksums, raid4)
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
parity 0b.00.0 0b 0 0 SA:B 0 SSD N/A 190532/390209536 190782/390721968
data 0b.00.2 0b 0 2 SA:B 0 SSD N/A 190532/390209536 190782/390721968
data 0a.00.1 0a 0 1 SA:A 0 SSD N/A 190532/390209536 190782/390721968
Pool1 spare disks (empty)
Pool0 spare disks
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
Spare disks for block checksum
spare 0b.00.18P1 0b 0 18 SA:B 0 SAS 10000 783402/1604407808 783410/1604424192 (not zeroed)
spare 0a.00.3 0a 0 3 SA:A 0 SSD N/A 190532/390209536 190782/390721968
Partner disks
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
partner 0b.00.18P2 0b 0 18 SA:B 0 SAS 10000 0/0 73569/150669824
partner 0b.00.16P2 0b 0 16 SA:B 0 SAS 10000 0/0 73569/150669824
partner 0b.00.20P2 0b 0 20 SA:B 0 SAS 10000 0/0 73569/150669824
partner 0b.00.22P2 0b 0 22 SA:B 0 SAS 10000 0/0 73569/150669824
Node: ExploreLane-02
Aggregate aggr0_ExploreLane_02 (online, raid_dp) (block checksums)
Plex /aggr0_ExploreLane_02/plex0 (online, normal, active, pool0)
RAID group /aggr0_ExploreLane_02/plex0/rg0 (normal, block checksums)
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
dparity 0b.00.16P2 0b 0 16 SA:A 0 SAS 10000 73561/150653440 73569/150669824
parity 0b.00.18P2 0b 0 18 SA:A 0 SAS 10000 73561/150653440 73569/150669824
data 0b.00.20P2 0b 0 20 SA:A 0 SAS 10000 73561/150653440 73569/150669824
data 0b.00.22P2 0b 0 22 SA:A 0 SAS 10000 73561/150653440 73569/150669824
Pool1 spare disks (empty)
Pool0 spare disks (empty)
Partner disks
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
partner 0a.00.5P2 0a 0 5 SA:B 0 SAS 10000 0/0 73569/150669824
partner 0a.00.23P2 0a 0 23 SA:B 0 SAS 10000 0/0 73569/150669824
partner 0a.00.19P2 0a 0 19 SA:B 0 SAS 10000 0/0 73569/150669824
partner 0a.00.15P2 0a 0 15 SA:B 0 SAS 10000 0/0 73569/150669824
partner 0a.00.17P2 0a 0 17 SA:B 0 SAS 10000 0/0 73569/150669824
partner 0b.00.14P2 0b 0 14 SA:A 0 SAS 10000 0/0 73569/150669824
partner 0b.00.12P2 0b 0 12 SA:A 0 SAS 10000 0/0 73569/150669824
partner 0a.00.13P2 0a 0 13 SA:B 0 SAS 10000 0/0 73569/150669824
partner 0a.00.11P2 0a 0 11 SA:B 0 SAS 10000 0/0 73569/150669824
partner 0b.00.10P2 0b 0 10 SA:A 0 SAS 10000 0/0 73569/150669824
partner 0a.00.21P2 0a 0 21 SA:B 0 SAS 10000 0/0 73569/150669824
partner 0a.00.9P2 0a 0 9 SA:B 0 SAS 10000 0/0 73569/150669824
partner 0a.00.7P2 0a 0 7 SA:B 0 SAS 10000 0/0 73569/150669824
partner 0a.00.23P1 0a 0 23 SA:B 0 SAS 10000 0/0 783410/1604424192
partner 0b.00.20P1 0b 0 20 SA:A 0 SAS 10000 0/0 783410/1604424192
partner 0a.00.17P1 0a 0 17 SA:B 0 SAS 10000 0/0 783410/1604424192
partner 0b.00.12P1 0b 0 12 SA:A 0 SAS 10000 0/0 783410/1604424192
partner 0b.00.22P1 0b 0 22 SA:A 0 SAS 10000 0/0 783410/1604424192
partner 0b.00.14P1 0b 0 14 SA:A 0 SAS 10000 0/0 783410/1604424192
partner 0a.00.5P1 0a 0 5 SA:B 0 SAS 10000 0/0 783410/1604424192
partner 0b.00.16P1 0b 0 16 SA:A 0 SAS 10000 0/0 783410/1604424192
partner 0a.00.13P1 0a 0 13 SA:B 0 SAS 10000 0/0 783410/1604424192
partner 0b.00.8P1 0b 0 8 SA:A 0 SAS 10000 0/0 783410/1604424192
partner 0a.00.9P1 0a 0 9 SA:B 0 SAS 10000 0/0 783410/1604424192
partner 0a.00.11P1 0a 0 11 SA:B 0 SAS 10000 0/0 783410/1604424192
partner 0a.00.19P1 0a 0 19 SA:B 0 SAS 10000 0/0 783410/1604424192
partner 0a.00.15P1 0a 0 15 SA:B 0 SAS 10000 0/0 783410/1604424192
partner 0b.00.10P1 0b 0 10 SA:A 0 SAS 10000 0/0 783410/1604424192
partner 0b.00.4P1 0b 0 4 SA:A 0 SAS 10000 0/0 783410/1604424192
partner 0b.00.6P1 0b 0 6 SA:A 0 SAS 10000 0/0 783410/1604424192
partner 0a.00.7P1 0a 0 7 SA:B 0 SAS 10000 0/0 783410/1604424192
partner 0a.00.1 0a 0 1 SA:B 0 SSD N/A 0/0 190782/390721968
partner 0a.00.3 0a 0 3 SA:B 0 SSD N/A 0/0 190782/390721968
partner 0a.00.21P1 0a 0 21 SA:B 0 SAS 10000 0/0 783410/1604424192
partner 0b.00.18P1 0b 0 18 SA:A 0 SAS 10000 0/0 783410/1604424192
partner 0b.00.0 0b 0 0 SA:A 0 SSD N/A 0/0 190782/390721968
partner 0b.00.2 0b 0 2 SA:A 0 SSD N/A 0/0 190782/390721968
partner 0b.00.8P2 0b 0 8 SA:A 0 SAS 10000 0/0 73569/150669824
partner 0b.00.6P2 0b 0 6 SA:A 0 SAS 10000 0/0 73569/150669824
partner 0b.00.4P2 0b 0 4 SA:A 0 SAS 10000 0/0 73569/150669824
I was wondering what size the nodes actually need for the root aggregates, and is it easy to change them? The NetApp isn't in a live environment yet, which is why I'm checking now. I just want to make sure it's set up correctly. One of the root aggregates is almost at capacity too....
Thanks
Ed
In your case, each of your 900GB HDDs is partitioned into two partitions: P1 (larger, for the data aggregate) and P2 (smaller, for the node root aggregate).
This is usually done automatically during initialization and looks good.
The data aggregate is created on node-1 using 19 of the P1 disk partitions to get the maximum space, and it has one spare.
Three of the SSDs are used as RAID-4 (one parity and two data) and the fourth is a spare (there is no better way to do this).
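If you want to double-check the partition layout yourself from the cluster shell, something like the following should show it (a rough sketch; the exact output and fields vary a little by ONTAP version):
::> storage disk show -partition-ownership
::> storage aggregate show-status -aggregate aggr1
The first command lists which node owns each P1/P2 partition; the second shows the RAID layout of the data aggregate.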
Here is the part I don't agree with...
The node-1 root aggregate is created with 16 of the P2 disk partitions and no spares (not a recommended configuration).
The node-2 root aggregate is created with 4 of the P2 disk partitions and no spares (not a recommended configuration).
A couple of things to notice: in this configuration the majority of your resources on node-2 are sitting idle, and there are no spare partitions for the root aggregates.
Based on the documentation the minimum root volume size is 350GB, so creating the root aggregates equally on both nodes is the recommended practice.
That means assigning 10 of the P2 disk partitions to each node and using 9 of them on each node to create the root aggregate (leaving 1 spare on each node).
For rough sizing: each P2 partition contributes around 65GB of aggregate space, so a 9-partition RAID-DP root aggregate (7 data partitions) gives each node roughly 450GB, while node-2's current 4-partition aggregate only gives about 123GB, which is why it is already at 95% used.
Whether you create one large data aggregate, or make use of your second node's CPU resources by splitting it into two aggregates, should be based on your current and future requirements.
It's not easy; in fact it's not possible to make these changes without re-configuring the cluster.
I'd recommend you get it right before you start using it.
Robin.
+1 for Robin's comments on the P2 partitions not being balanced across node 1 and node 2.
Q1: What ONTAP version are you running?
Q2: Do you plan to run in an active-passive configuration or do you want both nodes to normally serve data from their own aggregates (active-active)?
Q3: Are you in a position to destroy the data on the system and reinitialize? It might be the fastest path to configuring the system according to best practices.
Hi,
Thanks both for the reply.
It's the latest 9.1P5 version.
We're currently in the testing phase, so this is the time I can reset the system.
Are there any docs on how to do this reset?
So to understand, I need to create:
Root aggr - node 1 using 9 disks and 1 hot spare
Root aggr - node 2 using 9 disks and 1 hot spare
Data aggr - I can split or not, and use either 19 disks and 1 hot spare, or split and do 9 + 1 on both... or some other flavour.
This is using the same 20 disks for all of the above.
Thanks
Ed
If you're willing to consider 9.2 (GA), there's a nifty new boot menu option for re-partitioning systems with "one click".
From the release notes: https://library.netapp.com/ecm/ecm_download_file/ECMLP2492508
Beginning with ONTAP 9.2, a new root-data partitioning option is available from the Boot Menu that provides additional management features for disks that are configured for root-data partitioning.
If you want to stick with 9.1P5, this KB article takes you through the reinitialization process.
Note: The KB works well for FAS (non-AFF), too. You'll just end up with root-data, not root-data-data, partitions. When it's done, each node will "own" half of the 20 disks from a container perspective, and the smaller P2 root partitions will be used to build the root aggregates (and leave 1 spare). The large P1 data partitions will be "spare" for creating data aggregates from either node, as you see fit.
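As a rough sketch of what that last step could look like from the cluster shell (the aggregate name is a placeholder, and with partitioned disks -diskcount counts partitions rather than whole drives):
::> storage aggregate show-spare-disks
::> storage aggregate create -aggregate <aggr1_data> -node <node-01> -diskcount 19 -raidtype raid_dp
::> storage aggregate modify -aggregate <aggr1_data> -hybrid-enabled true
::> storage aggregate add-disks -aggregate <aggr1_data> -disklist <ssd1>,<ssd2>,<ssd3>
The first command confirms what's spare before you build anything; the last two re-create the Flash Pool by enabling hybrid mode and adding the SSDs (check the add-disks man page on your version for the option that sets the SSD cache RAID type, e.g. raid4 as you have today).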
Many thanks - I think we will go for a full initialisation. It sounds cleaner. Can the data aggregate still be across all disks? My worry is the amount of space lost from all the splitting up.
Root-data vs root-data-data partitions?? What's the difference? I might have read about this but can't quite remember.
Also, for an SVM volume, should the LUN sit inside a qtree as best practice, i.e. /vol/<volume>/<qtree>/LUN0 rather than /vol/<volume>/LUN0?
Many thanks
Ed
It depends on whether you want both nodes serving data (active-active) or you're happy relegating the 2nd node to a passive "backup" role.
If active-active, you'd have one or more aggregates configured on each node. This carries a bit more "parity disk partition" tax than putting all data partitions into one big aggregate, but you get more data-serving performance with both nodes active.
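As a rough sketch (aggregate names are placeholders, and this assumes the P1 data partitions have been split 10/10 between the nodes):
::> storage aggregate create -aggregate <aggr_data_n1> -node <node-01> -diskcount 9 -raidtype raid_dp
::> storage aggregate create -aggregate <aggr_data_n2> -node <node-02> -diskcount 9 -raidtype raid_dp
That leaves 1 spare data partition per node and gives you 7 data partitions per aggregate (14 total), versus 17 in a single 19-partition active-passive aggregate - the difference being the second aggregate's two parity partitions plus the second spare.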
I'd look at these docs...
Root-data partitioning concept:
Manually assigning ownership of partitioned disks (an example sketch follows this list)
Setting up an active-passive configuration on nodes using root-data partitioning
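For the manual assignment piece, the commands look roughly like this (disk name and node are placeholders; this is intended for spare/unowned partitions after a re-init, and the exact partition flags can vary by version, so check "storage disk assign ?" on your system):
::> storage disk assign -disk <disk> -owner <node-02> -data true
::> storage disk assign -disk <disk> -owner <node-02> -root true
These move ownership of a disk's data partition or root partition (respectively) to the node you name, which is how you'd end up with the 10/10 split of P2 partitions Robin described.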
I can't speak to your LUN question... but I wouldn't think it matters. See section 5.3 here:
TR-4080: Best Practices for Scalable SAN
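For illustration only (SVM, volume, qtree and LUN names are made up), the two layouts would be created along these lines:
::> lun create -vserver <svm> -path /vol/vm_vol/lun0 -size 10TB -ostype vmware
::> volume qtree create -vserver <svm> -volume vm_vol -qtree vm_q
::> lun create -vserver <svm> -path /vol/vm_vol/vm_q/lun1 -size 10TB -ostype vmware
Functionally the LUN behaves the same either way; the qtree is mainly an organisational convenience, which lines up with the point above that it shouldn't really matter.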