ONTAP Hardware

Raid group size recommendation

janjacobbakker
63,189 Views

Hi Folks,

I'm trying to design a storage solution.

A FAS3020 with 3 shelves fully populated with 42x 300GB FC disks.

By default the RAID group size is 16 (14 data + 2 parity); the maximum RAID group size is 28.

Does anyone have some best practice information? The NOW site hasn't much info on that.

Kind regards

71 REPLIES

kusek
44,682 Views

Hello Jan-Jacob,

That's an excellent question, hopefully I can shed some light on it here.

Since you're using a 3020 and 300GB FC disks, your ideal RG size to reach the maximum capacity of a single aggregate would be 15.

With these particular disks (300GB FC disks) you can have a maximum of 59 disks allocated to the aggregate, leaving you with room to grow into the final drives to fill out this aggregate. While a RG size of 16 would also work, you'll get the best performance and space utilization by establishing an RG of 15.

I hope this was able to address your question and help you complete your storage design.

Thanks,

Christopher

janjacobbakker
44,685 Views

Christopher

Thanks for your quick response! I'm now building the 42-disk aggregate with 15-disk RGs.

Regards,

Jan-Jacob

mbeijnen1
44,837 Views

Dear Christopher,

Just curious to know how you came up with your answer... Based on the 42 disks that are available to Jan-Jacob, wouldn't it be more advisable to choose an RG size of 14 disks to have them divided even more equally?

Suppose 1 disk is used as a hot spare. Your proposed solution (RG size of 15 disks) yields:

2x (13 spindles + 2 parity disks) +

1x (9 spindles + 2 parity disks) +

1 hot spare

An RG size of 14 disks yields:

2x (12 spindles + 2 parity disks) +

1x (11 spindles + 2 parity disks) +

1 hot spare

Since performance is limited by the slowest set, the equally divided solution would give better performance.
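The two layouts above can be reproduced with a small sketch. It assumes RAID groups are filled in order (so only the last group can be short), 2 parity disks per group, and 1 hot spare; the function name `raid_group_layout` is made up for illustration and is not an ONTAP tool.

```python
def raid_group_layout(total_disks, rg_size, hot_spares=1, parity_per_group=2):
    """Split the non-spare disks into RAID groups of at most rg_size disks.

    Returns a list of (data_disks, parity_disks) tuples, one per RAID group.
    Assumption: groups are filled in order, so only the last one can be short.
    """
    remaining = total_disks - hot_spares
    groups = []
    while remaining > 0:
        size = min(rg_size, remaining)
        if size <= parity_per_group:
            raise ValueError("last group would have no data disks")
        groups.append((size - parity_per_group, parity_per_group))
        remaining -= size
    return groups


for rg_size in (15, 14):
    layout = raid_group_layout(42, rg_size)
    data_spindles = sum(data for data, _ in layout)
    print(f"rg_size={rg_size}: {layout}, data spindles={data_spindles}")
```

Both layouts end up with the same number of data spindles for 42 disks; what differs is how evenly the groups are filled.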

Can you please provide the reasoning behind your answer, hopefully increasing my insight in the NetApp setup 🙂

kusek
44,837 Views

Great question Mark, I'll try to do your question justice!

I was looking at this from two aspects: Performance, and long-term capacity.

While the system does indeed have 42 disks today, tomorrow it may have a need for additional capacity.

So, by choosing a 15-disk RAID group, I'm assuring myself not only of the most efficient RG design, I'm also committing to the maximum amount of space.

Also, with his third disk set sitting at 9 disks (7), it is usually recommended that your smallest RG be at least half the size of the RG size itself; by having 7 (9) disks there, we meet that criterion. (Although, since Jan-Jacob did not say he was capacity bound, that third RAID group need not be added until necessary.)

Coming back to other possible options, based upon today's available disks.

If I'm guaranteed a maximum amount of disks (42) in my system and never expect to grow it, then a 14 disk RG could indeed work.

But as storage is always growing, tomorrow comes around and I add another 2 shelves to my system, and when I try to grow it based upon a 14 disk RG I would end up with 56 disks allocated to the aggregate.

A 56-disk aggr isn't bad: it fulfills the performance criteria by giving me enough spindles and also provides a sizable amount of space. But my maximum capacity (bound by the RG size) is stuck at roughly 1 TB less than the maximum available to me in a 59-disk aggr (fulfilled by the 15-disk RG). With that in mind, I opted to go for the best practice for performance while also being able to reach maximum capacity in the long run.
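One way to read the 56-versus-59 comparison is sketched below. It assumes the 59-disk ceiling quoted earlier in the thread and the rule of thumb that a partially filled RAID group should hold at least half of the RG size; `usable_disks` is a hypothetical helper, and the half-size cutoff is this thread's guideline rather than a documented limit.

```python
import math


def usable_disks(max_disks, rg_size):
    """Largest disk count <= max_disks whose last RAID group is either full
    or holds at least half of rg_size (the rule of thumb from this thread)."""
    full_groups, remainder = divmod(max_disks, rg_size)
    min_partial = math.ceil(rg_size / 2)
    if remainder >= min_partial:
        return full_groups * rg_size + remainder
    # A smaller leftover group is dropped rather than built undersized.
    return full_groups * rg_size


for rg_size in (14, 15):
    print(f"rg_size={rg_size}: {usable_disks(59, rg_size)} disks in the aggregate")
```

Under these assumptions an RG size of 14 stops at 56 disks while 15 reaches all 59, a difference of three 300 GB disks, which is in line with the "1 TB less" figure above.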

My sources for this were a calculator and capacity documents.

Hopefully this helped bring some insight into the operation and my decisions around it.

Thanks Mark,

Christopher

mbeijnen1
44,838 Views

Dear Christopher,

Thank you very much for your reply, explaining the reasoning behind your recommendation.

Based on the reply I start running into more fundamental questions of sizing myself - probably a reference to the capacity documents mentioned in your post could help me with that.

When trying to calculate the maximum aggregate size I am still having a hard time deriving the 59 disks maximum that you mentioned. Once I get that I can understand your recommended RGsize of 15 disks since that would optimize capacity/performance for 59 disks in total, effectively having 4 raidgroups of (almost) equal size.

Some of the questions I find myself asking now:

- the 16 TB limit for an aggregate seems to include all disks (raw capacity), including parity disks?

- when calculating optimized sizes what GB's should be used, 1000/1024 based?

We are currently in the process of expanding a NetApp storage system and are looking into optimized setup, both now and in the future - yes, I tend to agree that we will in the end grow to the maximum size as well 🙂

Would it be possible to provide some pointers in sizing/capacity documentation to get us started? Also, a brief explanation on how you derived the 59 disks maximum for a 300 GB disk size would be much appreciated.

Kind regards,

Mark.

kusek
44,838 Views

As promised - Some details on disk max sizes and performance considerations!

Maximum data drives per 16-TB aggregate

With the aggregate size calculation changes present in Data ONTAP 7.3, you can include more data drives in an aggregate without exceeding the aggregate size limit.

The following table shows the maximum number of data drives that can be included in a 16-TB aggregate for Data ONTAP 7.3 and for previous releases.

Drive size   Drive type   Data ONTAP 7.2 and earlier   Data ONTAP 7.3
36 GB        FC           427                          493
72 GB        FC           212                          246
144 GB       FC/SAS       106                          123
300 GB       FC/SAS       51                           61
250 GB       SATA         68                           79
320 GB       SATA         53                           61
500 GB       SATA         33                           39
750 GB       SATA         22                           26
1 TB         SATA         15                           19

Hopefully this helps with its reference to documents in the notes.

Also, if there are ever any questions - it never hurts to ask and validate your concerns against best practices and what is being actively used.

Take care,

Christopher

chriskranz
44,838 Views

This table is pretty confusing as it doesn't show the parity disks at all.

The SATA numbers also don't add up, and I've never understood fully the SATA aggregate limits...

68x 250 = 17000

53x 320 = 16960

33x 500 = 16500

22x 750 = 16500

15x 1000 = 15000

Obviously all except the TB disks are well over the aggregate limit.

The FC numbers seem to be worked out on raw capacity including the parity disks, but for SATA that doesn't seem to be the case...
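The arithmetic above can be checked with a short sketch. It assumes the 16 TB limit is counted in binary units (16 x 1024 GB) and uses nominal drive sizes, ignoring any difference between nominal and usable capacity; `nominal_capacity_gb` is an illustrative name only.

```python
# Assumption: the 16 TB aggregate limit is counted in binary units.
LIMIT_GB = 16 * 1024

# (data drives, nominal drive size in GB) from the "7.2 and earlier" SATA rows
SATA_ROWS = [(68, 250), (53, 320), (33, 500), (22, 750), (15, 1000)]


def nominal_capacity_gb(drive_count, drive_size_gb):
    """Drive count times nominal size; ignores nominal vs. usable capacity."""
    return drive_count * drive_size_gb


for count, size_gb in SATA_ROWS:
    total = nominal_capacity_gb(count, size_gb)
    verdict = "over" if total > LIMIT_GB else "within"
    print(f"{count} x {size_gb} GB = {total} GB ({verdict} the {LIMIT_GB} GB limit)")
```

The verdicts come out the same if the limit is taken as 16,000 GB instead, so the choice of 1000-based or 1024-based units does not explain the SATA rows by itself.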


Back to the topic a little more. Are there any stats to show the performance improvements from having similar RAID group sizes across an entire aggregate? It'd be useful to back up the reasoning for changing the RAID group defaults.

kevin_graham
44,683 Views

chriskranz wrote:

This table is pretty confusing as it doesn't show the parity disks at all.

That's because with 7.3, parity disks are no longer relevant to the max aggr size. That table is the maximum number of data disks per aggr.

chriskranz wrote:

Back to the topic a little more. Are there any stats to show the performance improvements from having similar RAID group sizes across an entire aggregate? It'd be useful to back up the reasoning for changing the RAID group defaults.

I don't recall seeing anything published, but the key would just be to ensure uniform performance across the entire aggr.

kusek
44,685 Views

Kevin, you hit the nail right on the head regarding 7.3.

In the original post I had promised to try to provide as much citable evidence as possible as to why certain RG sizes would be more ideal than others (outside of the default best practice) in order to achieve maximum space while providing maximum performance. I wanted to ensure that every bit of public collateral I can find (such as that 7.3-based table) was available and in a consumable format.

As for the performance of similar RAID group sizes, best practice dictates that an RG should have at a minimum half as many disks as the RG size (so for a 14-disk RG size you're looking at 7 disks minimum, preferably) in order to avoid running into any kind of hot-disk scenario and an immediate need to reallocate.

Of course, as always, I prefer citable evidence of the impact, so until then I'll continue preferring my min-half-to-full philosophy, until I hear otherwise. 🙂

Thanks for your input and feedback on this Kevin!

Christopher

chriskranz
20,272 Views

kevin.graham wrote:

That's because with 7.3, parity disks are no longer relevant to the max aggr size. That table is the maximum number of data disks per aggr.

That's fair enough, but because the thread is about RAID group sizes, the table is still a little confusing as it lays it out for standard RAID group sizes. As you say, the parity disks get dropped for 7.3 so it is much more relevant then (however I still don't understand the SATA calculations!).
It makes sense though to have a new table that defines the optimal RAID group sizes for each disk type as obviously this is variable. Something along the lines of the following (this was just a quick calculation from me, so could be improved)...
Data ONTAP 7.2 and earlier

Drive size   Drive type   Spindle count   Number of parity disks   Optimal RAID group size   Resultant RAID groups
36 GB        FC           427             66                       15                        32.87
72 GB        FC           212             28                       16                        15.00
144 GB       FC/SAS       106             16                       16                        7.63
300 GB       FC/SAS       51              8                        15                        3.93
250 GB       SATA         68              10                       16                        4.88
320 GB       SATA         53              10                       13                        4.85
500 GB       SATA         33              6                        13                        3.00
750 GB       SATA         22              4                        13                        2.00
1 TB         SATA         15              2                        17                        1.00

Data ONTAP 7.3

Drive size   Drive type   Spindle count   Number of parity disks   Optimal RAID group size   Resultant RAID groups
36 GB        FC           493             66                       17                        32.88
72 GB        FC           246             42                       14                        20.57
144 GB       FC/SAS       123             18                       16                        8.81
300 GB       FC/SAS       61              10                       15                        4.73
250 GB       SATA         79              12                       16                        5.69
320 GB       SATA         61              10                       15                        4.73
500 GB       SATA         39              6                        15                        3.00
750 GB       SATA         26              4                        15                        2.00
1 TB         SATA         19              4                        12                        1.92
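Reading the numbers back, the last column of each half of the table appears to be (spindle count + parity disks) divided by the RAID group size. That relationship is inferred from the figures, not stated in the post, and `resultant_raid_groups` below is just an illustrative name for it.

```python
def resultant_raid_groups(spindle_count, parity_disks, rg_size):
    """Total disks (data + parity) divided by the RAID group size.

    Assumption: this is how the "Resultant RAID Groups" column was derived;
    the formula is inferred from the posted numbers.
    """
    return (spindle_count + parity_disks) / rg_size


# (label, spindle count, parity disks, RAID group size) taken from the table
rows = [
    ("300 GB FC/SAS, 7.2 and earlier", 51, 8, 15),
    ("300 GB FC/SAS, 7.3", 61, 10, 15),
    ("500 GB SATA, 7.3", 39, 6, 15),
    ("1 TB SATA, 7.3", 19, 4, 12),
]

for label, spindles, parity, rg_size in rows:
    groups = resultant_raid_groups(spindles, parity, rg_size)
    print(f"{label}: {groups:.3f} RAID groups")
```

A whole-number result means the disks fill complete RAID groups; a fractional part shows how full the last group would be.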

chriskranz
20,272 Views

Sorry, the table gets a bit messed up as a post, I'll attach the excel spreadsheet instead...

kusek
20,272 Views

I like the consolidated table Chris,

The only modifications I would look at are changing the optimal RG size for 320 GB SATA pre-7.3 to 15 disks, and the optimal RG size for 144 GB disks pre-7.3 to 15 as well.

Those would allow for the maximum capacity and, by the same token, spread IOPS as well.

I'm curious how you got some of the numbers for your table (like the spindle count column), because they don't entirely match up with what I'm working with, which slightly skews the results. Also, you need to account for the different maximum capacities of 15k disks versus 10k disks (which at the smaller disk sizes makes it less of an apples-to-apples comparison). Otherwise the table itself seems pretty nice!

Thanks for your attention on this Chris!

Christopher

chriskranz
20,272 Views

Yeah sorry the table isn't perfect, I just took the one posted before and quickly did some maths. I reckon you could build some fancy excel table to do it a lot better. And you're right, I definitely didn't take into account any limitations, just took the table from before.

It would be nice to see something official like this from NetApp, just a quick cheat sheet.

kusek
20,273 Views

Chris,

Personally, I'm trying to make sure collateral like that link and the table I referenced before can be found more easily.

I know the data is out there; it's just a matter of pulling away the tangles and the brush to find it.

I'll keep you and everyone informed!

Christopher

amiller_1
18,277 Views

Hmm... the table makes complete sense, but I must admit I'm a bit confused. I'm looking at FilerView on a 3050 right now where I set up a 39-disk aggregate of 500 GB drives on 7.2.2 with a total usable capacity of 12 TB.

If I understand the discussion correctly, putting 39 500 GB disks in one aggregate is what I can now do under 7.3... but I did it under 7.2.2.

I'd originally thought that 7.3 adjusted aggregate maximum sizing so the maximum was calculated against usable space rather than raw space, meaning I could put more like ~45 500 GB disks into a single aggregate (say, three 14-disk RAID groups of 500 GB drives in a single aggregate).

Thoughts? Thanks.

emansourtpg
18,257 Views

Is there any reason to make a single aggregate instead of multiple aggregates? Let's say the RAID group size is 16.

So what's the difference if you have 32 disks in 1 aggr or 2 x 16 in 2 aggrs?

amiller_1
15,695 Views

Performance.

If you have an aggregate with 32 disks (albeit in (2) 16 disk RAID groups), your writes (and possibly reads) will be spread across many more spindles.

Not using large aggregates actually defeats one of the primary NetApp benefits and gets you back to more storage management, a la EMC.

Having large pools of disk with good performance visibility (statit or Performance Advisor....goes down to the aggregate/volume/disk) as well as prioritization (FlexShare) is a good bit of the NetApp magic sauce.

emansourtpg
15,695 Views

Thank you for the reply. I guess I still don't understand.

If you have 2 x 16-disk RAID groups in 1 aggr, that means each write will only use 16 disks, not the full 32 of the aggr.

What am I missing?

amiller_1
16,371 Views

If you just have a single write, sure (a single write will actually just go to a single disk not even a single RAID group). But....all writes get cached in NVRAM and then flushed to disk (i.e. as many writes as all the disks can handle). So....the more disks you have (even if in multiple RAID groups), the faster your writes are as they do get spread across all the disks.

Similarly, there can be more reads as well for the same reason essentially.

mheimberg
15,821 Views

kusek wrote:

So, by choosing a 15disk raid-group, I'm assuring myself not only maximum efficient RG design, I'm also committing to the maximum amount of space.

Also, with his third disk set sitting at 9 disks (7), it is usually recommended that your smallest RG be at least half the size of the RG size itself; by having 7 (9) disks there, we meet that criterion.

For the sake of completeness: I think no one has asked yet for a RG size of 28 (or 27).

Today there are 3 shelves. Setting the RG size to 27, one could form one large RG0 on the first 2 shelves (27 disks, one spare); the 3rd shelf would form RG1 with 13 disks and 1 spare.

This would gain maximum capacity by avoiding 2 parity disks, and performance should be fine (RG1 is half the size of RG0, i.e. of the RG size).

Downside: reconstruction takes longer. What else?

When expanding with a 4th shelf, it is added to RG1, giving an aggregate formed by 54 disks (2x27), which is beyond the disk/aggregate size limit.

Missed something?

Mark
