ONTAP Hardware

Raid group size recommendation

janjacobbakker
62,448 Views

Hi Folks,

I'm trying to design a storage solution.

A FAS3020 with 3 shelves, fully populated with 42x 300GB FC disks.

By default the RAID group size is 16 (14 data + 2 parity); the maximum RAID group size is 28.

Does anyone have some best practice information? The NOW site doesn't have much info on that.

Kind regards

71 REPLIES

__WillGreen_13030
12,789 Views

I am considering using a 2050 with ONTAP 7.3 on a project. I am thinking of having 20 x internal 300GB SAS disks + a tray of 14 x 300GB FC disks. This gives me a total of 34 x 300GB disks.

My questions concern how these should be divided. Do I want a small aggregate for root and one aggregate of two RAID groups for my main storage? Does the fact that some of the disks are SAS and some FC make any difference? How many disks should be hot standbys? What would the estimated usable space of the configuration be (without a reserve for snapshots)?

Finally, are there any size limitations on an active-active configuration? The clusterd.pdf (linked to from this page on netapp.com) refers to size limitations on clusters, but the models it talks about seem to be old.

Apologies for tacking this question on to this existing thread, but it seemed pretty close to my situation in size, if a little different in hardware.

amiller_1
12,789 Views

There aren't any aggregate size limitations on the 2050 (only on the 2020 pre-7.3.1). But.....the 2050 with internal disk is a bit of an odd beast....multiple factors to consider here.

-You can put FC & SAS disk into the same aggregate (and raid group even)....but if you do so, you can't put the FC tray onto another filer easily (i.e. future upgrade).

-Although FC & SAS can go into the same aggregate/raid group, the spares are separate.

-I generally like having 2 hot spares of each kind of disk type as that enables "Disk Maintenance Center".

So....what I'd probably do with a FAS2050A (i.e. active-active controllers) would be....

-1 2050 head controlling the external FC tray (i.e. 14 disks) via software disk ownership

-- 12-disk aggregate with one RAID group for 2.26 TB usable (no snapshot reserve, aggr snap reserve at 3%)

-- 2 spares

-- root volume as a 20 GB FlexVol

-- rest of space for whichever volumes you like

-1 2050 head controlling the internal SAS drives (i.e. 20 disks) via software ownership

-- 18-disk aggregate with one RAID group for 3.62 TB usable (no snapshot reserve, aggr snap reserve at 3%)

-- 2 spares

-- root volume as a 20 GB FlexVol

-- rest of space for whichever volumes you like

This gets you the ability to move the external FC to another filer head easily in the future as well as maximizes space/speed by keeping the # of aggregates to the minimum (more spindles = good).
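For reference, here's a rough 7-mode CLI sketch of the FC-tray side of that layout (just a sketch; the disk names are hypothetical placeholders and the command list is abbreviated, so adjust for your own shelves and ownership):

disk assign 0a.16 0a.17 -o head1 (assign the 14 external FC disks to head 1 via software ownership; list all 14 disk names, shown abbreviated here)

aggr create aggr_fc -r 12 12 (12-disk aggregate, one RAID-DP group of 10 data + 2 parity, leaving 2 FC spares)

snap reserve -A aggr_fc 3 (3% aggregate snapshot reserve)

vol create vol_data aggr_fc 500g (carve FlexVols out of the remaining space as needed)

The SAS side on the other head would look much the same with an 18-disk aggregate (aggr create aggr_sas -r 18 18). Worth double-checking the aggr and disk man pages on your Data ONTAP release before running any of this.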

Since you'll only have one external FC loop, you'll have a spare FC port on each 2050 (presuming no FC clients) so you could even do MPHA as well. That clusterd.pdf is incredibly old....stops at the FAS9xx line which was dropped 3 years back IIRC.

And....general forum etiquette would be that sticking this in a new thread would be best. (You could put a short reply in this thread with a link asking people to go check out your new thread for instance.)


__WillGreen_13030
12,789 Views

Thanks for the very fast reply. Is MPHA multi-path high-availability? Each FC disk connected by two loops?

Edit: I've just found http://partners.netapp.com/go/techontap/matl/storage_resiliency.html which explains this.

PS. I was torn between starting a new thread and adding to this one. In the end I chose here because I thought people wouldn't want to have hundreds of similar RAID group size questions on the forums. I'll be sure to start a new thread in future.

amiller_1
12,789 Views

Thanks for the very fast reply. Is MPHA multi-path high-availability? Each FC disk connected by two loops?

Edit: I've just found http://partners.netapp.com/go/techontap/matl/storage_resiliency.html which explains this.

Quite welcome....and precisely right (as you found). I just really like MPHA for anything past a very small configuration....doesn't cost all that much if the FC ports are available. That article also confirms the 2 disk spare thing as well.

PS. I was torn between starting a new thread and adding to this one. In the end I chose here because I thought people wouldn't want to have hundreds of similar RAID group size questions on the forums. I'll be sure to start a new thread in future.

Not really an exact science...just what I've picked up over the last 10 years or so online. It also makes it easier for people searching to find specific pieces when they're split out logically into separate threads (rather than a huge monster thread).

And....I don't know that my configuration is what everyone would do (although I like it ;). I'm curious to hear how others might set it up.

__WillGreen_13030
12,789 Views

It dawned on me today while considering designs that you had created two aggregates. Is there a particular reason for this? I would have thought it would be simpler and more flexible to have one. Is it because with two controllers I need two aggregates to use both, or is it that mixing the internal SAS and FC disks in one aggregate would reduce performance?

Thanks again,

Will

amiller_1
12,789 Views

Quite intentional because....

-you need at least 2 RAID groups (since the FC/SAS max RAID group size is 28 disks). They could be in the same aggregate but....

-if you have a 2050A (i.e. active-active cluster), you need two aggregates (one for each head)

-if you just have a 2050 (single head), having 2 aggregates makes it easier to upgrade it to a 2050A later

-by keeping the aggregates separate between the SAS and the FC disk, it makes it easy in the future to move the FC-based aggregate to another storage head (since with NetApp people often keep disk for 6+ years, the disk usually lives through a head upgrade or two).

If you just have a single 2050, you could do one large aggregate.....but you'd have to add disk if you made it a 2050A later (or destroy the aggregate to split it up)

__WillGreen_13030
12,789 Views

Very clear answer.

Thanks again,

Will

arthursc0
9,879 Views

I appreciate all the answers posted here. However, without sounding like an idiot in need of a guide, surely there must be a calculator that you can use to size your aggr.

If someone has a spreadsheet that will automatically calculate rg size etc that would be great.

All I want to do is avoid getting myself into the position where I have an RG that has 3 disks in it (1p, 1dp and 1 data).

Regards

Colin.

arthursc0
10,075 Views

OK, so how would I chop this up?

The filer only has 500GB SATA drives in it.

I have 50 spares.

As far as I can see I would create an aggr of 39 disks (max for SATA):

aggr create aggrname 39

What command would I use to specify the RG size, or should I let ONTAP define this?

regards

Colin.

radek_kubka
10,353 Views

The default RAID group size for SATA is 14, so not optimal in your case.

Actually, Scott already provided the CLI syntax for a custom RG size in this thread, a few posts above:

http://communities.netapp.com/message/4148#4148

Regards,
Radek

redtail
10,353 Views

As you are creating 3 x 13-disk RAID groups for your aggr of 39 disks, you will need to:

aggr create aggrname 13 (for your first RAID group and aggr creation)

aggr add aggrname -g new 13 (to add an additional 13-disk RAID group to the same aggr created above; repeat for the 3rd RAID group)
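(A side note, assuming a 7-mode filer: you can also set the RAID group size explicitly when you create the aggregate with the -r option, e.g.

aggr create aggrname -r 13 39

which builds the same three 13-disk RAID groups in one step and records 13 as the aggregate's raidsize for any future expansion. Worth checking the aggr man page on your Data ONTAP release before relying on it.)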

Regards,

Hong Wei

infinitiguy
10,545 Views

Are there any documents that talk about RAID group size with 64-bit aggregates and ONTAP 8?

We have a pair of 3160s running 7.3.4 that I'm planning on upgrading to ONTAP 8 and using 64-bit aggregates. There is nothing on these filers now, so I have no data to maintain or worry about during the upgrade process.

We have 5 shelves x 14 disks of 750GB SATA and 3 shelves of 1TB SATA. We're going to grow the 5 shelves to 6 and the 3 shelves to 6 throughout this upgrade process (once data starts moving).

What kind of limits will I be looking at, and with the much higher aggregate ceiling, what does that do to the recommended/suggested RAID group sizes?

Also, I'm pretty new to NetApp in terms of doing disk layout/aggregate creation. Do the RAID groups determine parity disk count (each RAID group has 2 parity)? What about spare disks: can they be added to any RAID group in the loop?

jwhite
10,546 Views

Hello,

There has been a lot of work put into aggregate and RG sizing recommendations at NetApp.  The documents that cover this information are currently NDA --- if you are covered by an NDA with NetApp you can request these documents from your account team (reference TR3838, the Storage Subsystem Configuration Guide or the Storage Subsystem Technical FAQ --- both cover the RAID group sizing policy).  For those who are not covered by NDA with NetApp --- since the RG sizing policy itself is not confidential I will paste the text below (from the FAQ --- hence the question/answer format).  This is the official NetApp position on RG sizing.

<policy>

SHOULD ALL AGGREGATES ALWAYS USE THE DEFAULT RAID GROUP SIZE?

The previous approach to RAID group and aggregate sizing was to use the default RAID group size. This no longer applies, because the breadth of storage configurations being addressed by NetApp products is more comprehensive than when the original sizing approach was determined. Sizing was also not such a big problem with only 32-bit aggregates, which are limited to 16TB. You can fit only so many drives and RAID groups into 16TB. The introduction of 64-bit aggregates delivers the capability for aggregates to contain a great number of drives and many more RAID groups than was possible before. This compounds the opportunity for future expansion as new versions of Data ONTAP® support larger and larger aggregates.

Aggregates do a very good job of masking the traditional performance implications that are associated with RAID group size. The primary point of this policy is not bound to performance concerns but rather to establishing a consistent approach to aggregate and RAID group sizing that:

  • Facilitates ease of aggregate and RAID group expansion
  • Establishes consistency across the RAID groups in the aggregate
  • Reduces parity tax to help maximize “usable” storage
  • Reduces CPU overhead associated with implementing additional RAID groups that might not be necessary
  • Considers both the time it takes to complete corrective actions and how that relates to actual reliability data available for the drives

These recommendations apply to aggregate and RAID group sizing for RAID-DP®. RAID-DP is the recommended RAID type to use for all NetApp storage configurations. In Data ONTAP 8.0.1, the maximum SATA RAID group size for RAID-DP has increased from 16 to 20.

For HDD (SATA, FC, and SAS) the recommended sizing approach is to establish a RAID group size that is within the range of 12 (10+2) to 20 (18+2) and that achieves an even RAID group layout (all RAID groups contain the same number of drives). If multiple RAID group sizes achieve an even RAID group layout, NetApp recommends using the higher RAID group size value within the range. If drive deficiencies are unavoidable, as is sometimes the case, NetApp recommends that the aggregate should not be deficient by more than a number of drives equal to one less than the number of RAID groups. Otherwise you would just pick the next lowest RAID group size. Drive deficiencies should be distributed across RAID groups so that no single RAID group is deficient by more than a single drive.

Given the added reliability of SAS and Fibre Channel (FC) drives, it might sometimes be justified to use a RAID group size that is as large as 24 (22+2) if this aligns better with physical drive count and storage shelf layout.

SSD is slightly different. The default RAID group size for SSD is 23 (21+2), and the maximum size is 28. For SSD aggregates and RAID groups, NetApp recommends using the largest RAID group size in the range of 20 (18+2) to 28 (26+2) that affords the most even RAID group layout, as with the HDD sizing approach.

</policy>
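To make the arithmetic concrete (my own worked example, not part of the FAQ text): take the 42 x 300GB FC configuration from the start of this thread. Keeping 2 hot spares leaves 40 drives for the aggregate. Within the 12 to 20 range, only a RAID group size of 20 divides 40 evenly (2 x 20; 40/16 = 2.5, for instance), so the policy points at two RAID groups of 20 (18 data + 2 parity each) with zero drive deficiency.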

In addition to the policy, we publish tables for maximum size aggregates (which are also not confidential but are in the NDA docs):

<64-bit aggregates>

HOW MANY DRIVES CAN BE USED IN A MAXIMUM SIZE 64-BIT AGGREGATE?

64-bit aggregates are supported with Data ONTAP 8.0 and later. Each platform has different maximum aggregate capacities for 64-bit aggregates. The following recommendations are based on attempting to provide the optimal RAID group layout, as explained in the answer to the question “Should all aggregates always use the default RAID group size?” earlier.

The column descriptions for the following tables are as follows:

  • “Data Drives” is the number of data drives that fit within the maximum aggregate capacity (based on usable drive capacity).
  • “RG Size” is the recommended RAID group size to use for the configuration.
  • “Number of RGs” is the resulting number of RAID groups the aggregate will contain.
  • “Drive Def.” is the number of drives the configuration is deficient from achieving an even RAID group layout.
  • “Data + Parity” is the total number of drives used for the aggregate configuration.

The first value in each cell of the following tables shows the recommendation for aggregate configurations that use the maximum number of data drives. In many cases it is better to reduce data drives by a small number in order to achieve a better RAID group layout, as indicated by the values in parentheses.

64-Bit Aggregate Recommendations for FAS2040

Data ONTAP 8.0.x Maximum Aggregate Capacity 30TB

Capacity | Type | Data Drives | RG Size | Number of RGs | Drive Def. | Data + Parity
100GB | SSD | 86 (84) | 24 (23) | 4 (4) | 2 (0) | 94 (92)
300GB | FC | 115 (112) | 15 (18) | 9 (7) | 2 (0) | 133 (126)
450GB | FC | 75 (75) | 17 (17) | 5 (5) | 0 (0) | 85 (85)
600GB | FC | 56 (54) | 16 (20) | 4 (3) | 0 (0) | 64 (60)
300GB | SAS | 115 (112) | 15 (18) | 9 (7) | 2 (0) | 133 (126)
450GB | SAS | 75 (75) | 17 (17) | 5 (5) | 0 (0) | 85 (85)
600GB | SAS | 56 (54) | 16 (20) | 4 (3) | 0 (0) | 64 (60)
500GB | SATA | 74 (72) | 17 (20) | 5 (4) | 1 (0) | 84 (80)
1TB | SATA | 37 (36) | 15 (20) | 3 (2) | 2 (0) | 43 (40)
2TB | SATA | 18 (18) | 20 (20) | 1 (1) | 0 (0) | 20 (20)

64-Bit Aggregate Recommendations for FAS/V3040, 3140, 3070, 3160, 3210, and 3240

Data ONTAP 8.0.x Maximum Aggregate Capacity 50TB

Capacity | Type | Data Drives | RG Size | Number of RGs | Drive Def. | Data + Parity
100GB | SSD | 86 (84) | 24 (23) | 4 (4) | 2 (0) | 94 (92)
300GB | FC | 192 (192) | 18 (18) | 12 (12) | 0 (0) | 216 (216)
450GB | FC | 125 (119) | 20 (19) | 7 (7) | 1 (0) | 139 (133)
600GB | FC | 93 (90) | 18 (20) | 6 (5) | 3 (0) | 105 (100)
300GB | SAS | 192 (192) | 18 (18) | 12 (12) | 0 (0) | 216 (216)
450GB | SAS | 125 (119) | 20 (19) | 7 (7) | 1 (0) | 139 (133)
600GB | SAS | 93 (90) | 18 (20) | 6 (5) | 3 (0) | 105 (100)
500GB | SATA | 123 (119) | 20 (19) | 7 (7) | 3 (0) | 137 (133)
1TB | SATA | 61 (60) | 18 (17) | 4 (4) | 3 (0) | 69 (68)
2TB | SATA | 30 (30) | 17 (17) | 2 (2) | 0 (0) | 34 (34)

64-Bit Aggregate Recommendations for FAS/V3170, 3270, 6030, 6040, and 6210

Data ONTAP 8.0.x Maximum Aggregate Capacity 70TB

Capacity | Type | Data Drives | RG Size | Number of RGs | Drive Def. | Data + Parity
100GB | SSD | 86 (84) | 24 (23) | 4 (4) | 2 (0) | 94 (92)
300GB | FC | 269 (255) | 20 (19) | 15 (15) | 1 (0) | 299 (285)
450GB | FC | 175 (170) | 18 (19) | 11 (10) | 1 (0) | 197 (190)
600GB | FC | 131 (126) | 14 (20) | 11 (7) | 1 (0) | 153 (140)
300GB | SAS | 269 (255) | 20 (19) | 15 (15) | 1 (0) | 299 (285)
450GB | SAS | 175 (170) | 18 (19) | 11 (10) | 1 (0) | 197 (190)
600GB | SAS | 131 (126) | 14 (20) | 11 (7) | 1 (0) | 153 (140)
500GB | SATA | 173 (170) | 18 (19) | 11 (10) | 3 (0) | 195 (190)
1TB | SATA | 86 (85) | 20 (19) | 5 (5) | 4 (0) | 96 (95)
2TB | SATA | 43 (36) | 13 (20) | 4 (2) | 1 (0) | 51 (40)

64-Bit Aggregate Recommendations for FAS/V6070, 6080, 6240, and 6280

Data ONTAP 8.0.x Maximum Aggregate Capacity 100TB

Capacity | Type | Data Drives | RG Size | Number of RGs | Drive Def. | Data + Parity
100GB | SSD | 86 (84) | 24 (23) | 4 (4) | 2 (0) | 94 (92)
300GB | FC | 385 (384) | 13 (18) | 35 (24) | 0 (0) | 455 (432)
450GB | FC | 250 (240) | 12 (18) | 25 (15) | 0 (0) | 300 (270)
600GB | FC | 187 (180) | 19 (20) | 11 (10) | 0 (0) | 209 (200)
300GB | SAS | 385 (384) | 13 (18) | 35 (24) | 0 (0) | 455 (432)
450GB | SAS | 250 (240) | 12 (18) | 25 (15) | 0 (0) | 300 (270)
600GB | SAS | 187 (180) | 19 (20) | 11 (10) | 0 (0) | 209 (200)
500GB | SATA | 247 (240) | 15 (18) | 19 (15) | 0 (0) | 285 (270)
1TB | SATA | 123 (119) | 20 (19) | 7 (7) | 3 (0) | 137 (133)
2TB | SATA | 61 (60) | 18 (17) | 4 (4) | 3 (0) | 69 (68)

Notes for the preceding tables:

  • 100GB SSD capacity first supported with Data ONTAP 8.0.1

  • 600GB FC/SAS capacity first supported with Data ONTAP 7.3.2 and 8.0 RC3/GA
  • 2TB SATA capacity first supported with Data ONTAP 7.3.2 and 8.0 RC3/GA

Note that the capacity points for 600GB FC/SAS and 2TB SATA are supported in Data ONTAP 7.3.2; however, 64-bit aggregates are supported only in Data ONTAP 8.0 and later.

</64-bit aggregates>

Again, the two copied and pasted sections above are not confidential --- although the docs they are contained within are NDA required (for other information that is contained within).  For aggregates that are not maximum size you can figure this out based on using the sizing policy.  If you have resiliency concerns, that is factored into the recommendations --- more can be read (publicly) at http://www.netapp.com/us/library/technical-reports/tr-3437.html.  TR3437 is the Storage Subsystem Resiliency Guide (updated a couple of weeks ago) and has information that will help explain some of the background here.

And lastly, for 32-bit aggregates:

<32-bit aggregates>

HOW MANY DRIVES CAN BE USED IN A MAXIMUM SIZE 32-BIT AGGREGATE?

In Data ONTAP 7.2.x and earlier, parity drives and physical drive size are included in the 16TB limit for 32-bit aggregates.

In Data ONTAP 7.3.x and 8.0.x, only data drives and usable drive capacity are included in the 16TB limit for 32-bit aggregates. The following recommendations are based on attempting to provide the most optimal RAID group layout, as explained in the answer to the question “Should all aggregates always use the default RAID group size?” earlier.

The column descriptions for the following tables are as follows:

  • “Data Drives” is the number of data drives that fit within the maximum aggregate capacity (based on usable drive capacity).
  • “RG Size” is the recommended RAID group size to use for the configuration.
  • “Number of RGs” is the resulting number of RAID groups the aggregate will contain.
  • “Drive Def.” is the number of drives the configuration is deficient from a fully even RAID group layout.
  • “Data + Parity” is the total number of drives used for the aggregate configuration.

The first value in each cell of the following table shows the recommendation for aggregate configurations that use the maximum number of data drives. In many cases it is better to reduce data drives by a small number in order to achieve a better RAID group layout, as indicated by the values in parentheses.

32-Bit Aggregate Recommendations for All Platforms

Data ONTAP 7.3.x and 8.0.x

Capacity | Type | Data Drives | RG Size | Number of RGs | Drive Def. | Data + Parity
100GB | SSD | 86 (84) | 24 (23) | 4 (4) | 2 (0) | 94 (92)
300GB | FC | 61 (60) | 18 (17) | 4 (4) | 3 (0) | 69 (68)
450GB | FC | 40 (40) | 12 (12) | 4 (4) | 0 (0) | 48 (48)
600GB | FC | 29 (28) | 17 (16) | 2 (2) | 1 (0) | 33 (32)
300GB | SAS | 61 (60) | 18 (17) | 4 (4) | 3 (0) | 69 (68)
450GB | SAS | 40 (40) | 12 (12) | 4 (4) | 0 (0) | 48 (48)
600GB | SAS | 29 (28) | 17 (16) | 2 (2) | 1 (0) | 33 (32)
500GB | SATA | 39 (39) | 15 (15) | 3 (3) | 0 (0) | 45 (45)
1TB | SATA | 19 (18) | 12 (20) | 2 (1) | 1 (0) | 23 (20)
2TB | SATA | 9 (9) | 11 (11) | 1 (1) | 0 (0) | 11 (11)

The preceding table applies to all platforms, with one exception regarding the FAS2020 platform. In Data ONTAP 7.3.1 and later, the maximum aggregate capacity for the FAS2020 is 16TB, just as with other platforms. In Data ONTAP 7.3, the actual aggregate size limit for the FAS2020 is 6TB, counting only data drives. Finally, in Data ONTAP 7.2 and earlier, the limit for the FAS2020 is 7TB, counting both data and parity drives. NetApp highly recommends that all FAS2020 systems use Data ONTAP 7.3.1 or later in order to avoid confusion when managing your storage configuration.

Notes for the preceding table:

  • 100GB SSD capacity first supported with Data ONTAP 8.0.1
  • 450GB FC/SAS capacity first supported with Data ONTAP 7.2.5.1
  • 600GB FC/SAS capacity first supported with Data ONTAP 7.3.2 and 8.0 RC3/GA
  • 1TB SATA capacity first supported with Data ONTAP 7.2.3
  • 2TB SATA capacity first supported with Data ONTAP 7.3.2 and 8.0 RC3/GA

</32-bit aggregates>

Note that when figuring out how many drives you can use for your aggregates you need to reduce the total drive count to the number of drives that are available for the aggregate (minus dedicated root aggregate drives and hot spares).
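For example (my own illustration, not from the FAQ): with 42 drives total, keeping 2 hot spares and a 3-drive dedicated root aggregate leaves 37 drives available for data aggregates, and it is that 37 you would size your RAID groups against.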

Hopefully this helps.

Regards,

Jay White

Technical Marketing Engineer

Storage, RAID, and System Resiliency

SMNKRISTAPS
9,076 Views

Hello,

Is there an updated table for year 2013?

mheimberg
9,076 Views

Year 3013? You're looking a long way forward ...

If you are a partner employee you may download the latest "Technical FAQ: Storage Subsystem" from fieldportal.netapp.com, with all the new drives and sizes.

Markus

ERIC_TSYS
10,543 Views

Slightly off topic but nevertheless important: I have read in this thread, and heard in several conversations with techies, recommendations to end up with 1 spare disk whilst using RAID-DP.

I don't get it: why cater for RAID-DP if you're not going to cater for enough spares?

I have seen/heard this statement so many times now that I am concerned I've missed out on something. Can someone explain the logic please?

Cheers,

Eric

paulstringfellow
10,543 Views

Hi Eric,

What are you not sure about…

“Good practice” is 2 spare drives, per drive type, per controller…however, technically it's not a necessity… the smaller the setup, the more this practice gets compromised in favour of capacity over “ultra” resilience…

But if you’re not sure drop us a note back and I’ll advise if I can…

NETAPPCYP
10,543 Views

Hi Eric,

The idea is that the system has NBD or 4-hour parts delivery, so in the event that a second disk fails, a spare disk will have arrived at your door.

Best practice is to have 2 or more spares, but for smaller systems and clients they generally leave 1 spare, to get the most usable storage from the disks they have available without compromising data protection.

Just a question: are you based at TSYS in Cyprus?

ERIC_TSYS
10,543 Views

hi Paul,

I just struggle with the logic of having 2 parity disks and not having enough disks to cater for the protection you've put in place.

I know that for smaller systems it may come to this but in this case there are heaps of disks to allow for 2 spares.

I agree that 4 hours or NBD CAN be good enough, but if a disk failed I'd like to have it rebuilt and still have 1 spare disk whilst the new disk is on its way. Too many times I've seen disks fail out of hours, delayed deliveries, disks fail on weekends when nobody is there to fix the issue, etc. In cases like these you're exposed with 1 spare disk.

Cheers,

Eric, based in the UK. I wish I was in Cyprus!

radek_kubka
10,543 Views
I just struggle with the logic of having 2 parity disks and not having enough disks to cater for the protection you've put in place.

Well, you can reverse the logic as well: why 'lose' 2 disks for parity & then another 2 for spares (per controller, per disk type)? What's the likelihood of losing simultaneously 2 drives in the aggregate & one hot spare?

As mentioned already in this thread, 2 hot spares are required for the so-called Maintenance Center (http://media.netapp.com/documents/tr-3437.pdf, p10), but on smaller systems it may be deemed a luxury.

Regards,

Radek

ERIC_TSYS
9,997 Views

I agree it can be seen as a luxury on a smaller system, but then again, smaller system or not: I'd suggest to you that the data is what is important, not what system it sits on.

Your most important data could be sitting on a 10GB file system rather than a 10TB file system. Hence which FAS model the data is sitting on is a bit of a moot point for me.

Also, I don't see RAID protection as a cost; it's an investment against loss of your most important asset.

Lastly, going back to my example: a disk fails on a Friday night, you've got 1 spare disk, it rebuilds. Now you have no spare over the weekend. How does that sit with you?

I know where it sits with me; for the sake of 1 disk it's not worth the risk.

I guess from your answers it's a matter of opinion/risk assessment rather than a technical issue. That's fine, no worries.

Cheers,

Eric
