Raid group config for aggregate

JSTANDER79 · ‎2013-03-26

Hi guys

I know this topic has been done to death, but I just added my first add-on disk shelf to our current FAS2240 and have some questions and concerns that I hope you can help me with, or at least confirm my findings.

I have an aggregate with RAID-DP and a RAID group size of 16, which NetApp sets by default and recommends.

This worked fine when we split the 24 drives between each aggr on each controller, which created the following:

Raid group for aggr (1 spares): 2 x parity + 8 x data

Now we have added an extra 12 drives to this aggr, which created 2 x Raid groups:

Raid group 1: 2 x parity + 14 x data

Raid group 2: 2 x parity + 4 x data

This is of course not optimal, so how can I fix this, since we will not be purchasing any new shelves soon?

For the sake of my own sanity I want to confirm my findings and see what you think.

1. Leave aggregate as is?

2. Change the Raid group to 11 disks?

2 x parity + 9 x data = 2 Raid groups (this way it will spread the data disks evenly between the RAID groups, and when I add another shelf with 12 disks it will create another Raid group with a similar disk layout).

Does this require a reboot of the controller the aggregate resides on?

3. Increase original raid Group to 22 disks?

Not sure this is possible since the 2nd raid group has already been created, and I don't think it is possible to move drives from raid group 2 to raid group 1.

Lastly, our 2 x spare drives currently sit only on the primary shelf and we don't have any on the new shelf.

Would this be an issue if a disk fails on the 2nd shelf?

If it is an issue, how can I change this to have a single spare drive on each shelf when I have already added all the drives on the 2nd shelf to the aggregate?

Thanks in advance

Johann

billshaffer · ‎2013-03-26

If you've already added the drives to the aggregate to get the two raid groups (14+2 and 4+2), there's not much you can do in place. You can't redo the aggregate without destroying it - so the only way to fix the layout is to migrate the data off and back again, if you've got enough temp space and the means to do two migrations.

If, before you added the new drives, you had set the RG size to 11, you would have two 9+2 groups with the same spares you had before, and then future 12-disk additions would each create a new RG plus a spare. But if you change the RG size to 11 now, you'll still have the 14+2 and the 4+2, and the next 12-disk expansion will give you a 14+2, a 9+2, and 7 spares (or a 5+2). This is like your #2, but you can't change the current layout without migrating the data off.
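The arithmetic in the paragraph above can be sketched in a few lines of Python. This is only a simplified model and not ONTAP's actual allocation logic: it assumes new disks first top up the last raid group to the current raidsize and then form full new groups, with anything left over returned for you to keep as spares or put into a smaller group. The names `add_disks` and `describe` are made up for illustration.

```python
PARITY = 2  # RAID-DP uses two parity disks per raid group


def add_disks(groups, new_disks, raidsize):
    """Add disks to a list of raid group sizes; return (groups, leftover).

    Simplified model (an assumption, not ONTAP's real allocator): only the
    last group is topped up to raidsize, then full new groups are created.
    Existing groups are never shrunk or rebalanced.
    """
    groups = list(groups)
    if groups and groups[-1] < raidsize:
        take = min(raidsize - groups[-1], new_disks)
        groups[-1] += take
        new_disks -= take
    while new_disks >= raidsize:
        groups.append(raidsize)
        new_disks -= raidsize
    return groups, new_disks


def describe(groups):
    """Format group sizes as data+parity, e.g. 16 -> '14+2'."""
    return [f"{size - PARITY}+{PARITY}" for size in groups]


# What happened: 8+2 group, raidsize 16, 12 new disks. The 6 leftover
# disks are the ones that ended up as the small 4+2 group.
print(add_disks([10], 12, 16))       # ([16], 6)

# raidsize set to 11 before adding the shelf: two even 9+2 groups.
print(add_disks([10], 12, 11))       # ([11, 11], 0)

# raidsize changed to 11 now, then another 12-disk shelf added:
# 14+2 stays, 4+2 grows to 9+2, 7 disks left (spares or a 5+2 group).
groups, leftover = add_disks([16, 6], 12, 11)
print(describe(groups), leftover)    # ['14+2', '9+2'] 7
```

The last case shows why changing the raidsize after the fact does not even out the existing groups: the oversized 14+2 group is left alone and only the small group grows.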

Spares do not have to be in the same shelf as disks that fail - they just have to be owned by the same head and in the same pool (if you use pools) - BUT, remember that you need spares on both heads.

Hope that helps....

Bill

JSTANDER79 · ‎2013-03-27

Thank you Bill.

Appreciate the help on this.

So with our current configuration and its unbalanced Raid groups, do you foresee any performance problems, since we have so few data drives in the 2nd RG? Or what other consequences can I expect from this?

I currently have 2 spares on each controller, meaning a total of 4 spares for the single SAS shelf since we have split the disks on shelf 00 between the 2 controllers equally. Is this overkill and can we just get away safely with a single spare on each controller and add an extra disk to each aggregate?

Cheers

Johann

billshaffer · ‎2013-03-27

I would not be surprised if you see some performance issues on the small RG, but there are many variables involved - usage profile, volume allocation, etc. That would be the only consequence that I can see. You may be able to mitigate this by running vol reallocate (which you should do initially anyways) frequently, but I have not had any experience there.

NetApp used to have a recommended # of spares based on how many data/parity drives you had, but I couldn't find that. Two spares per 18 drives per head doesn't strike me as overkill. You _could_ take that down to one spare each head...but frankly I'd be a bit uneasy if that were my environment. That being said, with one spare per head, you would be able to survive a 3 disk failure scenario in one raid group. The chances of that are pretty low. If you have 4 hour hardware support, it'd be something to consider, since the first drive failure will be spared out and replaced relatively quickly.

Bill

JSTANDER79 · ‎2013-03-27

Thanks.

Yip running the reallocate on volumes as we speak so will keep an eye on that.

I did ask the technician who installed the new shelves about the RG's and he said you cannot change the groups without a reboot. So he suggested we change the RG's drive amount right before we shutdown the FAS since we will be moving it within a month or two. Is this really the case and can we do this?

Appreciate the help and will just leave two spares, even though I still feel it might be a bit overkill given what you mentioned about drive failures, but since I run Exchange and SharePoint on this I would rather be safe than sorry.

Cheers

billshaffer · ‎2013-03-27

You can change this at any time - aggr options <aggr> raidsize <num>. It won't change anything current, but will take effect when you grow an aggr or create new ones. I've done this on the fly just before adding drives to an aggregate, and it took effect instantly (this was untold years ago on early ONTAP 6 (god, maybe even 5....)), but I can't believe they would make it need a reboot.

Bill

pedro_rocha · ‎2013-03-26

Hello Johann,

Agree with Bill, you do not have much to do now, unless you are able to migrate data. I would just tell you 2 more things:

1. Yes, I would work with 22 (20+2) for the RG size, considering you are working with SAS disks (if SATA, try to be under 20 (18+2)). This would give you more space considering the number of disks you have.
2. You can use "disk replace" to put one spare disk on the other shelf. Check its man page. But there is no actual benefit in doing that for your structure. It would be a cosmetic change.
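To make the "more space" point concrete, here is a rough Python sketch that counts data disks for the 22 disks in this aggregate under different RG sizes. It assumes RAID-DP (2 parity disks per group) and that all 22 disks are packed into as few groups as possible. It ignores the platform's maximum RG size per disk type and the minimum group size, so treat the numbers as illustration only. The function name `data_disks` is made up.

```python
import math

PARITY_PER_GROUP = 2  # RAID-DP


def data_disks(total_disks, raidsize):
    """Data disks left after each raid group gives up two disks to parity."""
    groups = math.ceil(total_disks / raidsize)
    return total_disks - groups * PARITY_PER_GROUP


for raidsize in (11, 16, 22):
    print(raidsize, data_disks(22, raidsize))
# 11 18
# 16 18
# 22 20
```

With 22 disks, an RG size of 11 or 16 both need two groups and so give 18 data disks, while an RG size of 22 needs only one group and gives 20.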

Regards,

Pedro Rocha.

JSTANDER79 · ‎2013-03-27

Thanks Pedro. Makes sense, and I will just leave the spares as they are since mine are currently owned by different controllers.