ONTAP Discussions

2 controllers, 4 shelves each, 24 SAS drives per shelf: how many aggregates should I have?

netappmagic
10,115 Views

Should I have one big aggregate on each controller containing all 4 x [23-24] drives with multiple raid groups, or multiple aggregates?

It is a 2x FAS3240 HA pair, 64-bit aggregates, ONTAP 8.2.x, used for NAS shares. Three drives, one from each of three shelves, have already been allocated to aggr0.

The drives are 600GB raw each.

Thanks for your advice.


41 REPLIES

HENRYPAN2
7,575 Views

Hi netappmagic,

You may choose a small AGGR with 3 drives for the Data ONTAP (DOT) root data, plus a big AGGR with (4 x 24) - 3 - 4 (spares) = 89 drives

Good luck

Henry

netappmagic
7,576 Views

Would you please expand on your idea regarding the pros and cons of one big AGGR vs. a few AGGRs?

Thank you!

HENRYPAN2
7,576 Views

Sure netappmagic,

Better performance.

Good luck

Henry

resqme914
10,063 Views

Assuming 7-mode...

I would have one big aggregate on each controller, which means I would expand aggr0 to become 4 raidgroups of rg size 23, raid-dp.  Reasons:

1.  I hate wasting 3 drives just for vol0, especially if the drives, which you didn't mention, are large-capacity (e.g. 900GB SAS).

2.  We use rg size 23 because we tend to add storage by shelves, and it's a lot simpler to grow the aggregates by entire shelves.  Plus the disk drives keep getting bigger and bigger, so it's just really easier to deal with entire shelves at a time.

3.  We like larger aggregates and fewer of them.  Lots of spindles for performance.  We like to have one SAS aggregate and one SATA aggregate.  You don't have to deal with an aggregate running out of space and having to migrate a volume to another aggregate, etc.

4.  Fewer wasted disk drives.

We had NetApp consultants here for three months and this was one of their recommendations.  It took me a lot of work merging aggregates (one filer pair went from about 20 aggregates or so, down to 4) and we're really happy we did this.
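
In 7-Mode CLI terms, that expansion would look roughly like the following on each controller. This is only a sketch: it assumes 4 shelves of 24 disks (96 per controller), the existing 3-disk aggr0, and a generic "filer>" prompt, so the disk counts are assumptions to adjust for your own spare policy.

filer> aggr options aggr0 raidsize 23    # set rg size 23 first; it cannot be decreased later
filer> aggr add aggr0 -g rg0 20          # grow the existing 3-disk rg0 to a full 23
filer> aggr add aggr0 69                 # adds three more 23-disk RAID-DP raid groups
filer> aggr status -r aggr0              # verify the raid group layout

That uses 92 of the 96 disks and leaves 4 as hot spares.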

netappmagic
7,575 Views

Thank you for your valuable input.

One more question. There will be about 8TB of space used for backups / dumping data. Should we separate this space and create a 2nd aggregate for it? The consideration is that backup data has lower performance requirements and should not compete for I/O with other data on the same aggregate. What is your opinion?

nigelg1965
7,664 Views

Hi

I think most NetApp users would favour resqme914's advice over Henry's.

My added advice would be: beyond the first 20 or so, only allocate the number of disks you need to an aggregate (ensuring it stays at most 80% used/allocated). Nothing's more annoying than having a half-empty aggregate on one head while being tight for space on the other.

netappmagic
7,664 Views

Hi nigelg1965,

I just wanted to make sure I understand you correctly. So, you share our concern that backup data could compete for I/O with primary data. Therefore, are you saying that we should create a 2nd aggregate (with 20 disks, as you suggested), that this aggregate would be used for the 8TB of backup data, and that it should not exceed 80% usage?

resqme914
7,664 Views

Ideally, you would've had a separate SATA aggregate for your dumps, backups, etc., but you don't, so...  if I were in your position, I would stick with the one large aggregate anyway: more spindles for all your workloads, I/O spread across all disks, and all the other benefits I've previously enumerated.

netappmagic
7,664 Views

Hi Resqme914, Thanks again for your messages.

The problem is that aggr0 has already been created, and it contains 3 disks in a RAID-DP group.

What should I do from here? Should I blow away this aggr0 and set up a new BIG aggr0? That would require redoing everything already on the current aggr0, including reinstalling ONTAP, wouldn't it? Or should I just leave aggr0 as it is and create a BIG aggr1 with the rest of the disks? Any suggestions?

resqme914
6,914 Views

I thought you were starting from scratch with a new filer pair.  If you haven't created aggregates yet, then you could just add disks to aggr0 and grow it.  I'd do it by hand (CLI) so you have full control over the raidgroups etc.  If you've already created other aggregates, then I'd recommend leaving aggr0 alone and having a big aggr1, and when you get better with Ontap, look into migrating the root vol, vol0, to aggr1 and then destroying aggr0 to free up the disks.  Or just leave aggr0 alone if you're not comfortable doing that.  If you do reclaim it, though, you'd be gaining 6 x 600GB drives.  Hope that helps.

netappmagic
6,914 Views

3 disks from 3 different shelves have already been set up as aggr0.

Based on what we have discussed so far, I am planning on setting up 5 RAID-DP raid groups, so there will be 10 parity + 3 hot spares + 80 data = 93 disks.

If I create one and only one BIG aggr1, merging aggr0 into aggr1, I will then gain 3 x 600GB. Why did you say that I'd be gaining 6 x 600GB?

resqme914
6,914 Views

I said if you got rid of your aggr0's, you'd be gaining 6x600GB disks because you have two filers and each one has an aggr0 composed of three disks...  3 disks x 2 filers = 6 disks.

netappmagic
6,914 Views

If I remove aggr0, I'd have to rebuild ONTAP, which sounds like a lot of work. So, for now, I'll probably leave aggr0 and create a BIG aggr1.

You have been very helpful. I appreciate all your messages.

resqme914
6,914 Views

No, please don't reinstall Ontap.  Like I said in a previous message... there is a way to move the root vol, vol0, to another aggregate.  Assuming you didn't create any other volume in aggr0, there should be just one volume there called vol0.  That is the system volume.  All you need to do is move that root volume to aggr1, then you should be able to destroy aggr0.  No need to reinstall Ontap.  If you search the NetApp Support site, you should be able to easily find the procedure to move the root volume.  But as I mentioned before, maybe you should wait a while until you are more knowledgeable with Ontap and will be comfortable with doing these steps.  These steps involve filer reboots.

Also, if you haven't created your big aggr1 yet... why not just add all the disks to aggr0 and make a big aggr0?  If you do that, make sure you get your raidgroups and rgsize correct, because it will be impossible to undo your aggr add.

Here is an old document I wrote up for other storage admins in my company.  This will give you an idea of how to move your root volume (your mileage may vary)...

Steps to Move Root Volume to a New Volume/Aggregate

  1. Created the new aggregate (called it aggr1).  I also did this already on the second filer.
  2. Created a 200GB volume on aggr1 called vol1.
  3. Set this volume restricted.
  4. Resized the current root volume vol0 to 200GB to match the new volume.
  5. Used vol copy to copy the current root to the new root.
  6. Online the new volume, vol1
  7. Renamed the current root, vol0, to old_vol0
  8. Renamed the new volume, vol1, to vol0
  9. Set the new volume as the root volume
  10. Reboot

Pain in the neck steps…

  1. Recreate new CIFS shares (C$, ETC$, HOME) using same characteristics as old shares
  2. Delete old CIFS shares
  3. Delete old root volume and aggregate once satisfied with results.
  4. Rename aggr1 to aggr0 (if desired).
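
For reference, steps 1 through 5 above would translate roughly into the following 7-Mode commands. This is only a sketch: "filer>" is a generic prompt, <ndisks> is a placeholder for your own disk count, the -r 23 raid group size follows the recommendation earlier in this thread, and the aggr1/vol1 names and 200GB size simply mirror the write-up above.

filer> aggr create aggr1 -B 64 -t raid_dp -r 23 <ndisks>   # step 1 (disk count is site-specific)
filer> vol create vol1 aggr1 200g                          # step 2
filer> vol restrict vol1                                   # step 3
filer> vol size vol0 200g                                  # step 4
filer> vol copy start vol0 vol1                            # step 5 (shown in the log below)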

Log follows...  (excluding reboot)

SAN-1ELS01> vol copy start vol0 vol1

This can be a long-running operation. Use Control - C (^C) to interrupt.

Copy Volume: vol0 on machine 127.0.0.1 to Volume: vol1

VOLCOPY: Starting on volume 1.

08:26:48 PDT : vol copy restore 0 : begun, 675 MB to be copied.

08:26:48 PDT : vol copy restore 0 : 5 % done. Estimate 1 minutes remaining.

08:26:50 PDT : vol copy restore 0 : 10 % done. Estimate 1 minutes remaining.

08:26:56 PDT : vol copy restore 0 : 15 % done. Estimate 1 minutes remaining.

08:27:02 PDT : vol copy restore 0 : 20 % done. Estimate 1 minutes remaining.

08:27:08 PDT : vol copy restore 0 : 25 % done. Estimate 2 minutes remaining.

08:27:14 PDT : vol copy restore 0 : 30 % done. Estimate 2 minutes remaining.

08:27:19 PDT : vol copy restore 0 : 35 % done. Estimate 1 minutes remaining.

08:27:19 PDT : vol copy restore 0 : 40 % done. Estimate 1 minutes remaining.

08:27:20 PDT : vol copy restore 0 : 45 % done. Estimate 1 minutes remaining.

08:27:21 PDT : vol copy restore 0 : 50 % done. Estimate 1 minutes remaining.

08:27:21 PDT : vol copy restore 0 : 55 % done. Estimate 1 minutes remaining.

08:27:21 PDT : vol copy restore 0 : 60 % done. Estimate 1 minutes remaining.

08:27:22 PDT : vol copy restore 0 : 65 % done. Estimate 1 minutes remaining.

08:27:22 PDT : vol copy restore 0 : 70 % done. Estimate 1 minutes remaining.

08:27:22 PDT : vol copy restore 0 : 75 % done. Estimate 1 minutes remaining.

08:27:23 PDT : vol copy restore 0 : 80 % done. Estimate 1 minutes remaining.

08:27:23 PDT : vol copy restore 0 : 85 % done. Estimate 1 minutes remaining.

08:27:24 PDT : vol copy restore 0 : 90 % done. Estimate 1 minutes remaining.

08:27:24 PDT : vol copy restore 0 : 95 % done. Estimate 1 minutes remaining.

08:27:25 PDT : vol copy restore 0 : 100% done, 675 MB copied.

SAN-1ELS01>

SAN-1ELS01> vol status

         Volume State           Status            Options

           vol0 online          raid_dp, flex     root

                                64-bit

           vol1 restricted      raid_dp, flex

                                64-bit

SAN-1ELS01> vol online vol1

Volume 'vol1' is now online.

SAN-1ELS01> vol rename vol0 old_vol0

Renaming volume vol0 (fsid 77c0510e) to old_vol0: start time 922405

Wed Jun  6 08:28:19 PDT [SAN-1ELS01:wafl.vvol.renamed:info]: Volume 'vol0' renamed to 'old_vol0'.

'vol0' renamed to 'old_vol0'

SAN-1ELS01> vol rename vol1 vol0

Renaming volume vol1 (fsid 42bd70f1) to vol0: start time 928432

Wed Jun  6 08:28:25 PDT [SAN-1ELS01:wafl.vvol.renamed:info]: Volume 'vol1' renamed to 'vol0'.

'vol1' renamed to 'vol0'

SAN-1ELS01> vol options vol0 root

Wed Jun  6 08:28:37 PDT [SAN-1ELS01:fmmb.lock.disk.remove:info]: Disk 0a.00.0 removed from local mailbox set.

Wed Jun  6 08:28:37 PDT [SAN-1ELS01:fmmb.lock.disk.remove:info]: Disk 0a.00.2 removed from local mailbox set.

Wed Jun  6 08:28:37 PDT [SAN-1ELS01:fmmb.current.lock.disk:info]: Disk 0b.01.0 is a local HA mailbox disk.

Wed Jun  6 08:28:37 PDT [SAN-1ELS01:fmmb.current.lock.disk:info]: Disk 0b.02.0 is a local HA mailbox disk.

Volume 'vol0' will become root at the next boot.

SAN-1ELS01>

netappmagic
6,825 Views

Thank you very much for sharing, resqme914!

netappmagic
6,379 Views

To complete this entire project:

I am seeking the commands to create a BIG 64-bit aggregate, including all unused disks on the filer, without specifying any disk names.

I have 4 shelves with 23-24 disks each; the aggregate will include 5 RAID-DP raid groups and 3 hot spares.

Also, after some tests, I would probably destroy the aggregate. Would this have any impact on my ability to re-create it?

Thank you!
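
A minimal sketch of what that could look like in 7-Mode, letting ONTAP pick the disks itself (the raid group size of 18 is only an inference from "5 RGs + 3 hot spares" out of ~93 disks, so adjust the numbers to your final layout):

filer> aggr create aggr1 -B 64 -t raid_dp -r 18 90   # 5 RAID-DP raid groups of 18; no disk names needed
filer> aggr offline aggr1                            # required before a destroy
filer> aggr destroy aggr1                            # disks go back to the spare pool

Destroying a test aggregate simply returns its disks as spares; they will be zeroed before they can be reused (or pre-zero them with "disk zero spares"), so re-creating it is fine, it just costs the zeroing time.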

netappmagic
6,468 Views

Hi resqme914 ,

sorry to bother you again on this topic.

Now, I have a total of 78 data SAS drives left. You mentioned that I can add all these drives into the existing aggr0, so I won't need to move vol0.

My question is: since there is already a raid0 in aggr0 with a size of 3 drives, how many RAID-DP groups do I need to create in aggr0, and how should I size them? Here are the steps I can think of; please correct me.

1.  Expand raid0 first by adding 18 more drives, so raid0 will include the root vol0. Then how do I determine which 18 of the 78 drives should be added?

2.  Create the remaining 3 RAID-DP groups with a size of 20 each.

Also, would a RAID-DP raid group size of 20 or 21 satisfy best practice?

Thank you, as always.

resqme914
6,469 Views

I don't have a clear picture of your filers so I have to go on some assumptions.  You mentioned "raid0" but I think you meant rg0 (raidgroup 0).  You also mentioned you have 78 drives you want to use to expand aggr0... do you have spare drives not counting those 78?  Assuming that you do want to use 78 drives to expand your 3-drive aggr0 (that makes a total of 81 drives), I would recommend that you have 4 raidgroups of 20 drives each, raid-dp.

So, I would recommend the following steps (assuming all your disks are the same size, type and speed):

1.  Change raidsize of aggr0 to 20.  Make sure this is absolutely what you want and not smaller.  You won't be able to decrease the aggregate's raidsize in the future.  CLI command would be:  filer> aggr options aggr0 raidsize 20

2.  Expand rg0 by adding 17 drives.  CLI command would be  filer> aggr add aggr0 -g rg0 17

3.  Add rest of the drives.  CLI command:   filer>  aggr add aggr0 60

You will have one spare drive left to add to your spares.

Feel free to ask more questions if you are unsure of anything.  It's better to ask for clarification than to have to fix a mistake that will probably be difficult to fix.
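
A quick way to sanity-check the result afterwards (a sketch; "filer>" is a generic prompt):

filer> aggr status -r aggr0   # should show four RAID-DP raid groups of 20 (rg0 through rg3)
filer> aggr status -s         # the leftover drive should appear in the spare list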


netappmagic
6,469 Views

Hi resqme914,

The reason I have 78 data disks left (plus 3 for spares) is that I would have to use 12 other disks for a 2nd aggregate. I know creating 2 aggregates is not an optimal approach. However, we have backup data that hurts the filer's performance, so everyone here wanted to separate the backup data from the other production data.

Now I have another scenario that needs your advice:

On the DR site, we have two heads; each head has 2 shelves with 24 drives each. I am thinking of creating one aggregate on each filer, with one shelf per RAID-DP raid group of 24 disks. Would that be against best practice? I heard the best raid group size would be 12-20 disks.

What would you say?

resqme914
6,331 Views

My recommendation is to have a raidgroup size of 23.  That basically translates to using one shelf per raidgroup (with a spare disk per shelf).  The reason we do this is that, over time, you tend to add more disks to your filers, and you tend to add them by the shelf.  It makes it easier to add your new disks into existing aggregates.

Hope that helps.
