Kind of new to all of this, but I am looking for general guidelines and any gotchas that may come with initial sizing of aggregates and raid groups.
I realize every situation is different, and that *true* best practice is to do an in-depth analysis of your applications, define the necessary IO required, and build your SAN out that way. All well and good. That said -- is it typically better to have really large aggregates, or smaller ones of around 28 disks? One of our partners says that 28-disk aggregates are best practice, while other SEs say it is best to have as large an aggregate as possible.
Say we are getting 3 new disk shelves for an active/active 3040 cluster. Would it be best to use all 3 shelves in 1 aggregate? Or have 2 aggregates, one spanning 2 shelves and one spanning 1 shelf?
What is best practice for raid group sizes?
Obviously we would be looking for the best way to carve this up with regard to performance, space-efficiency, fault-tolerance etc.
Any thoughts or real-world configs would be much appreciated.
It really does depend on the workload, but splitting the disks evenly is probably the best way to start out if you don't know the load you will be putting on the system. It can make expansion a little confusing to visualize, though.
Remember that as the systems are active/active, you can't have one shared aggregate. For all intents and purposes, they are 2 independent storage arrays acting together. Each head must have its own aggregates, so regardless, the disks will be split across the 2 heads. Software ownership makes this easy (so you can have one disk shelf shared between both heads).
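If you go the software-ownership route, assigning disks to a head looks roughly like this (Data ONTAP 7-mode syntax; the disk names and the "filer2" hostname below are made up for illustration):

```
# List disks that are not yet owned by either head
disk show -n

# Assign specific unowned disks to the controller you're logged in to
disk assign 0a.16 0a.17 0a.18

# Assign a disk to the partner head by owner name
disk assign 0b.23 -o filer2
```

Once assigned, each head can only build aggregates out of the disks it owns.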
For simplicity of expansion, and if you are only going for 3 disk shelves, I'd be tempted to put 2 shelves on one head and 1 on the other. Then you can lay the disks out physically in a more logical way, which makes future expansion a lot simpler and assigning disks less of a headache. But this comes down to preference, and whether you want an easy life administering the systems or even loading across the storage.
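On the raid group question: you choose the raid group size when you create (or grow) the aggregate, and ONTAP carves the disks into groups of that size. A rough 7-mode sketch, with illustrative names and disk counts (tune -r to your shelf layout):

```
# Create a 32-disk RAID-DP aggregate with 16-disk raid groups
# (each group = 14 data + 2 parity)
aggr create aggr1 -t raid_dp -r 16 32

# Later, grow the same aggregate by adding more disks
aggr add aggr1 16
```

Keeping the raid group size aligned with how you'll add disks later avoids ending up with lopsided, partially filled raid groups.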
Also, with active/active systems, make sure to spread the load across both heads. So for example, DB1 on filer1 and logs1 on filer2, then DB2 on filer2 and logs2 on filer1. Try to balance the databases so that both systems are providing the same throughput and getting the same load put on them.
We have a 3070 cluster and found that for SQL Server it was best to use software ownership and split the 3 shelves between the two filers. Read I/O was about the same, but write performance was double because we had the NVRAM cache of both filers in play.
If you are still in dev, try it out for yourself: run Microsoft's SQLIO tool to generate your load and see what works best for you.
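For example, a pair of SQLIO runs like the following gives a quick read-vs-write comparison (these are standard SQLIO flags, but the test file path, duration, and 8 KB block size are just placeholders -- match them to your actual database I/O pattern):

```
rem 8 KB random writes, 8 outstanding I/Os, 2 threads, 120 seconds, latency stats
sqlio -kW -frandom -b8 -o8 -t2 -s120 -LS E:\testfile.dat

rem Same pattern as reads, for comparison
sqlio -kR -frandom -b8 -o8 -t2 -s120 -LS E:\testfile.dat
```

Run the same tests against each candidate aggregate layout and compare IOPS and latency before you commit to one.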