I'm looking at adding an AFF8080 with 3 x 24 x 1.6TB SSD shelves. What would be the best aggregate and spare configuration for maximum capacity to start with, while still being able to add a fourth shelf later?
We were thinking of assigning all odd-numbered disks to node 1 and all even-numbered disks to node 2, which would be 36 disks on each node. Later we would probably add another 24 disks to that HA pair, giving another 12 disks per node (total = 48).
How would you configure the aggregates? How many spares would you use, and what size RAID groups?
mainly because I wouldn't want all my eggs (drives) in one basket. Generally, the larger the aggregate, the longer it will take to rebuild a failed drive. It is perfectly feasible to create one large aggregate on each node too. SSDs tend not to fail as often as HDDs, so either option is fine.
Aside from that, if you use NetApp Synergy (which I did with yours) to build out your cluster, you can tell it to recommend NetApp best practice for the aggregate configuration.
From my knowledge, RAID rebuild times are based on the size and speed of the disk, not the size of the aggregate. But I agree, split your aggregates across the two nodes, especially if you are using it for SAN.
I'm planning on using it mostly for Fibre Channel LUNs, but also for some CIFS shares with SMB3 and some NFS. Older best practices said to split the LUNs and CIFS/NFS into separate aggregates. Is this still true with AFF SSD aggregates?
As all your aggregates are all-flash, it won't make any difference where you place your NAS or SAN data; it will all benefit from the performance of the drives. You could have CIFS volumes shared out to CIFS clients, and FC LUNs residing in volumes on the same aggregate as your CIFS volumes, with no performance impact.
A number of factors affect RAID rebuild times. The size and speed of the drives matter, but so do the number of drives in the RAID group and the raid.reconstruct.perf_impact option (setting this option to high reduces the time it takes to complete a RAID reconstruction, at the cost of reduced foreground I/O performance; see TR3838). RAID rebuild times are also highly dependent on the workload profile on the system.
The conventional rule that smaller RAID groups reconstruct "no slower" (and likely faster) than larger RAID groups applies here. This aligns with NetApp's recommendation of staying close to the default value provided by ONTAP.
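To make the trade-off between RAID group size, spares and capacity concrete, here is a rough Python sketch of the arithmetic for the 36 and 48 disks per node mentioned in the question. It assumes RAID-DP (two parity disks per RAID group) and spreads disks evenly across the groups. The spare count and maximum RAID group sizes passed in are purely illustrative inputs, not NetApp recommendations, and `plan_node` is a hypothetical helper, not an ONTAP tool. It also ignores root aggregates and any disk partitioning, so treat the output as a starting point for a conversation with your SE.

```python
from dataclasses import dataclass
from typing import List

PARITY_PER_GROUP = 2  # assumption: RAID-DP, two parity disks per RAID group


@dataclass
class Layout:
    raid_groups: List[int]  # number of disks in each RAID group
    spares: int
    parity_disks: int
    data_disks: int


def plan_node(disks: int, spares: int, max_rg_size: int) -> Layout:
    """Split one node's disks into evenly sized RAID groups (hypothetical helper)."""
    usable = disks - spares
    if usable <= PARITY_PER_GROUP:
        raise ValueError("not enough disks left after spares for a RAID group")
    # Fewest groups that keep every group at or below max_rg_size.
    n_groups = -(-usable // max_rg_size)  # ceiling division
    base, extra = divmod(usable, n_groups)
    groups = [base + 1] * extra + [base] * (n_groups - extra)
    parity = n_groups * PARITY_PER_GROUP
    return Layout(groups, spares, parity, usable - parity)


if __name__ == "__main__":
    # Illustrative inputs only: 1 spare per node, two candidate group sizes.
    for disks in (36, 48):
        for max_rg in (24, 28):
            print(disks, max_rg, plan_node(disks, spares=1, max_rg_size=max_rg))
```

Each extra RAID group costs two disks of parity, which is why fewer, larger groups give more capacity while smaller groups may rebuild faster. Running the sketch with the default RAID group size for your ONTAP version shows what staying close to the default costs in data disks.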
I think it would be beneficial to work with your NetApp or partner SE to validate this solution. It is an uncommon use case, you need a Flash Cache card per controller, and it's a financial outlay, so we'd want you to get as much input as possible before making the decision to purchase.
We have a process called "Feature Product Variance Request" (FPVR), which can be used to add/change functionality of products - if this was a suitable and appropriate solution for you, I believe it would only be available through that process, which includes engineering validation and support.