When an aggregate is substantially in use, adding disks means most of the blocks on the free list come from the new disks rather than the existing ones. Item #2 in your post is correct. Because free blocks are concentrated on the new disks, those disks tend to be selected for new data more often than the existing disks, and thus tend to run busier. With spinning disks this can be an issue, since each disk can only provide around 150 IOPS (or so, for 10K RPM disks) toward the workload.
One mitigating factor is the rate of growth. If your data set is primarily growth (new data), the effect of adding disks is magnified. If disk use is mostly change (overwrites) with only a little growth, balance may be achieved fairly quickly as blocks on the older disks free up, depending on the rate of change. We use volume reallocation to speed up the balancing that natural data turnover provides, because a hot spinning disk is a significant choke point.
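To make the free-list effect concrete, here's a toy Python sketch. This is not ONTAP's actual write allocator (which is more sophisticated); it's just a weighted-random model with made-up disk counts and fill levels, where each new block is written to a disk with probability proportional to that disk's free space:

```python
import random

DISK_BLOCKS = 10_000
# Hypothetical aggregate: 12 existing disks at 85% full, 12 freshly added disks.
free = [int(DISK_BLOCKS * 0.15)] * 12 + [DISK_BLOCKS] * 12
writes_per_disk = [0] * len(free)

random.seed(1)
for _ in range(50_000):  # 50k new-data block writes
    # pick a disk with probability proportional to its free blocks
    disk = random.choices(range(len(free)), weights=free, k=1)[0]
    free[disk] -= 1
    writes_per_disk[disk] += 1

old_writes = sum(writes_per_disk[:12])
new_writes = sum(writes_per_disk[12:])
print(f"old disks took {old_writes} writes, new disks took {new_writes}")
# The new disks absorb the large majority of new writes, so they run hotter.
```

Run it with different fill levels: the closer the old disks are to full, the more lopsided the write distribution, which is why expanding a nearly full spinning-disk aggregate creates hot new disks.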
Of course, those are the general rules for spinning disk. SSD changes the calculus because of the order-of-magnitude increase in IOPS capability. The same reasoning about blocks on the free list still applies. However, even half a shelf of SSDs can potentially outpace the controller's ability to move data over the SAS links (depending on the SSD type and SAS link speed). While you would still see the new disks running hotter than the original ones if you tracked individual disk utilization hour by hour, the rate at which balance is achieved naturally is also much higher. Because each SSD can still vastly outperform a spinning disk, the effect of a short-term hot SSD is negligible.
The latency and balance effects of spinning disks are why the general practice is to add a fresh aggregate rather than expand an existing one. Just as SSD effectively removes disk-based latency from normal operations, it similarly upends the general rule on aggregate expansion, in my opinion.
The only recommendation I'd make is to expand your aggregates using full RAID groups. There is some controller overhead in managing RAID groups of unequal size (number of disks per RAID group) within an aggregate. At this scale it's likely not enough to matter: if the controller isn't heavily loaded or driving a highly latency-sensitive application, you probably won't notice a difference. But for a highly sensitive application, a latency change from, say, 1.1 ms to 1.2 ms is a 9% change in average performance and could impact that type of application.
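For reference, that percentage is just the relative latency delta; a quick sanity check (the 1.1 ms and 1.2 ms figures are illustrative, not a measured result):

```python
base_ms, new_ms = 1.1, 1.2  # illustrative latencies from the example above
pct = (new_ms - base_ms) / base_ms * 100
print(f"{pct:.1f}% increase in average latency")  # prints "9.1% increase in average latency"
```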
JohnChen, I recently added SSDs to some AFF8080EXs as well, with one SSD aggregate on each node like you, and I was seeing that my first RAID group was more heavily utilized than the new one. Run statit -b and then statit -e on that node to compare the activity and performance of the aggregate's RAID groups. The recommendation from support was to run reallocate start -A aggr_SSD (or whatever your aggregate is named) from the node shell. That physically reallocates all the blocks in the aggregate without doing each volume individually: it reallocates the blocks, redirects the volumes, then finishes. I used ::> node run -node clus1-01 reallocate start -A -o aggr_SSD. The -A flag specifies an aggregate and -o specifies run once; otherwise it defaults to a daily interval. Use the same command with reallocate status to see the progress. Be cautious if your aggregates and node are overtaxed. I'm running 8.3.2 but think the command is the same in 9. Call support and open a quick case and they will verify.
Also check out NABox with Grafana; there's a large thread in the communities about it. Once it's running and collecting, there is a canned disk performance page where you can see cluster, aggregate, and RAID group utilization in a line graph over time and compare all of them. https://nabox.tynsoe.org/