Are you putting each stack in a separate rack? If not, you could face downtime later to move shelves when you want to add more disks. I like the idea of 4 stacks, but if 2 stacks saves you from having to move shelves later, then it is the better option, since you can add stacks 3 and 4 later as new stacks. If you won't exceed your current rack space, though, then 4 does make sense. Do a cable audit, since you might need additional cables of different lengths than the ones that shipped if the shipment assumed 2 stacks. With MPHA, you get 24Gb/sec to the back of each loop with 2 paths of 4x3Gb lanes, so bandwidth to the shelves shouldn't be a bottleneck.
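For anyone who wants to check the bandwidth arithmetic, here is a rough sketch. It assumes the "3" is gigabits per second per SAS lane, 4 lanes per path and 2 paths with MPHA; the function and constant names are made up for illustration, and the result is raw line rate, not usable throughput.

```python
# Assumptions (not measured): 3 Gb/s per SAS lane, 4 lanes per path,
# 2 paths to each stack when cabled for MPHA.
LANE_GBPS = 3
LANES_PER_PATH = 4
PATHS_PER_STACK = 2


def stack_bandwidth_gbps(paths=PATHS_PER_STACK, lanes=LANES_PER_PATH, lane_gbps=LANE_GBPS):
    """Raw aggregate line rate to one stack, in gigabits per second."""
    return paths * lanes * lane_gbps


if __name__ == "__main__":
    print(f"Both paths up: {stack_bandwidth_gbps()} Gb/s per stack")
    print(f"One path lost: {stack_bandwidth_gbps(paths=1)} Gb/s per stack")
```

Dividing by 8 gives an upper bound in bytes per second; protocol and encoding overhead will bring the real number down.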
I'd recommend numbering each stack with room to grow. So the first stack has disk shelves 11,12,13,14, the second stack 21,22,23,24, the third stack 3x, and the fourth 4x. This makes shelves really easy to identify and also really easy to expand without too much forward planning.
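If it helps to see the numbering scheme written down, here is a minimal sketch: the tens digit is the stack and the ones digit is the shelf's position in that stack. The helper names are hypothetical, and the limit of 9 stacks and 9 positions is just what a two-digit ID allows under this scheme, not a statement about what the hardware supports.

```python
def shelf_id(stack, position):
    """Two-digit shelf ID: tens digit = stack number, ones digit = position in the stack."""
    if not 1 <= stack <= 9:
        raise ValueError("stack must be 1-9 to fit in the tens digit")
    if not 1 <= position <= 9:
        raise ValueError("position must be 1-9 to fit in the ones digit")
    return stack * 10 + position


def stack_of(shelf):
    """Recover the stack number from a shelf ID built by shelf_id()."""
    return shelf // 10


# 4 stacks of 4 shelves each, as discussed in this thread.
layout = {s: [shelf_id(s, p) for p in range(1, 5)] for s in range(1, 5)}

if __name__ == "__main__":
    for stack, shelves in layout.items():
        print(f"stack {stack}: shelves {shelves}")
```

Adding a fifth shelf to stack 2 later just becomes ID 25, with no renumbering of anything else.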
Stretching a stack into a different rack isn't too much of an issue. I believe you can get 5m SAS cables, which is more than enough to reach the next rack if needed.
I second the 4x4 approach. As you say, it gives the best back-end performance. It's also the highest level of resiliency, since you have to lose more loops before something catastrophic happens! Just make sure you split all the SAS connectivity across multiple IO boards (PCIe or onboard).