ONTAP Hardware

SAS Disk Ownership Best Practices

RUANLOUWDCX

Hi everyone, we have a customer who at the time decided that the best path for their NetApp storage system would be the following: 2x FAS3240 controllers and 6x DS4243 disk shelves. To complicate things a little, these are all part of one stack in MP-HA. What we did was manually assign disk ownership to the separate controllers (2 aggregates) to spread the workload between the two controllers, as we did not want to rely on auto disk assignment, just in case one controller ended up owning all the disks (I know this will probably never occur).
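For reference, this is roughly how we locked the ownership down (7-Mode syntax from memory, so double-check it against your release; the controller name and disk ID are just placeholders):

    options disk.auto_assign off        - run on both controllers so ownership stays where you put it
    disk show -n                        - list any unowned disks
    disk assign 0a.10.0 -o controller1  - assign a specific disk to a specific controller
    disk show -v                        - verify who owns what afterwards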

So now the customer is questioning our logic and saying that the 50/50 split is not best practice, as it so happens that one controller owns 50% of the disks on 2 of the shelves and the other controller owns the remainder. (The reason for this is that originally they only had 4 shelves; when we added 2 more, the new disks were split 50/50 between the controllers. I realise now that we would probably have been better off assigning one new shelf per controller, but unfortunately that did not happen.) As far as I can see there is no real issue here, besides the slightest performance degradation, if any.

Does anyone know if there is in fact any best practice out there beyond splitting the shelves 50/50 between the controllers? I know that in an ideal world one would just assign full shelves to a controller, but this is not always achievable. On takeover the partner simply takes over the personality of the other controller and data continues to be served as if nothing ever went wrong.
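We can always demonstrate the takeover behaviour to them in a maintenance window as well; from memory on a 7-Mode HA pair it is just the following, but double-check the syntax on your release:

    cf status      - confirm the pair is healthy and takeover is enabled
    cf takeover    - run on the surviving partner, then check that data is still being served
    cf giveback    - hand everything back afterwards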


5 REPLIES

rwelshman

I don't know if it is documented, but it is good practice to assign all the disks on a shelf to a single controller. Of course, the very best practice is for each controller to have its own stack. But we all know how things work in the real world.

What we have done in the past, in scenarios where you start with one or two shelves, is to split the disk assignment between the heads. Once more shelves are added, the new disks can be re-assigned so that each controller owns all the disks in its "own" shelves. You can do this via the disk replace commands (if you have enough spares, and the patience). It is also good practice to spread the disks in the aggregate's raid groups over all the shelves that the controller is using.
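If it helps, the re-shuffle looks roughly like this in 7-Mode (the disk names are invented for the example, and check the syntax on your release):

    disk replace start 0a.20.5 0b.30.7   - copy a data disk sitting in the "wrong" shelf onto a spare in the "right" one
    sysconfig -r                         - watch the copy progress; the old disk becomes a spare once it completes
    disk assign 0a.20.5 -s unowned -f    - release the now-spare disk from this head

and then run disk assign for it on the partner so it ends up owned by the correct controller.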

So for 6 disk shelves, maybe shelves 1, 2 and 3 belong to head 1 and shelves 4, 5 and 6 belong to head 2. Then, in the aggregate for head 1, the disks would be added such that the first disks from shelf 1, shelf 2 and shelf 3 go in first, then the second disks from all three shelves, and so on, so that IO to each raid group is spread across all three shelves.
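In practice that just means listing the disks explicitly on the aggr add rather than letting ONTAP pick for you, something like this (shelf IDs 10, 11 and 12 and the disk names are made up for the example):

    aggr add aggr1 -d 0a.10.1 0a.11.1 0a.12.1 0a.10.2 0a.11.2 0a.12.2
    aggr status -r aggr1   - confirm the raid group ended up spread across the shelves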

RUANLOUWDCX

Thanks Riley, you've confirmed our way of thinking. For some odd reason the client seems to think that if a shelf's disks are split between two controllers, one would lose the whole storage solution if that shelf goes bang. We originally had 4 shelves, but when the 2 new ones were added they were for some reason split in half and assigned to different controllers.

rwelshman

If a shelf fails entirely, you're going to have problems in any scenario. If the disks in that one shelf belong to both filers, then you are going to lose access to part of the data on both at the same time. To reduce the overall risk, when possible, assign the disks in each shelf to only one filer. The shelves are mostly built from redundant parts, so the chance of a complete shelf failure is low, but not impossible.

DAVE_WITHERS (accepted solution)

I can wholeheartedly confirm to your client, having the exact same HA pair with 2 unequal stacks of shelves and disks assigned between both head units without regard to their shelf, that there is absolutely no worry during failover scenarios: all data will be available to the unit that is in takeover. I have failed over 2 HA pairs, 30-plus times, with this non-best-practice setup.

As long as the shelves' MPHA cabling is correct, they are good to go, no matter how you slice and dice the shelves and disks between controllers. Best practices aside.
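For what it's worth, the quick sanity check I run on the MPHA side (7-Mode) is:

    storage show disk -p   - every disk should show both a primary and a secondary path

If anything comes up single-pathed, sort the cabling out first, then stop worrying about the ownership layout.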

RUANLOUWDCX

Thanks for your reply Dave, I think I just wanted someone to confirm what we tried to explain to the customer; they tend to be a little paranoid from time to time. Doesn't help that they bring 3rd-party people in to play devil's advocate every now and again.

Really appreciate it.
