A user asked this question in response to Glenn's awesome introduction video, and I thought it was interesting enough for its own conversation. Well, that and you can't format comments in response to a video!
So how does OCSR prevent split-brain scenarios?
The short answer is that OCSR performs a failover only when two conditions hold: it detects a disaster, and the node it is running on is part of the Windows cluster quorum.
For a more detailed answer, let's take a look at a couple of scenarios in the context of a sample configuration that is pretty typical of how we anticipate OCSR being deployed:
- 2 Sites
- 2 Windows Nodes per Site
- 1 File Share Witness (FSW) at a third site
- 1 MetroCluster controller at each site
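Cluster quorum is achieved through votes. Each node gets one vote and the witness gets one vote, so this example has 5 votes in total. If the nodes are separated into groups, a group must hold a majority of the votes, floor(N/2)+1 where N is the total number of votes (3 of 5 in this example), to meet quorum; otherwise cluster services are stopped on the nodes in that group.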
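To make the vote arithmetic concrete, here is a minimal Python sketch of the majority rule. The function names `votes_needed` and `has_quorum` are made up for illustration; they are not part of OCSR or WSFC.

```python
def votes_needed(total_votes: int) -> int:
    """Smallest strict majority of the total vote count: floor(N/2) + 1."""
    return total_votes // 2 + 1


def has_quorum(votes_held: int, total_votes: int) -> bool:
    """True if a group holding `votes_held` votes meets quorum."""
    return votes_held >= votes_needed(total_votes)


# Sample configuration: 4 Windows nodes plus 1 file share witness.
TOTAL_VOTES = 4 + 1

print(votes_needed(TOTAL_VOTES))   # 3
print(has_quorum(2, TOTAL_VOTES))  # False: one site on its own
print(has_quorum(3, TOTAL_VOTES))  # True: one site plus the witness
```

Because the total is odd, two separated groups can never both hold a majority, which is why the witness vote matters in a two-site layout.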
So if site A loses connectivity, it will hold only 2 votes, and the cluster applications running there will be shut down. Site B will still have access to the FSW and hold 3 votes, so it stays up. In that case OCSR can safely fail over storage to site B, allowing WSFC to bring the failed applications online.
If site A and site B lose connectivity to each other and to the FSW, both sites will lose quorum and the cluster will shut down completely. In that case an administrator can manually "force quorum" at site A to bring the clustered services up, allowing OCSR to fail over storage to site A.
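The two scenarios can be walked through with a small Python model of the sample configuration. This is only a sketch of the reasoning, not how OCSR or WSFC is implemented: the voter names (`A1`, `B1`, `FSW`), the `failover_target` function, and the `forced_site` parameter standing in for an administrator forcing quorum are all assumptions made for illustration.

```python
from typing import Iterable, Optional, Set

# Hypothetical voter names mapped to their site; the witness belongs to neither.
VOTERS = {"A1": "A", "A2": "A", "B1": "B", "B2": "B", "FSW": None}
VOTES_NEEDED = len(VOTERS) // 2 + 1  # 3 of 5


def failover_target(partitions: Iterable[Set[str]],
                    forced_site: Optional[str] = None) -> Optional[str]:
    """Return the site storage could fail over to, or None if there is none.

    `partitions` lists the groups of voters that can still reach each other.
    `forced_site` models an administrator manually forcing quorum at a site.
    """
    if forced_site is not None:
        return forced_site
    for group in partitions:
        if len(group) >= VOTES_NEEDED:
            sites = {VOTERS[v] for v in group if VOTERS[v] is not None}
            # Only a group whose nodes all sit at one site is a failover target;
            # a majority group spanning both sites is treated as no failover.
            return sites.pop() if len(sites) == 1 else None
    return None  # no group holds a majority, so the cluster is down


# Scenario 1: site A is cut off; site B still reaches the witness.
print(failover_target([{"A1", "A2"}, {"B1", "B2", "FSW"}]))  # B

# Scenario 2: both sites lose each other and the witness.
isolated = [{"A1", "A2"}, {"B1", "B2"}, {"FSW"}]
print(failover_target(isolated))                   # None
print(failover_target(isolated, forced_site="A"))  # A
```

The point of the model is that at most one group can ever hold 3 of the 5 votes, so without manual intervention at most one site is ever eligible to receive the storage failover.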
This blog post has some good details on quorum models. Keep in mind that for multi-site clusters only "Node Majority" and "Node and File Share Majority" are supported by Microsoft (and thus OCSR).