Re: Role swap to DR

chriskranz · ‎2009-06-05

I'm proposing a lot of solutions now with NetApp and VMware, and most of them include replication based on SnapMirror to a DR location. What I'm trying to get my head around now is how a customer would fail over to DR. I understand how easy and simple Site Recovery Manager and SnapMirror can make the failover process of the actual Virtual Infrastructure, but then how do users get access to it? How do you go about seamlessly redirecting users to a totally different site?

I guess this has several scenarios...

The datacentre is down, but the users are still in the primary office.

The primary site is down and applications are only serving to remote users.

The IT staff need to test the DR situation (probably the easiest).

For this I'm ignoring CIFS and focusing primarily on bringing up DR Virtual Machines in the state they were in at the Primary location.

Does anyone have any real-world experience, or even some whiteboard ideas on how this is achieved? Of course I'm happy to look at complementary solutions (possibly F5's BIG-IP or Virtual DNS, or some fancy Cisco routing protocols), but I don't have much in-depth visibility of how these work.

Cheers!

alapati · ‎2009-06-05

Hi,

While SnapMirror will only fail over the data at the storage array level, you will need other infrastructure to redirect/fail over users to the DR site. In my previous life at NetApp IT, we implemented Symantec/Veritas Cluster Server (VCS) with the Global Cluster Option (GCO), which monitors the DR site. VCS initiates the site failover. Since VCS has hooks for various built-in and custom agents, failover of applications, databases, DNS changes and storage can all be automated with the click of a button. Hope this helps.

Srinath

mwalters · ‎2009-06-07

Hi Chris,

I'm a little confused 'cos you say you are ignoring CIFS but your scenarios refer to user access: "How do you go about seamlessly re-directing users to a totally different site?".

If you are talking CIFS, then clearly your failover would need to be either to a same-subnet network or using some DFS-style referral to enable users to access the data with minimal intervention.

If you mean for application servers, then again I would think it boils down to whether there is the same networking (in which case users would access those servers relatively seamlessly, regardless of their location!), or if not, then users need some way to reach the new server locations.

I'm not sure this is any different for VMware than non-VMware: it ought to be easier, since your app servers ought to maintain a lot more of their integrity than if it were some manual app server failover. As Srinath suggests, VCS is one option to use in this environment to fail over.

cheers

Mike

chriskranz · ‎2009-06-07

Hi Mike, cheers for the reply,

By user access, I purely mean access to the applications. The network piece is the bit I am unsure of. The servers (as you say, whether physical or virtual) would come up on the same network as before, but I have to get my users to see this network. If my users are accessing this externally from home, then how do I redirect their external access? If they are accessing from a workplace, then this would be fairly straightforward with some routing changes, I would imagine. How would I plan for redirecting incoming email (multiple MX records should be simple enough), and other external-facing applications?

I know this is a bit all-encompassing. I'm not necessarily looking for a definitive answer to cover all bases, just ideas on how you would go about this and how you would tackle different problems.
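
To make the MX idea concrete, here is a rough Python sketch of how a sending mail server picks among multiple MX records: it tries the lowest preference value first and falls back to the next one if that host cannot be reached. The hostnames and preference values are made up for illustration, and reachability is simulated with a plain set rather than real network checks.

```python
def delivery_order(mx_records, reachable_hosts):
    """Return the MX hosts a sender would try, lowest preference first.

    mx_records: list of (preference, hostname) tuples, as published in DNS.
    reachable_hosts: set of hostnames currently accepting mail
                     (a stand-in for real connection attempts).
    """
    ordered = sorted(mx_records, key=lambda record: record[0])
    return [host for _pref, host in ordered if host in reachable_hosts]


# Hypothetical records: primary site preferred, DR site as the fallback.
MX_RECORDS = [
    (20, "mail.dr.example.com"),
    (10, "mail.primary.example.com"),
]

# Normal operation: both sites up, primary is tried first.
normal = delivery_order(
    MX_RECORDS, {"mail.primary.example.com", "mail.dr.example.com"}
)

# Primary site down: mail falls through to the DR host with no DNS change.
failed_over = delivery_order(MX_RECORDS, {"mail.dr.example.com"})
```

This only covers inbound SMTP, where the sender does the retrying. Other external-facing applications have no equivalent built-in fallback, which is where DNS changes or a global load balancer would come in.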

radek_kubka · ‎2009-06-08

Chris,

I'd say it is mostly about networking, so out of scope from a storage point of view.

There are two main scenarios:

1) VLANs span across two locations, hence when you bring your VMs up at DR location they carry the same IP addresses

2) Different subnets at Primary & DR - VMs have to change their IP addresses to reflect DR subnet range.

First one is easy - as long as users have access over the network to the DR site, no further changes are required.

Second one requires DNS updates to point end clients to, e.g., the new IP address of the Exchange server.

In essence, when VMware SRM is in use, no other form of remote clustering is required, as they both have the same goal: shut down (if needed) the primary instance and bring the secondary one online, possibly with a different IP address.

The beauty of SRM is that only a single set of VMs has to be maintained, whilst 'classic' clustering requires passive & active instances.

Regards,
Radek
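
The two scenarios above can be sketched in Python as a small planning helper: given the primary and DR subnets, it works out which VMs keep their address (stretched VLAN) and which need a new IP and a matching DNS update. This is only an illustration under assumptions, not how SRM itself does IP customisation: the function name, the VM names, the subnets, and the rule of keeping the same host offset inside the DR subnet are all hypothetical.

```python
import ipaddress


def plan_dns_changes(vms, primary_subnet, dr_subnet):
    """Return {hostname: new_ip} for VMs that must be re-addressed at DR.

    vms: dict of hostname -> current IP address (as strings).
    An empty result means no DNS updates are needed (scenario 1).
    """
    primary = ipaddress.ip_network(primary_subnet)
    dr = ipaddress.ip_network(dr_subnet)

    # Scenario 1: the VLAN spans both sites, so VMs keep their addresses.
    if primary == dr:
        return {}

    # Scenario 2: different subnets, so each VM gets a new address and
    # DNS has to be updated to point clients at it.
    changes = {}
    for hostname, ip in vms.items():
        addr = ipaddress.ip_address(ip)
        if addr not in primary:
            continue  # not on the replicated subnet, leave it alone
        # Assumption: keep the same host offset within the DR subnet.
        offset = int(addr) - int(primary.network_address)
        if offset >= dr.num_addresses:
            raise ValueError(f"{hostname} does not fit in {dr_subnet}")
        changes[hostname] = str(dr.network_address + offset)
    return changes


# Hypothetical inventory and subnets.
VMS = {"exchange01": "10.1.0.25", "sql01": "10.1.0.40"}

stretched = plan_dns_changes(VMS, "10.1.0.0/24", "10.1.0.0/24")
readdressed = plan_dns_changes(VMS, "10.1.0.0/24", "10.2.0.0/24")
```

With a stretched VLAN the plan is empty. With different subnets the result is the list of records that would need updating, and the TTL on those records then decides how quickly clients notice the change.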