Network and Storage Protocols

Solution for a real-time distributed file system across data centers?

francesco_rizzo
11,956 Views

I am in need of a NetApp-based solution design that will provide a CIFS-based file share for a single volume to Windows-based application servers located in two data centers on opposite ends of the United States. The application run-time requirements mandate that both sets of application servers use a single shared set of files with as short a replication lag as possible. Replication will need to be bi-directional and synchronous.

The disk volume should reside in both data centers to satisfy high-availability and access latency requirements.

An initial recommendation of MS File Servers utilizing DFS was made and rejected as NetApp Filers are specified for use in the data centers.

What would the appropriate NetApp solution / architecture be?   Initial investigation into SnapMirror capabilities indicates it is a non-real-time DR solution.

Obviously, network capacity and latency concerns will be factored in when provisioning the long-haul WAN. Cisco WAAS appliances may be considered.

Any thoughts on this would be appreciated as well.

Thank you,

-Frank

9 REPLIES

radek_kubka

Hi,

I see two separate issues in here.

Firstly, are you sure the application in question can operate on two copies of data in two locations? For the vast majority of existing apps that is not the case & they have one active data plex plus one or more passive plexes (these in some cases can be used for reads only).

Secondly, whether a certain form of replication can be real-time (or near-real-time) depends on the link characteristics - its bandwidth & latency. In fact NetApp SnapMirror can operate in either synchronous or asynchronous mode (not sure whether semi-sync has already been dropped or not). But even if you use async mode, in theory you can replicate every minute if a) you have a relatively small data change rate and/or b) your pipe has massive bandwidth.
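To put rough numbers on the async case, here is a minimal sketch of the check described above: can one interval's worth of changed data be shipped before the next interval starts? The function name, the 70% link efficiency and the example figures are assumptions for illustration only, not anything SnapMirror reports or guarantees.

```python
def replication_fits(changed_mb_per_interval: float,
                     link_mbit_per_s: float,
                     interval_s: float = 60.0,
                     efficiency: float = 0.7) -> bool:
    """Return True if the changed data can be transferred within one interval.

    efficiency is an assumed fraction of the raw link rate that is usable
    after protocol overhead and competing traffic.
    """
    usable_mbit_per_s = link_mbit_per_s * efficiency
    # Convert megabytes to megabits, then divide by the usable rate.
    transfer_s = (changed_mb_per_interval * 8) / usable_mbit_per_s
    return transfer_s <= interval_s


# Example: 500 MB of changes per minute over a 100 Mbit/s link
print(replication_fits(500, 100))
# Example: 1000 MB of changes per minute over the same link
print(replication_fits(1000, 100))
```

If the check fails, the mirror falls further behind on every cycle, so either the interval has to grow or the pipe has to.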

Bear in mind though that SnapMirror target volumes are read-only, so changing anything within them (without breaking the mirror) is out of the question.

Regards,
Radek

francesco_rizzo

The application cannot operate on two separate copies of the file-based data. It must have one copy. Data will be accessed via a CIFS share, but storage must be actively distributed across multiple filers in separate data centers in real time. Application servers in either data center should be able to use the CIFS share via a NetApp appliance that is co-located in the same data center. Mirroring for DR purposes does not provide the required functionality.

We have done this using DFS to provide a single CIFS share path to a distributed / clustered application while the data behind the share was available on disks in multiple data centers. This project mandates use of NetApp equipment.

What would the infrastructure architecture look like for the NetApp equipment to provide this functionality?

Assume for the purposes of this discussion that we have WAN provisioned to provide sufficient bandwidth and low-latency connectivity.

Thanks.

david_wallis

I think the problem is whoever wrote the project brief was trying to spec the solution.

In my mind you simply cannot do this with NetApp alone - if the project brief says you must use NetApp Filers then you are going to have to do two things:

A) Use NetApp Filers

B) Put something in front of them to provide the functionality that is required

Optional: C) Sack the solutions architect.

If you can't add additional servers then you are going to have to publish a DFS Namespace either as a standalone root on the application servers or publish it into AD if you have this available - or use other software to accomplish this.

Either way, it doesn't matter what is written in your project brief: as far as I am aware there is simply no way you can have replication occurring between two filers running ONTAP whilst they are both R/W - not unless you want to get involved with developing drivers / software.

David

francesco_rizzo

Thanks David,

Option C has crossed our minds. 🙂

For options A and B as you listed them, forgetting any specific solution "requirements", how would we accomplish the closest realistic architecture?

"single" volume accessed:

via CIFS

using a single UNC path (DFS?)

from MS Windows 2003 servers in two data centers on opposite ends of North America (~2500 miles)

with authentication/authorization via Active Directory

where application servers from both data centers will be performing both read and write operations

and long-haul WAN uses OC-48 2.5G SONET with ~60ms latency coast-to-coast (max payload: ~2400 Mbit/s)

Keeping in mind requirements for full HA capabilities and DR considerations, preferably without outage during DR fail-over.
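For a sense of what synchronous replication over this link implies, here is a back-of-the-envelope sketch. It assumes the ~60 ms figure is round-trip time (if it is one-way, double it) and that each synchronous write must wait for one acknowledgement from the remote site before completing. The function names are illustrative only.

```python
def max_serial_sync_writes_per_s(rtt_ms: float) -> float:
    """Upper bound on back-to-back synchronous writes in a single stream.

    Each write is assumed to wait one full round trip for the remote
    acknowledgement, so latency alone caps the rate.
    """
    return 1000.0 / rtt_ms


def bandwidth_delay_product_mb(link_mbit_per_s: float, rtt_ms: float) -> float:
    """Megabytes that must be in flight to keep the link full."""
    return link_mbit_per_s * (rtt_ms / 1000.0) / 8.0


rtt_ms = 60.0          # assumed round-trip time from the post
payload_mbit = 2400.0  # usable OC-48 payload from the post

print(max_serial_sync_writes_per_s(rtt_ms))
print(bandwidth_delay_product_mb(payload_mbit, rtt_ms))
```

Under these assumptions a single serialized writer tops out at roughly 16 to 17 synchronous writes per second no matter how fat the pipe is, and about 18 MB has to be in flight at once to fill the link. In other words, latency rather than bandwidth is the limiting factor for synchronous writes.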

david_wallis

DFS will potentially do what you want. I can't say I've tried DFS Replication, but you should be able to give it a try using either a couple of simulators or your real filers.

Some info here: http://aserverblog.blogspot.com/2009/08/windows-server-2008-dfs-share.html

but there's plenty of info on TechNet.


David

radek_kubka

"DFS will potentially do what you want"

Well, not really - be wary of this:

http://communities.netapp.com/message/6089#6089

"The CIFS Protocol in NetApp Systems supports DFS, but only as a leaf object and not participating in any means in replicating data or DFS content"

Regards,

Radek

JeremySmith

David's option B could specifically be Windows servers utilizing DFS but using NetApp filers for block storage (FC or iSCSI). We have a very large domain-based DFS structure that provides a single namespace for company-wide team shares, but only one target for each leaf is R/W. I am very skeptical you can use DFS to provide R/W for the same data at both locations without running into performance and/or save conflict issues. DFS does provide the distributed write capability, but in my experience it is best used for data that is mostly read and rarely written.

Without knowing the application and the reason for the geographic split it's tough to get more specific about other possible solutions.

janvanderpluijm

To my knowledge DFS-R only replicates closed files, not every write, and also has no bidirectional replication conflict resolution. That will probably cause integrity problems.

The direction of your solution has to be found at a higher level in the stack than storage or file system.

When using clustering and/or replication at the DBMS level there are multiple solutions.

jimmykmtam

Francesco,

Regarding your project requirements as mentioned above, Peer Software offers a solution for Business File Sharing and Collaboration on NetApp and Windows systems that fits your needs. The company's enterprise file collaboration product is named PeerLink, which includes DFSR+ technology that allows for the following key capabilities:

  • Real-time bi-directional synchronization
  • Block-level synchronization (i.e. only the changed portions of a file are replicated to save on WAN bandwidth)
  • Multi-threaded synchronization (i.e. parallel processing of multiple file events for faster updating)
  • Version conflict prevention through distributed file locking (i.e. when a file is opened on one site for WRITE access, PeerLink locks down the synchronized copy for READ ONLY access on the other sites)
  • Supports NetApp Data ONTAP v7.3.5 (and higher) as well as Microsoft Windows Server 2000, 2003, 2003 R2, 2008, and 2008 R2
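The locking behaviour described in the list above can be sketched generically as follows. This is a toy, single-process illustration of "one writer site per file, everyone else read-only" and is not PeerLink's actual implementation. The class and method names are hypothetical, and a real deployment would also have to coordinate the lock table itself across the WAN.

```python
class SiteLockTable:
    """Toy lock table: at most one site holds write access to a path."""

    def __init__(self):
        self._writer = {}  # path -> name of the site holding the write lock

    def open_for_write(self, path: str, site: str) -> bool:
        """Grant the write lock if the path is free or already held by site."""
        holder = self._writer.get(path)
        if holder is None or holder == site:
            self._writer[path] = site
            return True
        return False

    def access_mode(self, path: str, site: str) -> str:
        """Report what a site may do with its local copy of the path."""
        holder = self._writer.get(path)
        if holder is not None and holder != site:
            return "read-only"
        return "read-write"

    def close(self, path: str, site: str) -> None:
        """Release the write lock, but only if this site holds it."""
        if self._writer.get(path) == site:
            del self._writer[path]


locks = SiteLockTable()
locks.open_for_write("share/report.doc", "east")
print(locks.access_mode("share/report.doc", "west"))
```

Note that every lock request from the far site still pays the WAN round trip, so this trades write conflicts for open latency.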

A product data sheet can be found here: http://www.peersoftware.com/products/resources/datasheets/peercollaborationenterprise.pdf

A customer success story can be found here: http://www.peersoftware.com/customer/case_study/kiekert.aspx

I hope this information helps. Please let me know if you have any questions.

Jimmy Tam

VP Sales & Marketing

Peer Software Inc.

A NetApp Advantage Alliance Partner
