First thoughts about stretch MetroCluster

Hi there,


I have to configure a stretch MetroCluster, but I cannot find much literature about it. I have read a lot about cDOT, but little about MetroCluster.

I am used to configuring 7-Mode 20xx/22xx/25xx systems for NFS with ESX only.


My setup:

 - 2 rooms, each with one FAS8020 (one controller, the second slot is empty; e0e and e0f are the only network interfaces; one add-on card with 4 SAS ports and one add-on card with 2 optical ports for the cluster interconnect) and one DS2246.

 - SAS/optical links are used to connect everything (see the attached file, provided by NetApp)

 - a large optical cable (48 fibers!) with a patch panel to connect all of that directly (<100 m)

 - 2 x HP 2530 switches (GbE) per room for the management and data networks


From what I have read, here is what I understood (but I'm not sure!):

 -1- I should first cable and connect everything

 -2- then use NetApp System Setup to configure each node

 -3- mirror the system and data partitions (I do not know how: the OnCommand System Manager GUI?)

 -4- then use the CLI to activate MetroCluster

 -5- for the system I should not use root partitioning but dedicated disks (because I will have to mirror the system and data partitions)


There are also some remaining questions:

 -1- in System Setup, should I treat this as a "single node" or a "two-node switchless" installation?

 -2- why is there an add-on card with two optical ports? Couldn't I have used e0a and e0b (with SFPs) as usual for the cluster interconnect?

 -3- what should I do about the network? I imagine something like this; please correct me:

        * node 1: LIF1 on e0e to datastore 1, LIF2 on e0f to datastore 2; each LIF fails over to the other port

        * node 2: same thing

      -> but I do not know whether a LIF should fail over only within the same node or also to the other node (there are options to change this behaviour). I guess the answer is "other", because if I lose one room...

      -> also, datastores 1 & 2 are mirrored between the nodes, so how do I handle that from the ESX perspective?

           I guess I should not access the same datastore from different nodes?

           I think something like: to access datastore 1, I use LIF1 on node 1, and for datastore 2, LIF2 on node 2? Then, depending on whether the ESX hosts are in room 1 or 2, they will go across the whole MetroCluster, but this should perform well... Is there a better way?


 -4- can I use only 2 disks for the system?

 -5- can I use RAID-DP with no spare (I read that RAID 4 with a spare is no longer usable) to avoid wasting space?



Please give me your thoughts about this; I hope it will not turn out that I totally misunderstood how a stretch MetroCluster works :)

Also, if you have links to read, I'll gladly take them!


Many thanks in advance !

Re: First thoughts about stretch MetroCluster

A stretch MetroCluster consists of two single-node clusters. The card with SFPs is the FC-VI adapter, which is required for NVRAM mirroring. Other than that, you can by and large consider it very similar to 7-Mode: you have mirrored aggregates that are taken over by the partner. So the network design should be pretty much the same. The main difference is FCP, where multipathing is limited to one node only for obvious reasons.

The standard Data ONTAP documentation offers a fairly detailed step-by-step description of how to cable and set up a stretch MetroCluster. Did you read it? What is missing there?

Re: First thoughts about stretch MetroCluster

Hi aborzenkov, and thanks for this reply,


I read many docs about different subjects, but did not read... the standard Data ONTAP documentation!

I thought it would be too general!


I found really good docs about cabling (and why it is done this way, what a port pair is, ...), but nothing about setup.


I'll go ahead and find and read the standard documentation.


But I'm still confused about networks and mirrored aggregates, which I have never used.


Thanks for that

Re: First thoughts about stretch MetroCluster

The MetroCluster Installation and Configuration Guide contains detailed steps for this, as well as references to other resources. There are also Express Guides and the MetroCluster Management and Disaster Recovery Guide. Again, if something is missing there or unclear, it is better to ask specific questions.


I thought you had experience with 7-Mode MetroCluster; maybe I misunderstood. In a nutshell: you have two single-node clusters that connect to the same set of disk shelves. Disks are mirrored between the two locations at the aggregate level (SyncMirror): each aggregate consists of two plexes, and each plex has disks from one location only. This ensures that even if one location is completely lost, data is still available. SVM configuration is replicated between the two clusters; when failover happens, the SVMs are started on the partner cluster using the same disks from the failed node (maybe only half of them if the location is not available).
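
To make the plex idea concrete, here is a minimal Python sketch of the model described above. It is only an illustration, not ONTAP code: the class names and the names `aggr1`, `plex0`, `plex1`, `room1` and `room2` are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Plex:
    name: str
    site: str  # the location whose shelves hold all disks of this plex


@dataclass(frozen=True)
class MirroredAggregate:
    name: str
    plexes: Tuple[Plex, ...]  # a SyncMirror aggregate has two plexes, one per site

    def surviving_plexes(self, failed_site: Optional[str]) -> List[Plex]:
        # A plex survives if none of its disks are at the failed location.
        return [p for p in self.plexes if p.site != failed_site]

    def is_available(self, failed_site: Optional[str] = None) -> bool:
        # Data stays available as long as at least one plex survives.
        return len(self.surviving_plexes(failed_site)) > 0


# Hypothetical example: one mirrored aggregate spread over two rooms.
aggr = MirroredAggregate(
    "aggr1", (Plex("plex0", "room1"), Plex("plex1", "room2"))
)
# Hypothetical counter-example: an aggregate with a single plex in room1.
unmirrored = MirroredAggregate("aggr2", (Plex("plex0", "room1"),))
```

With this model, `aggr.is_available("room1")` stays true because `plex1` in the other room survives, while the single-plex aggregate is lost together with its room.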


In normal operation each cluster (i.e. a single node in your case) works independently, using its own disks. So you only need to be concerned with standard configuration tasks; MetroCluster is mostly transparent here.


Network configuration should be symmetrical: both nodes must have access to the same network resources (VLANs, subnets, etc.) and ports must be configured accordingly. When SVM configuration is replicated, matching LIFs are created that are (attempted to be) connected to the same network objects.
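
As a rough way to reason about the symmetry requirement, the following Python sketch compares the networks reachable from each node. It does not query ONTAP; the port-to-network maps and the network names (`vlan10-nfs`, `vlan20-mgmt`) are assumptions you would fill in by hand from your own switch and port configuration.

```python
from typing import Dict, List, Set, Tuple


def network_differences(
    node_a: Dict[str, Set[str]], node_b: Dict[str, Set[str]]
) -> Tuple[List[str], List[str]]:
    """Return (networks only on node_a, networks only on node_b).

    Each argument maps a port name to the set of networks (VLANs,
    subnets, ...) reachable through that port.
    """
    nets_a = set().union(*node_a.values())
    nets_b = set().union(*node_b.values())
    return sorted(nets_a - nets_b), sorted(nets_b - nets_a)


def is_symmetrical(node_a: Dict[str, Set[str]], node_b: Dict[str, Set[str]]) -> bool:
    # Symmetrical here means: every network visible on one node is also
    # visible on the other, so a replicated LIF can find a home port.
    only_a, only_b = network_differences(node_a, node_b)
    return not only_a and not only_b


# Hypothetical layout: node 2 is missing the management VLAN on e0f.
node1 = {"e0e": {"vlan10-nfs"}, "e0f": {"vlan20-mgmt"}}
node2 = {"e0e": {"vlan10-nfs"}, "e0f": set()}
```

Here `network_differences(node1, node2)` reports `vlan20-mgmt` as present only on node 1, which is the kind of mismatch that would leave a replicated LIF without a matching port on the partner.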

Re: First thoughts about stretch MetroCluster


In fact, I have only configured 20xx/22xx/25xx systems in 7-Mode, no MetroCluster!


I have just started my new configuration, and got a nice surprise: it came largely preconfigured!


I'll only have to configure aggregates, volumes, exports and the network.


For the time being, one thing worries me: e0e and e0f (my only network interfaces) are used for the intercluster network... Will I be able to use e0e and e0f for the data network as well? Does intercluster traffic use much bandwidth?



Re: First thoughts about stretch MetroCluster

Yes, cluster peering can share ports with data LIFs. It is used to replicate SVM configuration, so I do not expect very high bandwidth demand here after the initial configuration.

Re: First thoughts about stretch MetroCluster

If you check your performance monitoring system during the 1am to 5am timeframe, are you seeing spikes in latency on your new MetroCluster? This would be when the default raid.scrub.schedule should be running (assuming you are running Data ONTAP 8.3).
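
If your monitoring tool can export latency samples, a small Python sketch like the one below could compare the 1am to 5am window against the rest of the day. The sample format (a list of timestamp and latency pairs) and the function names are assumptions for illustration; adapt them to whatever your tool actually exports.

```python
from datetime import datetime
from statistics import mean
from typing import List, Optional, Tuple

Sample = Tuple[datetime, float]  # (timestamp, latency in ms)


def split_by_window(
    samples: List[Sample], start_hour: int = 1, end_hour: int = 5
) -> Tuple[List[float], List[float]]:
    """Split latencies into (inside window, outside window) by hour of day."""
    inside = [lat for ts, lat in samples if start_hour <= ts.hour < end_hour]
    outside = [lat for ts, lat in samples if not (start_hour <= ts.hour < end_hour)]
    return inside, outside


def window_ratio(
    samples: List[Sample], start_hour: int = 1, end_hour: int = 5
) -> Optional[float]:
    """Mean latency inside the window divided by mean latency outside it.

    Returns None when either side has no samples to compare.
    """
    inside, outside = split_by_window(samples, start_hour, end_hour)
    if not inside or not outside:
        return None
    return mean(inside) / mean(outside)


# Hypothetical samples, not real measurements.
samples = [
    (datetime(2015, 1, 1, 2, 0), 4.0),
    (datetime(2015, 1, 1, 3, 0), 6.0),
    (datetime(2015, 1, 1, 5, 0), 3.0),
    (datetime(2015, 1, 1, 12, 0), 2.0),
]
```

A ratio well above 1 would suggest that latency is higher during the window, which you could then compare against the scrub schedule actually configured on your system.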