2010-10-18 04:02 PM
Hi, this one requires a little setup - I hope it's clear:
We are upgrading 2 campus area NetApp 3040 clusters to 3170s running 7.3.3.
The snapmirror destination has already been upgraded to the 3170.
We run all NFS served by vFilers. Clients are VMware, Oracle, NFS logs, apache web content etc
Our snapmirrors are all set up with the vfiler dr configure command line syntax.
Our plan for upgrading the remaining production 3040 cluster to the 3170 was to initiate a vfiler failover, as we have practiced and documented several times in the past (on 7.2.x):
1) Suspend VMware VMs, shutdown Oracle, tomcats, apache etc
2) vfiler stop (on 3040)
3) snapmirror update for all volumes
4) vfiler dr activate for all vfilers to "promote" the 3170 from snapmirror destination to production
5) re-animate VMware VMs, restart Oracle, tomcats, apache etc
6) upgrade the 3040 to 3170
7) re-establish DR vfilers and snapmirrors
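The sequence above can be written down as an ordered dry-run plan, which makes it easier to review before a maintenance window. This is only a sketch: it builds a list of steps and executes nothing. The function name `build_failover_plan`, its parameters, and the example filer, vfiler, and volume names are hypothetical, and the command strings simply echo the command names used in this post; the exact arguments are an assumption and should be checked against the man pages for your Data ONTAP release.

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (where the step runs, what to do)


def build_failover_plan(source: str, dest: str,
                        vfilers: List[str], volumes: List[str]) -> List[Step]:
    """Return the failover steps from the post, in order. Nothing is executed."""
    plan: List[Step] = [
        ("clients", "suspend VMware VMs; shut down Oracle, tomcats, apache"),
    ]
    # Stop every vfiler on the source before the final transfer.
    plan += [(source, f"vfiler stop {v}") for v in vfilers]
    # Final incremental transfer (update, not initialize).
    # Assumption: run on the destination, one command per volume.
    plan += [(dest, f"snapmirror update {vol}") for vol in volumes]
    # Promote the DR vfilers on the destination.
    # Assumption: the argument form <vfiler>@<source filer>.
    plan += [(dest, f"vfiler dr activate {v}@{source}") for v in vfilers]
    plan.append(("clients", "resume VMware VMs; restart Oracle, tomcats, apache"))
    plan.append((source, "upgrade the 3040 to 3170"))
    plan.append(("both", "re-establish DR vfilers and snapmirrors"))
    return plan


if __name__ == "__main__":
    steps = build_failover_plan("filer3040", "filer3170", ["vf_test"], ["vol_test"])
    for number, (where, what) in enumerate(steps, start=1):
        print(f"{number}) [{where}] {what}")
```

Keeping the plan as data rather than a script that runs commands means it can be reviewed, diffed against the 7.2.x runbook, and extended with extra steps before anything touches a filer.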
When we followed the steps we had documented in the 7.2.x days (a small-scale test with a small volume encapsulated in a test vfiler), the IP failover did not work as expected. We received duplicate IP messages, and soon after, the VMs in the test vfiler crashed because the ESX host got confused about the NFS datastore at the IP level due to the duplicate IPs.
We have had an open case with Netapp on this for a few weeks now without much progress.
Last week we learned about the vFiler migration functions automated by Provisioning manager (PM).
I tested them today and they worked flawlessly both offline and online, even with a running VM, which stayed uninterrupted while the vfiler (NFS datastore) failover happened. I was particularly impressed to see the "converting to semi-synchronous snapmirror" messages.
The IP failover seemed to be handled without the VMs or ESX boxes even logging a single timeout warning.
However, the PM vFiler migration solution, while seemingly free of duplicate IP issues, does not perfectly suit our vFiler failover goals since:
1) our vfiler and snapmirror relationships are already established and initialized. In fact, re-initializing our terabytes of snapmirrors from scratch would take days - we want to use snapmirror update, not initialize
2) we want to end up not with the old vfiler being left in "needs cleanup" mode as the PM vfiler migration does, but we want to re-establish the snapmirrors in reverse from where we left them off (avoiding lengthy initialization times)
So my question is:
Does PM provide a method to discover existing vfiler DR relationships and provide administratively initiated failover automation ?
Failing that, could we get the "script" PM is using to automate the vFiler migration and modify it for our needs?
(PM already provides an input for a user customized script for vFiler migration)
thanks for any feedback,
2010-10-18 04:13 PM
Provisioning Manager does not work with vFiler DR unfortunately. I have also run into the duplicate IP issue. The workaround is to ifconfig 0.0.0.0 down on the interface that was bound to the source vfiler prior to vfiler dr activate on the target. For some reason as of 7.3.3+ the source system keeps the IP broadcasting even after the vfiler is brought down (I submitted a request to file a burt but haven't seen a burt filing yet). Another one I found is on vfiler migrate complete, the snapmirror protocol is stopped on the source physical controller which might not be good if you have other mirrors running.
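The ordering in that workaround (release the IP on the source interface first, then activate on the target) can be sketched as a small helper that adds the extra step to a list of planned steps. This is a hedged illustration only: `add_ip_release_steps`, the `vfiler_ifaces` mapping, and the interface name `e0a` are hypothetical, the inserted step is a plain-language description rather than a literal command, and the `<vfiler>@<source filer>` argument form is an assumption.

```python
from typing import Dict, List


def add_ip_release_steps(steps: List[str],
                         vfiler_ifaces: Dict[str, str]) -> List[str]:
    """Insert an IP-release step on the source right before each 'vfiler dr activate'."""
    out: List[str] = []
    for step in steps:
        if step.startswith("vfiler dr activate "):
            # In this sketch the activate target looks like "<vfiler>@<source filer>".
            vfiler = step.split()[3].split("@")[0]
            iface = vfiler_ifaces.get(vfiler)
            if iface is None:
                raise KeyError(f"no source interface recorded for vfiler {vfiler!r}")
            out.append(f"[source] clear the IP (0.0.0.0) and bring down interface {iface}")
        out.append(step)
    return out


if __name__ == "__main__":
    planned = [
        "vfiler stop vf_test",
        "snapmirror update vol_test",
        "vfiler dr activate vf_test@filer3040",
    ]
    for line in add_ip_release_steps(planned, {"vf_test": "e0a"}):
        print(line)
```

Raising an error when no interface is recorded for a vfiler is deliberate: a missing entry would otherwise silently skip the step that prevents the duplicate IP.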
Provisioning Manager can be used to create a new vfiler and replicate all the volumes in the source vfiler to the target system, but not using the vfiler dr method.
I don't know of any way to get around initializing the mirrors for data motion or vfiler migrate (no method to set it up without an initialization). For vfiler dr, you can manually create the target vfiler (if all mirrors already exist) then vfiler dr resync to get around having to re-init all the mirrors.
2010-10-18 08:20 PM
As I understand it, the requirement is to reuse existing snapmirror relationships for vfiler migrate. If so, there is no way to achieve this in 4.0, and there are no workarounds either.
ProvMgr needs to create all snapmirror relationships for vfiler migration (both online and offline).
If this is a one-off case, you have to perform the initialize again. Unfortunately there is no workaround.
If there are other use cases for this requirement, please work with product management to get it into a future release.
2010-10-19 06:44 AM
Would Netapp be able to share the "script" behind PM vFiler migration so we can modify it for our needs?
(eg take out the initialization step)
2010-10-26 02:14 PM
With the goal of avoiding the long snapmirror initialization I was playing around with importing an existing vfiler -> DR vfiler relationship and managed to get one imported.
What’s not clear, however, is whether the failover offered under Protection Manager (especially given the “if a final update from primary to secondary storage is necessary...” language) will do the same semi-synchronous snapmirror and IP failover as seamlessly as vFiler migration does.
2010-10-26 05:34 PM
I believe you are referring to Protection Manager's external relationship page (where you import relationships into datasets with a protection policy). I believe you are importing a relationship into a dataset with a DR-based protection policy and trying to fail over the dataset (under the disaster recovery tab).
I'm afraid it won't work. The Failover button you see would simply break the VSM or QSM relationship (and make the destination writable). It has got nothing to do with vFiler DR failover or vFiler migrate cutover.
I'm really sorry to be a party pooper
Thanks and regards
2010-10-26 07:52 PM
If I understand correctly, you want to reuse the existing vFiler DR relationship that you have created to do a Data Motion without re-initialization.
This is not possible today, as nagender said in the previous post.