ONTAP Discussions

vFiler online migration not available between storage system and its partner?

fletch2007

Hi - I get this when attempting a migration between two heads in the cluster:


Conformance Results

=== SEVERITY ===
Error:    Online Migration is not allowed between a storage system and its partner.
=== REASON ===
Migration not supported (Error 23509)
=== SUGGESTION ===
No suggested corrective action is available.

Offline migration is available, but warns that "data will be unavailable during cutover".

Q1: Why is online migration not available between the two heads of an HA pair? Is it ARP cache related?

Q2: Offline may work for this vFiler if the cutover (~1-2 minutes in the online case) operates the same way (goes into semi-sync, then cuts over). What will the NFS clients see during the cutover? A timeout for the 1-2 minutes?

thanks


adaikkap

Hi,

Q1: Why is online migration not available between the two heads of an HA pair? Is it ARP cache related?

If I remember correctly, it's due to the limitation of semi-sync SnapMirror, which is not allowed between cluster pairs.

Q2: Offline may work for this vFiler if the cutover (~1-2 minutes in the online case) operates the same way (goes into semi-sync, then cuts over). What will the NFS clients see during the cutover? A timeout for the 1-2 minutes?

Offline vFiler migration uses only asynchronous SnapMirror, not semi-sync SnapMirror.
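
The async transfers that an offline migrate performs can be watched with the standard SnapMirror status command on the source filer (the prompt name below is a placeholder):

source-filer> snapmirror status

The Lag column shows how long ago each destination volume was last updated, and the Status column shows any transfer still in progress; the outage at cutover is roughly the time needed to transfer whatever changed since that last update, plus stopping and starting the vFiler.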

Regards

adai

scottgelb

Review the restrictions and considerations in TR3814.  Data Motion is not supported between cluster pairs.

You can, however, use vfiler migrate, but it doesn't give the guaranteed 120-second failover... although it will often finish in that time if the mirrors are up to date. There is also SnapMover ("vfiler migrate -m nocopy"), which uses disk reassignment between cluster pairs (or V-Series neighborhoods). The requirement for SnapMover is that the vFiler owns every volume in the aggregates it uses; if so, it is an option for migrating between partner nodes without copying data. So if there is no mix of volumes from other vFilers in the aggregates used by the vFiler, look at using SnapMover.
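
For reference, the SnapMover form is run on the destination node and looks roughly like this (the vFiler and filer names below are placeholders):

destination-filer> vfiler migrate -m nocopy my-vfiler@source-filer

Because the disks are reassigned rather than copied, the vFiler must own every volume in the aggregates it uses, as noted above.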

fletch2007

Thanks for the replies - FYI, I tested an offline migration via Provisioning Manager - 5/5 times it failed at cutover with "Completed with errors".

I tried from the command line and it worked the first time:

Data transfer for vfiler storage units initiated.
This can be a long-running operation.
Stopping remote vfiler....
Updating vfiler storage units....
Starting snapmirror update commands. It
could take a very long time when the source or
destination filers are involved in many
simultaneous transfers. The console will not be
available until all update commands are
started successfully. Please use the
"snapmirror status" command on the source
filer to monitor the progress.

Mon Nov 29 14:19:10 PST [irt-na02: vFiler.etcStorage.readOnly:warning]: vFiler vm65-vf-01: vFiler cannot be started because the volume containing its root, volume vm65w, is read-only.

Waiting for "vm65w" to become stable.
Waiting for "vm65w" to become stable.
Waiting for "vm65w" to become stable.
Mon Nov 29 14:20:26 PST [vm65-vf-01@irt-na02: vf.started:debug]: vfiler: 'vm65-vf-01' started

Vfiler vm65-vf-01 migrated.
Mon Nov 29 14:20:30 PST [irt-na02: kern.cli.cmd:debug]: Command line input: the command is 'ifconfig'. The full command line is 'ifconfig irt-na02-vif-65 alias 171.65.65.221 netmask 255.255.255.0'.
Mon Nov 29 14:20:30 PST [irt-na02: cmds.vf.migrate.complete:info]: vFiler unit: 'vm65-vf-01' migrated from remote storage system: 'irt-na01'.
irt-na02> vfiler status
vfiler0                          running
vm65-vf-01                       running

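The paste above starts at the transfer messages, so the invocation itself isn't shown; based on the names in the log, the one-step form run on the destination node would have been something like:

irt-na02> vfiler migrate vm65-vf-01@irt-na01

There is also a two-step form (vfiler migrate start, followed later by vfiler migrate complete) that does the baseline copy first, so the outage is limited to the final update and the vFiler stop/start.
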
thanks

adaikkap

Can you paste the output of the vFiler offline migrate job, so that we can see what error it's throwing?

Regards

adai

sinhaa

Hello fletch2007,

What is your DFM version? AFAIR there was a discussion once about the offline cutover failing between a storage system and its partner. I'm wondering what the error message is. Please provide the job details and your DFM version.

If you want to do an online migration between a storage system and its partner, disable CF on the storage system using the command 'cf disable', run a round of DFM host discovery (or wait for the monitors to discover the change), and then try the online migration. It should pass.
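
A minimal version of that sequence, assuming the controller names from earlier in the thread (run the first command on the node, the second on the DFM server):

irt-na01> cf disable
dfm host discover irt-na01

Once DFM sees that cluster failover is disabled, start the online migration from Provisioning Manager as usual, and run 'cf enable' again when the migration is finished.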

I hope this helps.

warm regards,

Abhishek


adaikkap

If you are not OK with disabling CF for the whole duration from migrate start through baseline completion, you can enable it after migrate start and disable it again before cutover.

But for DFM to detect the change it will take at least 5 minutes, as the cluster failover monitor runs every 5 minutes.

If you don't wish to wait, you can force a discovery, as Abhishek suggested, with the CLI command 'dfm host discover'.
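
Putting the two suggestions together, CF only has to stay off around migrate start and around cutover; roughly (the controller name is a placeholder, and the start/cutover steps themselves are driven from Provisioning Manager):

irt-na01> cf disable
dfm host discover irt-na01
   ...start the online migration; the baseline copy begins...
irt-na01> cf enable
   ...wait for the baseline to complete, with HA protection back on...
irt-na01> cf disable
dfm host discover irt-na01
   ...perform the cutover...
irt-na01> cf enable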

Regards

adai

scottgelb

This is a good workaround to know... but is it supported? All the documentation states there is no Data Motion between cluster pairs. I like hacks... however, if there is no official support, it would be important to mention "use at your own risk" and that, per all the documentation, the GSC will respond that this is not supported at this time. But if disabling the cluster for Data Motion is supported, that would be good to know too.

tanmoy

CF is a high-availability feature, and in the case of policy-based or automated provisioning, disabling CF might result in a breach of SLA.

The workaround in this case might work, but it's not documented in the Data Motion guides. The lifecycle of Data Motion includes cutover, rollback, retry-cutover, clean-up, and so on; if the customer disables CF for all of these and one of the controller heads goes down during one of them, there will be no way to access the storage, which would also defeat the purpose of the migration.

Offline migration can be the solution for such cases!

Thanks

Tanmoy

rweeks_1

Disabling your HA pair is not supported. We do not recommend this method, as it leaves your HA pair in a state that does not protect your environment.
