vFiler migrate fails with "Communication to remote filer failed"

aptare

Hi,

I’m evaluating a vFiler migrate but have been unsuccessful so far.

I have two 7.3.5 simulators, named acefiler3 and acefiler4.

I’ve licensed MultiStore on both filers. I have SnapMirror, SnapVault, HTTP and other licenses installed. All other settings and options are similar, including httpd.admin.enable and httpd.enable.

I created a vFiler on acefiler3 called acevfiler3, then created a dataset on this vFiler and took a snapshot of it.

I can ping acefiler3 and acevfiler3 from acefiler4, and acefiler4 from acefiler3 and acevfiler3. They are all running as VMs on the same ESX server, so no firewalls are in play.

I then try to migrate the vFiler acevfiler3 with the Start migration button under Hosts/vFiler Units in the NetApp Management Console (3.0.1). The DFM version is 4.0.1.

I select acefiler4 as the destination system.

When I finish the wizard I get the following job messages:

Job Started
Successfully provisioned flexible volume ‘acefiler4:/acevfiler3_root’(534) of size 50.0MB with space guarantee set to ‘volume’ on aggregate ‘acefiler4:aggr2’(305).
Successfully provisioned flexible volume ‘acefiler4:/DatasetVthree’(537) of size 62.5MB with space guarantee set to ‘volume’ on aggregate ‘acefiler4:aggr1’(307).
An error occurred: acefiler4.corp: Communication to remote filer failed
Destroyed the provisioned volume ‘acefiler4:/acevfiler3_root’(534) on ‘acefiler4.corp’(304).
Destroyed the provisioned volume ‘acefiler4:/DatasetVthree’(537) on ‘acefiler4.corp’(304).
Job completed with errors

The filer syslogs do not say much.

Can anyone assist me or give me pointers on how to troubleshoot this error?

1 ACCEPTED SOLUTION

sinhaa

Hello and welcome to the communities. You did not mention whether you are using Online or Offline migration, but judging by the size of the volumes I would say it was Offline. The error "Communication to remote filer failed" is common in our test environment, but this is the first time I have seen it reported by a customer.

I believe this is a migrate start job and that it fails almost immediately.

The common reasons for this may be one of the following:

1. Your source and destination filers are unable to ping each other using the FQDN. Pinging by IP address isn't enough. The filers must be reachable from each other in both directions, i.e. source --> destination and also destination --> source, using the FQDN. This bidirectional requirement applies mainly to Online migration, but it is worth checking here to be on the safe side.

If the filers cannot ping each other, try to set up the DNS entries properly. Alternatively, add entries for the filers to the /etc/hosts files on both source and destination. If you need information on how to do this, we can continue on this point in later posts.

2. Possibly, but less likely, the snapmirror options on both filers aren't set correctly. Do you see any error message on the destination filer's console when you run migrate start and this error appears? Can you post the output of the command "options snapmirror" from both filers?
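
As a rough pre-check for point 1, the sketch below tests whether a list of FQDNs resolves from the machine it runs on. It is only an illustration: the function name `check_fqdns` is made up for this example, and a name that resolves on an admin workstation does not prove that the filers can resolve each other, so the ping tests on the filers themselves are still needed.

```python
import socket


def check_fqdns(fqdns, resolve=socket.gethostbyname):
    """Return {fqdn: ip_or_None} for each name.

    `resolve` is injectable so the function can be tried without DNS;
    by default it uses the local resolver of the machine running this.
    """
    results = {}
    for name in fqdns:
        try:
            results[name] = resolve(name)
        except OSError:
            # socket.gaierror (lookup failure) is a subclass of OSError
            results[name] = None
    return results
```

Calling `check_fqdns(["acefiler3.corp", "acefiler4.corp"])` returns a dictionary in which any name mapped to `None` failed to resolve locally.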

warm regards,

Abhishek

4 REPLIES

kirana

Hi,

I have one more possible troubleshooting step to suggest.

STEP 3:

If SSH is set as the login protocol (DFM-to-filer login) for the online migration 'source' filer, then the vfiler-migrate-start procedure uses SSH as the communication protocol between the destination and source filers. This requires 'secureadmin setup ssl' to be run on both the source and destination filers. You can do this and retry the online migration.

Otherwise, you can set rsh as the login protocol for the source filer and retry the online migration.

aptare

Hi, thanks for the two replies (I got tied up last week on another problem).

I did have some inconsistencies (the vFiler name was not in DNS and the vFiler snapmirror options were not enabled). I have fixed these but still get the same error as before.

My two filers and the vFiler can see each other with ping on the FQDN (although the domain name is just on our test network):

acefiler3> ping acefiler4.corp
acefiler4.corp is alive
acefiler3> ping acevfiler2.corp
acevfiler2.corp is alive
acefiler3> vfiler run acevfiler2 ping acefiler4.corp
===== acevfiler2
acefiler4.corp is alive

acefiler4> ping acefiler3.corp
acefiler3.corp is alive
acefiler4> ping acevfiler2.corp
acevfiler2.corp is alive

SnapMirror options are consistent:

acefiler3> options snapmirror
snapmirror.access            *
snapmirror.checkip.enable    off
snapmirror.delayed_acks.enable on
snapmirror.enable            on
snapmirror.log.enable        on
snapmirror.vbn_log_enable    off

acefiler3> vfiler run acevfiler2 options snapmirror
===== acevfiler2
snapmirror.access            *
snapmirror.checkip.enable    off
snapmirror.enable            on

acefiler4> options snapmirror
snapmirror.access            *
snapmirror.checkip.enable    off
snapmirror.delayed_acks.enable on
snapmirror.enable            on
snapmirror.log.enable        on
snapmirror.vbn_log_enable    off
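
Comparing these listings by eye is error-prone, so here is a minimal sketch of how pasted `options snapmirror` output could be compared programmatically. The helper names `parse_options` and `diff_options` are hypothetical, and the parser assumes each option appears on one line as a name followed by a single value, as in the output above.

```python
def parse_options(text):
    """Parse 'name value' lines from pasted `options` output into a dict.

    Prompt lines (three or more tokens) and vfiler banners such as
    '===== acevfiler2' are skipped.
    """
    opts = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and not parts[0].startswith("="):
            opts[parts[0]] = parts[1]
    return opts


def diff_options(a, b):
    """Return {name: (value_in_a, value_in_b)} for options that differ.

    An option missing on one side is reported with None for that side.
    """
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

An empty result from `diff_options` means the two listings match. Options that exist only on the physical filer, as some do in the vFiler listing above, show up with `None` on the vFiler side.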

SSH and SSL are enabled:

acefiler3> secureadmin status
ssh2          - active
ssh1          - inactive
ssl          - active

acefiler4> secureadmin status
ssh2          - active
ssh1          - inactive
ssl          - active

I also enabled ssh on the vFiler - I do not see an option for ssl:

acefiler3> vfiler run acevfiler2 secureadmin status
===== acevfiler2
ssh2          - active
ssh1          - active

acefiler3> vfiler run acevfiler2 secureadmin setup ssl
===== acevfiler2
Usage:
          secureadmin setup [-f] [-q] ssh
          secureadmin enable  all|ssh|ssh1|ssh2
          secureadmin disable all|ssh|ssh1|ssh2
          secureadmin status

After an offline vFiler migration I do get some errors on the acefiler4 console:

acefiler4> acefiler3.corp: premature EOF
: Undefined error: 0
Sun Jul 24 21:36:43 PDT [wafl.vvol.offline:info]: Volume 'acevfiler2_root' has been set temporarily offline
Sun Jul 24 21:36:44 PDT [wafl.vvol.destroyed:info]: Volume acevfiler2_root destroyed.

I also tried an online migration, but I do not have a Synchronous SnapMirror license on acefiler3.corp.

aptare

Maybe I'll hardcode the entries in /etc/hosts. I'll look on the forum for info about this.

Yes, I added the filer names and FQDNs to both hosts files through FilerView and everything works as expected. Thanks for your help.
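
For anyone repeating this fix, the sketch below shows one way to generate the hosts lines so that both files get identical entries. It assumes the usual hosts layout of IP address, FQDN, then short name. The function name `hosts_lines` is hypothetical, and the IP addresses in the usage note are placeholders from the documentation range, not the addresses used in this thread.

```python
def hosts_lines(entries, domain):
    """Build hosts-file lines: IP address, FQDN, then the short name.

    entries: list of (ip, short_name) pairs
    domain:  DNS domain appended to each short name
    """
    return [f"{ip}\t{name}.{domain}\t{name}" for ip, name in entries]
```

For example, `hosts_lines([("192.0.2.13", "acefiler3"), ("192.0.2.14", "acefiler4")], "corp")` yields one tab-separated line per filer, which can then be added to /etc/hosts on both the source and the destination.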
