ONTAP Discussions

ndmp backup fails to complete.

dmandell

FAS3050

Ontap 7.3.5.1

EMC Networker 7.6.1

I am having a terrible time trying to get NDMP backups to work from the NetApp filer.

Hoping someone might have some useful suggestions.

Things that are right and/or working:

1. NDMP enabler is properly licensed in EMC Networker

2. No permissions issues. NDMP connections starting OK.

3. Backup snapshot being created OK on source filer.

The ndmpd options on the filer are as follows:

ndmpd.access                 all       
ndmpd.authtype               challenge 
ndmpd.connectlog.enabled     off       
ndmpd.data_port_range        all       
ndmpd.enable                 on        
ndmpd.ignore_ctime.enabled   off       
ndmpd.offset_map.enable      on        
ndmpd.password_length        16        
ndmpd.preferred_interface    disable   
ndmpd.tcpnodelay.enable      off

Also tried

ndmpd.tcpnodelay.enable      on

and

ndmpd.preferred_interface    <interface name>

Have also tried

ndmpd version 1 - 4 (no difference)

What happens is that the process starts fine and always "hangs" at the same place.

Using ndmp debug, it gets to:

Log message: DUMP: mapping (Pass II)[directories]

Then hangs and finally times out with:

Apr 18 16:22:26 PDT [ndmpd:9]: Associated message valid: 0
Apr 18 16:22:26 PDT [ndmpd:9]: Associated message sequence: 0
Apr 18 16:26:42 PDT [ndmpd:9]: Message NDMP_LOG_MESSAGE sent
Apr 18 16:26:42 PDT [ndmpd:9]: Message Header:
Apr 18 16:26:42 PDT [ndmpd:9]: Sequence 18
Apr 18 16:26:42 PDT [ndmpd:9]: Timestamp 1303169202
Apr 18 16:26:42 PDT [ndmpd:9]: Msgtype 0
Apr 18 16:26:42 PDT [ndmpd:9]: Method 1539
Apr 18 16:26:42 PDT [ndmpd:9]: ReplySequence 0
Apr 18 16:26:42 PDT [ndmpd:9]: Error NDMP_NO_ERR
Apr 18 16:26:42 PDT [ndmpd:9]: Log type: 0
Apr 18 16:26:42 PDT [ndmpd:9]: Message id: 0
Apr 18 16:26:42 PDT [ndmpd:9]: Log message: DUMP: Network communication error
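For anyone reading along, the length of the stall can be measured from the two entries on either side of it, and the Timestamp field in the message header can be cross-checked against the syslog time. This is only a minimal sketch: the function names are made up for illustration, and the year 2011 is an assumption taken from the header timestamp, since the syslog prefix does not carry a year.

```python
from datetime import datetime, timezone

# Two consecutive entries copied from the debug log above.
LOG_LINES = [
    "Apr 18 16:22:26 PDT [ndmpd:9]: Associated message sequence: 0",
    "Apr 18 16:26:42 PDT [ndmpd:9]: Message NDMP_LOG_MESSAGE sent",
]


def log_time(line, year=2011):
    """Parse the 'Mon DD HH:MM:SS' prefix of a log line.

    The year is not present in the log, so it is passed in (assumed 2011).
    """
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def stall_seconds(before, after):
    """Seconds elapsed between two log lines."""
    return int((log_time(after) - log_time(before)).total_seconds())


def header_time_utc(epoch):
    """Convert the header's Timestamp field (Unix epoch seconds) to UTC."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


print(stall_seconds(LOG_LINES[0], LOG_LINES[1]))
print(header_time_utc(1303169202).isoformat())
```

The gap works out to 256 seconds, and the header timestamp converts to 23:26:42 UTC, which is 16:26:42 PDT, so the filer's clock and the header agree. A stall of a fixed length like this looks more like a timeout expiring than like the dump itself failing, though that is an inference and not something the log states.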

and ultimately:

Apr 18 16:26:43 PDT [ndmpd:9]: Associated message valid: 0
Apr 18 16:26:43 PDT [ndmpd:9]: Associated message sequence: 0
Apr 18 16:26:48 PDT [ndmpd:9]: Message: NDMP_NOTIFY_DATA_HALTED sent
Apr 18 16:26:48 PDT [ndmpd:9]: Message Header:
Apr 18 16:26:48 PDT [ndmpd:9]: Sequence 21
Apr 18 16:26:48 PDT [ndmpd:9]: Timestamp 1303169208
Apr 18 16:26:48 PDT [ndmpd:9]: Msgtype 0
Apr 18 16:26:48 PDT [ndmpd:9]: Method 1281
Apr 18 16:26:48 PDT [ndmpd:9]: ReplySequence 0
Apr 18 16:26:48 PDT [ndmpd:9]: Error NDMP_NO_ERR
Apr 18 16:26:48 PDT [ndmpd:9]: Reason: 4
Apr 18 16:26:48 PDT [ndmpd:9]: Message NDMP_LOG_MESSAGE sent
Apr 18 16:26:48 PDT [ndmpd:9]: Message Header:
Apr 18 16:26:48 PDT [ndmpd:9]: Sequence 22
Apr 18 16:26:48 PDT [ndmpd:9]: Timestamp 1303169208
Apr 18 16:26:48 PDT [ndmpd:9]: Msgtype 0
Apr 18 16:26:48 PDT [ndmpd:9]: Method 1539
Apr 18 16:26:48 PDT [ndmpd:9]: ReplySequence 0
Apr 18 16:26:48 PDT [ndmpd:9]: Error NDMP_NO_ERR
Apr 18 16:26:48 PDT [ndmpd:9]: Log type: 0
Apr 18 16:26:48 PDT [ndmpd:9]: Message id: 0
Apr 18 16:26:48 PDT [ndmpd:9]: Log message: Connection or IO Error.
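The numeric Method and Reason fields in the trace can be decoded with a small lookup table. The sketch below covers only the codes that appear in this log; the values are the ones I understand the NDMP v3/v4 specification to define, so check them against the spec before relying on them. The helper name describe is hypothetical.

```python
# Message codes seen in the trace (decimal in the log, hex in the spec).
# Values as I understand the NDMP v3/v4 specification; verify before relying on them.
NDMP_METHODS = {
    0x501: "NDMP_NOTIFY_DATA_HALTED",  # 1281
    0x603: "NDMP_LOG_MESSAGE",         # 1539
}

# Halt reasons carried by NDMP_NOTIFY_DATA_HALTED (same caveat as above).
NDMP_DATA_HALT_REASONS = {
    0: "NDMP_DATA_HALT_NA",
    1: "NDMP_DATA_HALT_SUCCESSFUL",
    2: "NDMP_DATA_HALT_ABORTED",
    3: "NDMP_DATA_HALT_INTERNAL_ERROR",
    4: "NDMP_DATA_HALT_CONNECT_ERROR",
}


def describe(method, reason=None):
    """Return a readable form of a Method code and optional halt Reason."""
    name = NDMP_METHODS.get(method, f"unknown method {method:#x}")
    if reason is None:
        return name
    why = NDMP_DATA_HALT_REASONS.get(reason, f"unknown reason {reason}")
    return f"{name} ({why})"


print(describe(1539))
print(describe(1281, reason=4))
```

Read this way, "Method 1281" with "Reason: 4" is the data service halting because of a connection error, which is consistent with the "Network communication error" and "Connection or IO Error" log messages around it. It points at the data connection and not at authentication or the control session.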

     

After that it aborts the ndmpdump.

Anybody have any ideas or suggestions?

Thanks!

Dana


nitish

Possible causes for this error are as follows:

- Network connectivity and/or name resolution issues between the NetWorker Server and NAS device - verify with nslookup
- NDMP user name or password issue, or incorrect NDMP password authentication method - verify with the inquire -N <NAS_hostname> output
- NDMP service is not started on the NAS device - not in this case
- NIC teaming is implemented on the NetWorker Server or multiple network interfaces are resolving to the same hostname.
     Often times the NetWorker Server or Storage Node is multi-homed with the NICs configured in a NIC teaming mode.
     This is particularly common with HP NICs. Using the appropriate NIC configuration utility (usually installed with the NIC vendor drivers), check the NIC teaming configuration
     and verify it is set to "Fault Tolerant Only". If the NIC teaming configuration is set to "Load Balancing", or any other mode that combines the multiple NICs
     into a single virtual interface, then the NDMP connection will be problematic and will likely fail.
     On servers with multiple NICs it is not uncommon to see the server's hostname resolving to more than one network interface IP address on the server
     (usually set by the system administrator in the ..\etc\hosts file). This is not a valid configuration and should be corrected.
     Assign the server's hostname in the ..\etc\hosts file to the primary network interface of the server only and assign a different hostname to remaining interfaces if required.
- Ensure the NAS and any NetWorker Storage Nodes resolve the NetWorker Server's hostname on the primary network interface IP address only.
- Firewall exists between the NetWorker Server and the NAS device and TCP ports are blocked.
     NDMP connections between the NetWorker Server and NAS are supported with some limitations. The default NDMP connection port 10000
     must always be open on the firewall to facilitate the initial NDMP connection.
     With respect to the Windows firewall feature, especially in the case of Windows 2008, the recommendation for NDMP use is to disable the Windows firewall.
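Two of the checks in the list above (name resolution and reachability of port 10000) can be scripted from the NetWorker server. This is a rough sketch using only the Python standard library; the function names are hypothetical, and it tests only the NDMP control port. The data connection uses a separate port on the filer side (governed by ndmpd.data_port_range, as far as I know), which this does not exercise.

```python
import socket

NDMP_PORT = 10000  # default NDMP control port mentioned above


def resolved_addresses(hostname):
    """Return the sorted, de-duplicated IP addresses the local resolver
    gives for hostname. More than one entry for the backup server's own
    name is the multi-homed situation described above."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def port_is_open(host, port=NDMP_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling resolved_addresses() with the NetWorker server's hostname and port_is_open() with the filer's hostname (substitute your own names) gives a quick yes/no on both points. A successful connect to port 10000 only shows the control session can start, which in this thread it already does.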

dmandell

Nitish,

Thank you for the quick response.

Below are my comments.

Possible causes for this error are as follows:

- Network connectivity and/or name resolution issues between the NetWorker Server and NAS device - verify with nslookup

Name resolution OK


- NDMP user name or password issue, or incorrect NDMP password authentication method - verify with the inquire -N <NAS_hostname> output

NDMP username OK


- NDMP service is not started on the NAS device - not in this case

Service running


- NIC teaming is implemented on the NetWorker Server or multiple network interfaces are resolving to the same hostname.
     Often times the NetWorker Server or Storage Node is multi-homed with the NICs configured in a NIC teaming mode.

No NIC teaming
     This is particularly common with HP NICs. Using the appropriate NIC configuration utility (usually installed with the NIC vendor drivers), check the NIC teaming configuration
     and verify it is set to "Fault Tolerant Only". If the NIC teaming configuration is set to "Load Balancing", or any other mode that combines the multiple NICs
     into a single virtual interface, then the NDMP connection will be problematic and will likely fail.
     On servers with multiple NICs it is not uncommon to see the server's hostname resolving to more than one network interface IP address on the server
     (usually set by the system administrator in the ..\etc\hosts file). This is not a valid configuration and should be corrected.

Only one entry in etc/hosts


     Assign the server's hostname in the ..\etc\hosts file to the primary network interface of the server only and assign a different hostname to remaining interfaces if required.

No duplicate hostname


- Ensure the NAS and any NetWorker Storage Nodes resolve the NetWorker Server's hostname on the primary network interface IP address only.
- Firewall exists between the NetWorker Server and the NAS device and TCP ports are blocked.

No firewall
     NDMP connections between the NetWorker Server and NAS are supported with some limitations. The default NDMP connection port 10000
     must always be open on the firewall to facilitate the initial NDMP connection.
     With respect to the Windows firewall feature, especially in the case of Windows 2008, the recommendation for NDMP use is to disable the Windows firewall

I set up a "test" filer and tested on it, with the same result.

It did work at one time on the original filer.

It then stopped with no intervention on our part. EMC is baffled; they said it was a bug in NetWorker and told us to "upgrade to the latest version of Networker", which we did, to no avail.

Again, the NDMP communication between the server and the filer seems OK.

Pings, nslookup, and the NDMP username/password are all OK.

BTW, ndmpcopy also works fine, just not ndmpdump.
