ONTAP Discussions

NDMP Backup and Restore Issue

Khan2022

Dear Members,

We recently implemented NDMP backup in our environment using Veritas NetBackup 8.2 with a NetApp filer running ONTAP 9.3 (CDOT). NDMP backup runs but has performance issues. Sometimes the NDMP job hangs and keeps running for more than 12 hours; we have to kill the job manually and start it again, after which it works fine.

 

NDMP restore is not working at all and fails with the following errors:

 

Jan 6, 2022 3:55:54 PM - Error ndmpagent (pid=19573) hogarage: RESTORE: Error: Input Error
Jan 6, 2022 3:55:54 PM - Info ndmpagent (pid=19573) hogarage: RESTORE: RESTORE IS ABORTED
Jan 6, 2022 3:55:54 PM - Error ndmpagent (pid=19573) hogarage: DATA: Operation terminated: EVENT: I/O ERROR (for NDMP_TEMP_Restore_Test)

 

Sometimes we get the error below instead, which might be caused by a blocked port:

 

Jan 10, 2022 11:55:20 AM - Error ndmpagent (pid=22637) ndmp_data_connect_v3 failed, status = -1 (-1)

Jan 10, 2022 11:55:20 AM - Error ndmpagent (pid=22637) NDMP restore failed from path /HOSVM/HOSVM_MW_WCCAPPLVP_domain
Jan 10, 2022 11:55:21 AM - Info ndmpagent (pid=22637) done. status: 25
Jan 10, 2022 11:55:21 AM - Info bptm (pid=22638) EXITING with status 5 <----------

 

Could anybody please share their experience or any advice? That would be highly appreciated.

 

Thanks

1 ACCEPTED SOLUTION

Ontapforrum

Could you please check whether a firewall is blocking ports for 'inbound' connections to the Media Server?

 

For Remote NDMP (restore process):
1) First, the control connection is initiated from the Media Server to the Filer on port 10000 (outbound).
2) Next, the data connection is initiated from the Filer to the Media Server on a random port (inbound).
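
The control-connection step above can be checked from the Media Server with a small probe. The following is a minimal sketch, not a NetBackup or ONTAP tool: `check_tcp` and the hostname in the usage comment are hypothetical names chosen for illustration. A firewall that silently drops packets typically shows up as a timeout, while "refused" usually means the host was reached but nothing is listening on that port.

```python
import socket


def check_tcp(host, port, timeout=5.0):
    """Classify one TCP connection attempt as 'open', 'refused', 'timeout' or 'error: ...'."""
    try:
        # create_connection performs the full TCP handshake, then we close immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Host reachable, but no listener on that port.
        return "refused"
    except socket.timeout:
        # No answer at all; often a sign of a firewall dropping packets.
        return "timeout"
    except OSError as exc:
        # DNS failure, unreachable network, and similar problems.
        return f"error: {exc}"


# Example (hypothetical filer name), run from the Media Server:
# print(check_tcp("filer.example.com", 10000))
```

This only covers the outbound control connection. The inbound data connection has to be tested from the Filer side of the firewall, because a probe run on the Media Server itself never crosses that firewall.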


12 REPLIES

TMACMD

Many years ago I had an issue where the process ended at 12 hours!

In my case I was doing a backup, and the indexing process was capped by the application at 12 hours.

I'm not saying your problem is exactly what I had, since I was writing and you are reading, but maybe it is part of reading the index? Does the index span tapes? There should be more logging on the NetBackup side. If not, turn it up and try again to see whether you can better approximate where the error is occurring. In other words, it should log whether it is loading tapes, reading tapes, etc.

It's also possible you have a physical media error.

 

 

Khan2022

Hello TMACMD

Thanks for sharing your comments.

Actually, we are using a disk-based media server; we are not using tapes for NDMP backup jobs.

Ontapforrum

Also, do you mind sharing the full NDMP log from NetBackup, and from the NetApp side: /etc/log/mlog/ndmpd.log?

Is it a remote, local, or 3-way NDMP backup/restore setup?

While googling, I found a NetBackup KB article from Veritas which is perhaps what you are experiencing for the other error:

 

NDMP restore fails with exit status 5, 25 "abort_on_listen_connect_failure"
https://www.veritas.com/support/en_US/article.100032547

 

Once the NDMP control connection is set up between the Filer and the Media Server, the Filer initiates a data connection back to the Media Server (IP:port). This port can be seen in the log and needs to be allowed through the firewall.

For the I/O error, we need to see the NDMP logs from NetApp.

 

Also, I would raise a ticket with Veritas & NetApp to investigate performance issues.

Khan2022

Dear Ontapforrum,

 

Many thanks for your advice and for commenting on my post.

Kindly find attached one of the NDMP logs, taken while running an NDMP restore job.

I believe it is a Remote NDMP backup.

 

Ontapforrum

Where is this IP?
tcp_addr[0].tcp_addr=172.16.2.230, port=2724

 

Could you share the backup (NAS/NDMP) logs from NetBackup?
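
For anyone scanning a long NDMP log for the data-connection endpoint, address lines like the one quoted above can be pulled out with a short script. This is a rough sketch that assumes the `tcp_addr[N].tcp_addr=<ip>, port=<n>` layout seen in this thread; other NetBackup versions may format the line differently, and `data_endpoints` is a made-up helper name.

```python
import re

# Pattern derived from the single line quoted in this thread, not from a documented format.
DATA_ADDR = re.compile(
    r"tcp_addr\[\d+\]\.tcp_addr=(\d{1,3}(?:\.\d{1,3}){3}),\s*port=(\d+)"
)


def data_endpoints(lines):
    """Return every (ip, port) pair found in the given log lines, in order of appearance."""
    found = []
    for line in lines:
        for match in DATA_ADDR.finditer(line):
            found.append((match.group(1), int(match.group(2))))
    return found


# Example: feed it the lines of a log file (the path is up to you).
# with open("ndmp.log") as fh:
#     print(data_endpoints(fh))
```

The resulting IP and port pairs are the ones to compare against the firewall rules between the Filer and the Media Server.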

Khan2022

This is the IP address of our media server, which is assigned as the NDMP host.

The log I shared before is from NetBackup. I don't have access to the Filer, so I am not able to get ndmpd.log.

 

I have attached one more log file, taken from the NDMP log of the media server.

Ontapforrum

OK, I thought so: because you mentioned remote NDMP, it must be the Media Server.

The NDMP logs from the Filer are key. Could you reach out to someone on the storage team who can give you the logs from the Filer?

I looked at both logs. There is no error reported in the Media Server logs; I can see the data pipe is set up OK and the read offset and length are read successfully, but after that there isn't much.

Also, what is the Filer/OS version?

Khan2022

Kindly check the job details below, showing the error we get when we try to run an NDMP restore job.

 

JOB DETAILS:

Jan 17, 2022 2:20:32 PM - Info ndmpagent (pid=20475) INF - Restoring NDMP files from /HOSVM/HOSVM_MW_WCCAPPLVP_domain/ to [See line below]
Jan 17, 2022 2:20:32 PM - Info ndmpagent (pid=20475) INF - Restoring NDMP files from [See line above] to /HOSVM/NDMP_TEMP_Restore_Test
Jan 17, 2022 2:20:32 PM - Info ndmpagent (pid=20475) NDMP Remote disk
Jan 17, 2022 2:20:34 PM - Info ndmpagent (pid=20475) This is CDOT restore
Jan 17, 2022 2:20:34 PM - Info ndmpagent (pid=20475) DAR disabled - continuing restore without DAR
Jan 17, 2022 2:20:34 PM - Info ndmpagent (pid=20475) Attempting normal restore.
Jan 17, 2022 2:20:35 PM - Info ndmpagent (pid=20475) hogarage: Session identifier for Restore : 60878
Jan 17, 2022 2:30:36 PM - Error ndmpagent (pid=20475) hogarage: RESTORE: Error: Input Error
Jan 17, 2022 2:30:36 PM - Info ndmpagent (pid=20475) hogarage: RESTORE: RESTORE IS ABORTED
Jan 17, 2022 2:30:36 PM - Error ndmpagent (pid=20475) hogarage: DATA: Operation terminated: EVENT: I/O ERROR (for NDMP_TEMP_Restore_Test)
Jan 17, 2022 2:30:37 PM - Error ndmpagent (pid=20475) NDMP restore failed from path /HOSVM/HOSVM_MW_WCCAPPLVP_domain
Jan 17, 2022 2:30:37 PM - Info ndmpagent (pid=20475) done. status: 5
Jan 17, 2022 2:30:37 PM - end reading; read time: 0:10:06
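
Job details like the ones above can be parsed to see how long the job sat silent before the first error was logged. The sketch below is an illustration only: the line pattern is inferred from the lines quoted in this thread rather than from a documented NetBackup format, the function names are hypothetical, and the month parsing assumes an English locale. A long silent gap is at most a hint that the job was waiting on something (for example a connection) rather than failing immediately.

```python
import re
from datetime import datetime

# Layout assumed from this thread: "<timestamp> - <Level> <process> (pid=<n>) <message>"
LINE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) - "
    r"(?P<level>\w+) (?P<proc>\w+) \(pid=(?P<pid>\d+)\) (?P<msg>.*)$"
)
TS_FORMAT = "%b %d, %Y %I:%M:%S %p"


def parse_job_details(lines):
    """Parse matching lines into dicts; lines that do not fit the pattern are skipped."""
    entries = []
    for line in lines:
        m = LINE.match(line.strip())
        if not m:
            continue
        entries.append({
            "ts": datetime.strptime(m["ts"], TS_FORMAT),
            "level": m["level"],
            "proc": m["proc"],
            "pid": int(m["pid"]),
            "msg": m["msg"],
        })
    return entries


def silent_gap_before_first_error(entries):
    """Seconds between the first Error entry and the entry just before it, or None."""
    for index, entry in enumerate(entries):
        if entry["level"] == "Error":
            if index == 0:
                return None  # nothing logged before the error
            return (entry["ts"] - entries[index - 1]["ts"]).total_seconds()
    return None  # no Error entry at all
```

Applied to the job details above, the entry before the first error is the "Session identifier" line at 2:20:35 PM and the first error is at 2:30:36 PM, so the function reports a gap of about ten minutes, in line with the "read time: 0:10:06" shown in the last line.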

Khan2022

The Filer OS version is ONTAP 9.3

 

Ontapforrum

Could you please check whether a firewall is blocking ports for 'inbound' connections to the Media Server?

 

For Remote NDMP (restore process):
1) First, the control connection is initiated from the Media Server to the Filer on port 10000 (outbound).
2) Next, the data connection is initiated from the Filer to the Media Server on a random port (inbound).

Khan2022

I believe the ports are open from the Media Server to the Filer on port 10000 (outbound).

We also previously requested that the ports be opened bidirectionally.

So the root cause of this I/O error could be a blocked port, or it could be something else.

Jan 17, 2022 2:30:36 PM - Error ndmpagent (pid=20475) hogarage: DATA: Operation terminated: EVENT: I/O ERROR

Our storage admin has already logged a case with NetApp support, but so far we haven't had any response from them.

Your experience and useful comments are much appreciated.

 

Khan2022

Many thanks, Ontapforrum, for your kind advice.

It was found that ports were blocked from the NetApp filer to the Media Server, which is why the restore process was failing.

The workaround of opening ports 65000-65009 on the firewall worked in our environment.

God bless you and all other members for their comments.

Thanks
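
For others who want to confirm that an inbound port range such as the one mentioned above really is open, one option is to listen on those ports on the Media Server for a short window and record which ones receive a connection from a test host on the Filer side of the firewall. This is only a sketch under assumptions: `open_listeners` and `collect_peers` are hypothetical helpers, the range 65000-65009 is simply the one that worked in this environment, and binding will fail if NetBackup or another process is already using those ports, so it is best tried while no NDMP jobs are running.

```python
import selectors
import socket
import time


def open_listeners(ports, bind_addr="0.0.0.0"):
    """Bind and listen on every port; return {port: listening socket}."""
    listeners = {}
    try:
        for port in ports:
            srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            listeners[port] = srv
            srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            srv.bind((bind_addr, port))
            srv.listen()
            srv.setblocking(False)
    except OSError:
        # Clean up anything already opened, then report the failure to the caller.
        for srv in listeners.values():
            srv.close()
        raise
    return listeners


def collect_peers(listeners, window=30.0):
    """Accept connections for `window` seconds; return {port: set of peer IPs}.

    The listening sockets are closed before returning.
    """
    seen = {port: set() for port in listeners}
    with selectors.DefaultSelector() as sel:
        for port, srv in listeners.items():
            sel.register(srv, selectors.EVENT_READ, data=port)
        deadline = time.monotonic() + window
        while (remaining := deadline - time.monotonic()) > 0:
            for key, _ in sel.select(timeout=remaining):
                try:
                    conn, peer = key.fileobj.accept()
                except BlockingIOError:
                    continue  # connection vanished before it could be accepted
                seen[key.data].add(peer[0])
                conn.close()
    for srv in listeners.values():
        srv.close()
    return seen


# Example on the Media Server (range(65000, 65010) covers ports 65000-65009):
# print(collect_peers(open_listeners(range(65000, 65010)), window=60.0))
```

Ports that still map to an empty set after the window either received no connection attempt or were blocked somewhere on the path, so the result is only meaningful if the test host actually tried every port.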
