ONTAP Discussions
Dear Members,
We recently implemented NDMP backup in our environment with Veritas NetBackup 8.2 against a NetApp ONTAP 9.3 (cDOT) system. NDMP backup runs but has performance issues. Sometimes an NDMP job hangs and keeps running for more than 12 hours; we have to kill the job manually and start it again, after which it works fine.
NDMP restore is not working at all and fails with the following errors:
Jan 6, 2022 3:55:54 PM - Error ndmpagent (pid=19573) hogarage: RESTORE: Error: Input Error
Jan 6, 2022 3:55:54 PM - Info ndmpagent (pid=19573) hogarage: RESTORE: RESTORE IS ABORTED
Jan 6, 2022 3:55:54 PM - Error ndmpagent (pid=19573) hogarage: DATA: Operation terminated: EVENT: I/O ERROR (for NDMP_TEMP_Restore_Test)
Sometimes we get the error below instead, which might be caused by a blocked port:
Jan 10, 2022 11:55:20 AM - Error ndmpagent (pid=22637) ndmp_data_connect_v3 failed, status = -1 (-1)
Jan 10, 2022 11:55:20 AM - Error ndmpagent (pid=22637) NDMP restore failed from path /HOSVM/HOSVM_MW_WCCAPPLVP_domain
Jan 10, 2022 11:55:21 AM - Info ndmpagent (pid=22637) done. status: 25
Jan 10, 2022 11:55:21 AM - Info bptm (pid=22638) EXITING with status 5 <----------
Can anybody please share their experience or any advice? It would be highly appreciated.
Thanks
Many years ago I had an issue where the process ended at 12 hours!
In my case I was doing a backup, and the indexing process was capped by the application at 12 hours.
I'm not saying your problem is exactly what I had, since I was writing and you are reading, but maybe it is part of reading the index? Does the index span tapes? There should be more logging on the NetBackup side. If not, turn it up and try again to see if you can better narrow down where the error is occurring. In other words, it should log whether it is loading tapes, reading tapes, etc.
It's also possible you have a physical media error.
Hello TMACMD
Thanks for sharing your comments.
Actually, we are using a disk-based media server; we are not using tapes for NDMP backup jobs.
Also, do you mind sharing the full NDMP log from NetBackup, and /etc/log/mlog/ndmpd.log from the NetApp?
Is it a remote, local, or 3-way NDMP backup/restore setup?
While googling, I found a NetBackup KB article from Veritas that perhaps describes what you are experiencing for the other error:
NDMP restore fails with exit status 5, 25 "abort_on_listen_connect_failure"
https://www.veritas.com/support/en_US/article.100032547
Once the NDMP control connection is set up between the filer and the media server, the filer initiates the data connection back to the media server (IP:port). This port can be seen in the log and needs to be allowed through the firewall.
For the I/O error, we need to see the NDMP logs from the NetApp.
Also, I would raise a ticket with Veritas and NetApp to investigate the performance issues.
Where is this IP?
tcp_addr[0].tcp_addr=172.16.2.230, port=2724
Could you share the backup (NAS/NDMP) logs from NetBackup?
This is the IP address of our media server, which is assigned as the NDMP host.
The log I shared before is from NetBackup. Since I don't have access to the filer, I won't be able to get ndmpd.log.
I have attached one more log file, taken from the NDMP log of the media server.
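Since the data-connection port appears in the log, it can be pulled out with a short script. The following is a minimal Python sketch, assuming the log lines follow the same `tcp_addr[N].tcp_addr=<ip>, port=<port>` shape as the single line quoted above; the function name `extract_data_endpoints` is made up for illustration, and other NetBackup versions may log the address differently.

```python
import re

# Pattern modelled on the one log line quoted in this thread; any other
# layout is not covered and would need its own pattern.
TCP_ADDR_RE = re.compile(
    r"tcp_addr\[\d+\]\.tcp_addr=(\d{1,3}(?:\.\d{1,3}){3}),\s*port=(\d+)"
)

def extract_data_endpoints(log_text):
    """Return a list of (ip, port) tuples found in the given log text."""
    return [(ip, int(port)) for ip, port in TCP_ADDR_RE.findall(log_text)]

# Example call with the line from the thread:
# extract_data_endpoints("tcp_addr[0].tcp_addr=172.16.2.230, port=2724")
```

The resulting IP and port pairs are the ones the firewall team would need to allow from the filer to the media server.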
OK, I thought so, because you mentioned remote NDMP, so it must be the media server.
NDMP logs from the filer are key. Could you reach out to someone on the storage team who can give you the logs from the filer?
I looked at both logs. No error is reported in the media server logs: I can see the data pipe is set up OK and the read offset and length are read successfully, but after that there isn't much.
Also, what is the filer/OS version?
Kindly check the job details below, showing the error we get when we try to run an NDMP restore job.
JOB DETAILS:
Jan 17, 2022 2:20:32 PM - Info ndmpagent (pid=20475) INF - Restoring NDMP files from /HOSVM/HOSVM_MW_WCCAPPLVP_domain/ to [See line below]
Jan 17, 2022 2:20:32 PM - Info ndmpagent (pid=20475) INF - Restoring NDMP files from [See line above] to /HOSVM/NDMP_TEMP_Restore_Test
Jan 17, 2022 2:20:32 PM - Info ndmpagent (pid=20475) NDMP Remote disk
Jan 17, 2022 2:20:34 PM - Info ndmpagent (pid=20475) This is CDOT restore
Jan 17, 2022 2:20:34 PM - Info ndmpagent (pid=20475) DAR disabled - continuing restore without DAR
Jan 17, 2022 2:20:34 PM - Info ndmpagent (pid=20475) Attempting normal restore.
Jan 17, 2022 2:20:35 PM - Info ndmpagent (pid=20475) hogarage: Session identifier for Restore : 60878
Jan 17, 2022 2:30:36 PM - Error ndmpagent (pid=20475) hogarage: RESTORE: Error: Input Error
Jan 17, 2022 2:30:36 PM - Info ndmpagent (pid=20475) hogarage: RESTORE: RESTORE IS ABORTED
Jan 17, 2022 2:30:36 PM - Error ndmpagent (pid=20475) hogarage: DATA: Operation terminated: EVENT: I/O ERROR (for NDMP_TEMP_Restore_Test)
Jan 17, 2022 2:30:37 PM - Error ndmpagent (pid=20475) NDMP restore failed from path /HOSVM/HOSVM_MW_WCCAPPLVP_domain
Jan 17, 2022 2:30:37 PM - Info ndmpagent (pid=20475) done. status: 5
Jan 17, 2022 2:30:37 PM - end reading; read time: 0:10:06
The Filer OS version is ONTAP 9.3
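To see how long the restore sat idle before failing, the job details above can be parsed for timestamps and severity. This is a rough Python sketch, assuming every relevant line has the `<date> - <level> <process> (pid=N) <message>` layout seen in this post; lines with a different layout (such as the final "end reading" line) are simply skipped. The function names are illustrative only.

```python
import re
from datetime import datetime

# Layout taken from the job details pasted in this thread.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) - "
    r"(?P<level>\w+) (?P<proc>\w+) \(pid=(?P<pid>\d+)\) (?P<msg>.*)$"
)

def parse_job_details(text):
    """Parse job-detail lines into dicts; non-matching lines are ignored."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            entries.append({
                "time": datetime.strptime(m["ts"], "%b %d, %Y %I:%M:%S %p"),
                "level": m["level"],
                "process": m["proc"],
                "message": m["msg"],
            })
    return entries

def seconds_until_first_error(entries):
    """Seconds between the first parsed line and the first Error line."""
    errors = [e for e in entries if e["level"] == "Error"]
    if not entries or not errors:
        return None
    return (errors[0]["time"] - entries[0]["time"]).total_seconds()
```

In the job details above, the first error arrives roughly ten minutes after the restore starts. A delay that long before any error can be one hint that a connection is waiting and timing out, though that alone does not prove a firewall problem.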
Could you please check if there is a firewall (ports) in place for 'inbound' connections to Media Server?
For Remote NDMP: (Restore process)
1) First, the control connection is initiated from the media server to the filer on port 10000 (outbound from the media server).
2) Next, the data connection is initiated from the filer to the media server on a random port (inbound to the media server).
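The outbound half of this flow can be checked from the media server with a plain TCP connect. Below is a minimal Python sketch; the host name in the usage comment is a placeholder, and the helper name `can_connect` is made up for illustration. Note that this only tests the control direction (media server to filer). The inbound data connection has to be tested from the filer's side of the firewall.

```python
import socket

def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False

# Example, run on the media server (host name is hypothetical):
# can_connect("filer.example.com", 10000)
```

A True result only shows that the TCP handshake on the control port completes; it says nothing about NDMP authentication or about the data connection.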
I believe the ports are open from the media server to the filer on port 10000 (outbound).
We had also previously requested that the ports be opened bidirectionally on both sides.
So the I/O error could be caused by a blocked port, or by something else.
Jan 17, 2022 2:30:36 PM - Error ndmpagent (pid=20475) hogarage: DATA: Operation terminated: EVENT: I/O ERROR
Our storage admin has already logged a case with NetApp support, but so far we haven't received any response from them.
Thank you for sharing your experience and useful comments.
Many thanks, Ontapforrum, for your kind advice.
It was found that ports were blocked from the NetApp filer to the media server, which caused the restore process to fail.
As a workaround, opening ports 65000-65009 on the firewall worked in our environment.
God bless you and all other members for their comments
Thanks
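For anyone who wants to verify a similar firewall change before running a restore, one option is to open temporary listeners on the media server and try connecting to them from a host on the filer's network segment. The Python sketch below makes several assumptions: the 65000-65009 range is the one from this thread, the helper name `open_listeners` is made up, and it should only be run while no NDMP jobs are active so it does not compete with NetBackup for the ports.

```python
import socket

def open_listeners(ports, host="0.0.0.0"):
    """Bind a listening TCP socket on each port.

    Returns (sockets, busy_ports); busy_ports lists ports that could not
    be bound, for example because another process already uses them.
    """
    sockets, busy = [], []
    for port in ports:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            s.bind((host, port))
            s.listen(1)
            sockets.append(s)
        except OSError:
            s.close()
            busy.append(port)
    return sockets, busy

# Example on the media server (range taken from this thread):
# listeners, busy = open_listeners(range(65000, 65010))
# ... test from the filer side, then close every socket in listeners.
```

If a connection from the filer's network reaches one of these listeners, the firewall is passing traffic on that port in the inbound direction.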