2014-03-10 03:31 AM - edited 2015-12-18 12:25 AM
We are trying to implement SC 4.1 in a clustered ONTAP environment using the VIBE plugin.
First we had issues with SnapVault updates taking forever to finish, and the agent and/or VIBE timed out. I don't know why the scripts wait for the SnapVault operation to finish. It seems a waste of time: once the SnapMirror label has been set, the controllers do the rest...
We have since disabled the SnapVault update, and we just let the primary backup label the snapshots as "daily". We then created a protection policy and schedule on the controllers so that SnapVault updates run together with a SnapMirror update (this is a cascading setup).
Right now we are trying to get the primary backup to work, but every time it fails "due to agent timeout", it leaves a snapshot on some, if not all, of the machines involved in the backup. The next time the script runs, it starts by removing these snapshots from the machines, but this is done one machine at a time and takes forever, so the script times out again... basically a never-ending story. I have looked at the VIBE documentation, but have not found any option that allows parallel deletion of snapshots...
Has anyone had the same issue and solved it somehow? I was thinking of a kind of pre-script that removes all snapshots from the machines on a given datastore...
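To make the pre-script idea concrete, here is a minimal sketch in Python of removing leftover snapshots from several machines at once instead of one at a time. It is only an outline under assumptions: `remove_vm_snapshots` is a hypothetical callable that you would have to write yourself against your own vSphere tooling, and neither it nor `remove_snapshots_in_parallel` is part of SC or VIBE.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def remove_snapshots_in_parallel(vm_names, remove_vm_snapshots, max_workers=8):
    """Call remove_vm_snapshots(vm_name) for every VM, several at a time.

    remove_vm_snapshots is a hypothetical, caller-supplied function that
    removes the leftover snapshots of one VM and raises on failure.
    Returns (succeeded, failed): a sorted list of VM names and a dict
    mapping each failed VM name to its error text.
    """
    succeeded, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit one removal task per VM and remember which VM it belongs to.
        futures = {pool.submit(remove_vm_snapshots, name): name
                   for name in vm_names}
        for future in as_completed(futures):
            name = futures[future]
            try:
                future.result()  # re-raises any error from the worker
                succeeded.append(name)
            except Exception as exc:
                # One failing VM should not stop the others.
                failed[name] = str(exc)
    return sorted(succeeded), failed
```

The worker count is worth keeping low at first, since many simultaneous snapshot consolidations on one datastore can be heavy on storage I/O.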
2014-03-10 08:07 AM
Is it possible to upload an scdump output, so that we can test this in the lab?
You may also email it to sivar at netapp.com
I will review the logs, timeouts, etc. and get back to you with feedback.
2014-03-11 01:40 AM
We have a similar issue. In our environment, SC sometimes gives us a VIBE timeout error, and then the agent breaks. After the timeout, no job works anymore and we have to restart the agent.
I'm talking about 60 backup jobs, every one of them a VIBE job. If one of these jobs fails with a timeout, all jobs scheduled after it fail as well. Before the SC agent was rebuilt in Java (before SC 4.1), we didn't have this issue.
We have opened a P2 case, but NetApp hasn't been able to help us so far.