Data Backup and Recovery
Hi Everybody
I've configured a SnapCreator Job using the Domino Plugin in an iSCSI Environment.
On the empty (only a couple of Mailboxes) System we had no issues doing the Snapshots.
Now that the Customer has added more Data (around 570GB), the Quiesce Operation fails with:
ERROR: No valid response
ERROR: [scf-00054] Application unquiesce for plugin domino failed with exit code 1, Exiting!
When only selecting BOX-Files it runs through smoothly but there are only a handful of them.
With NTF or NSF Files it always comes back with the error above, keeps the watchdog hanging and the Service unresponsive.
During the Job we also encounter massive Load of Messages in DominoConsole like:
ti="0049C58D-C1257B34" sq="000012DE" THREAD [11D0:0006-0FA0] WAITING FOR WRITE LOCK ON RWSEM 0x4116 (@07F10D64) (R=0,W=3,WRITER=0FE0:11E4,OWNER=0FE0:11E4) FOR 30000 ms
ti="0049D146-C1257B34" sq="000012DF" THREAD [11D0:0006-0FA0] WAITING FOR WRITE LOCK ON RWSEM 0x4116 (@07F10D64) (R=0,W=3,WRITER=0FE0:11E4,OWNER=0FE0:11E4) FOR 30000 ms
ti="0049DD00-C1257B34" sq="000012E0" THREAD [11D0:0006-0FA0] WAITING FOR WRITE LOCK ON RWSEM 0x4116 (@07F10D64) (R=0,W=3,WRITER=0FE0:11E4,OWNER=0FE0:11E4) FOR 30000 ms
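To see at a glance which semaphore and which owner the waits pile up on, the console output can be summarised with a small script. This is only a sketch: the pattern is derived from the three lines quoted above, and the function name `summarize_lock_waits` is made up for illustration. Other Domino console formats may not match.

```python
import re

# Pattern derived from the console lines quoted above; an assumption,
# not an official description of the Domino console format.
LOCK_WAIT = re.compile(
    r"WAITING FOR (?P<mode>\w+) LOCK ON RWSEM (?P<sem>0x[0-9A-Fa-f]+) "
    r".*?OWNER=(?P<owner>[0-9A-Fa-f:]+)\) FOR (?P<ms>\d+) ms"
)


def summarize_lock_waits(lines):
    """Return {(semaphore, owner): (number of waits, total wait in ms)}."""
    totals = {}
    for line in lines:
        match = LOCK_WAIT.search(line)
        if match is None:
            continue  # not a lock-wait message
        key = (match["sem"], match["owner"])
        count, waited = totals.get(key, (0, 0))
        totals[key] = (count + 1, waited + int(match["ms"]))
    return totals
```

Feeding it a saved console log (for example `summarize_lock_waits(open(path))`, where `path` is wherever the console output was saved) shows whether all waits point at the same OWNER, as they do in the excerpt above.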
Neither the Customer nor I have a clue what goes wrong here.
Any suggestions are welcome.
BR
Stefan Vollrath
ClientLogs, DebugLog and Config
Stefan,
What version of Snap Creator are you using?
Is it possible to get a copy of a debug log file so we can see additional information?
If agent logging is enabled the debug log from the agent would also be fantastic to have.
My best guess with the information that you have provided is we are hitting the agent timeout period and the Domino plug-in is exiting before the quiesce operation is completed.
I would need to see the log files to be sure.
Regardless this should not cause write locks on the Domino environment...
Thanks!
John
Hi John
We use SC 3.6.
Agent Timeout is already set very high to 3600Sec.
Since the Job dumps before that I assume something else goes wrong.
Last time it failed after roughly 10min.
Stefan,
Definitely need to see the logs...
An scdump would be great if you don't mind sending one, but at minimum I need to see a debug log from a failed instance.
Even the scdump doesn't contain the information from the agent, so if you can send an agent debug log as well it would be very helpful.
Can you email them to: spinks at netapp dot com?
Thanks!
John
DEBUG: Calling dominocore::quiesce with arguments(D:\Lotus\Domino\notes.ini,E:\Lotus\Domino\Data,D:\ChangeInfo,2 )
Your Changeinfo is also on D:\ drive?
I believe we need to keep the Changeinfo outside of Domino install location.
John Spinks can validate this.
What happens if you designate a separate drive for your ChangeInfo location?
Are you able to take a snapshot of the D:\ and E:\ drives from SnapDrive (independent of Snap Creator)? How long does that take?
Hi
Drive D only contains the DominoBinaries and the ChangeInfo.
All DominoData is on Drive E and the Logs are located on Drive G.
Additionally there is a DominoDAOS on O Drive.
SnapDrive Snapshot of Data, Logs and DAOS is done within seconds.
Kind Regards
Stefan Vollrath
Hi Stefan,
Any news on this?
Kind regards,
Gert
Stefan was able to start the Snap Creator Agent manually and redirect the output to a log file so that he got real time feedback on what the Domino APIs are doing.
Hopefully he can share the exact command that was used.
In this instance they were replicating all of their data to new servers and Domino spent much of the time writing out messages like
“Clearing DBIID” and “Assigning new DBIID”
This went on for the entire time that the agent was scheduled to run - In this case this had to be done for each database.
Typically these actions happen very quickly, but when performed in conjunction with the other actions on the Domino server it slowed things down to where Snap Creator couldn't complete within the allotted time.
Hopefully this helps.
John
Thanks.
Gert
Hi Gert
We decided to let all major Replications and other intensive Tasks finish before integrating the Machine into Backup.
With that, the Error was avoided.
Now we "just" have one Issue left: the Domino Job sometimes spontaneously leaves a snapcreator.exe process behind, blocking the Agent. On the Server side the Job is reported as completed successfully, but every following Job fails to quiesce and unquiesce anything. This is quite annoying when it happens with an Archive job and the Disk runs out of Space, especially on Systems where you don't have logon rights to kill the process and restart the Agent.
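A leftover process of this kind can be spotted before the next Job starts by checking the Windows process list. The following is a rough sketch, not part of Snap Creator: it calls the standard Windows `tasklist` command and parses its CSV output. The function names are invented for this example, and it assumes the account running it is allowed to list processes.

```python
import csv
import io
import subprocess


def parse_tasklist_csv(text, image="snapcreator.exe"):
    """Extract PIDs for `image` from `tasklist /FO CSV /NH` output."""
    pids = []
    for row in csv.reader(io.StringIO(text)):
        # Rows look like: "image name","PID","session name","session#","mem usage".
        # The "INFO: No tasks are running..." line has one field and is skipped.
        if len(row) >= 2 and row[0].lower() == image.lower() and row[1].isdigit():
            pids.append(int(row[1]))
    return pids


def find_leftover_pids(image="snapcreator.exe"):
    """Ask Windows for running processes with the given image name."""
    result = subprocess.run(
        ["tasklist", "/FI", f"IMAGENAME eq {image}", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    )
    return parse_tasklist_csv(result.stdout, image)
```

If `find_leftover_pids()` returns a non-empty list while no Job is supposed to be running, the Agent is probably blocked and someone with the necessary rights needs to end the process and restart the Agent.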
BR Stefan
Hello Stefan,
I can take it to the engineering and report this issue to investigate further.
Could you please send a small write up with your config / logs to sivar @ netapp.com
I can submit a defect request and troubleshoot this further with our dev/qa team.
Thanks,
Siva Ramanathan
SnapCreator Community Moderator
Hi Siva
The "No valid response" is in fact a "cannot connect to client", as I had to find out the hard way.
It happens very often when either Scheduled Tasks try to start concurrently or an old Job hangs, leaving a snapcreator.exe running and the Agent blocked (the Windows Agent is single-threaded).
The SC Server does not distinguish between these conditions. The message should be adjusted to reflect the actual situation.
The situation improved a bit in 4.0p1: Jobs started via the Scheduler now at least wait for a previous one to free the Agent. Blocked Agents can still cause that incorrect response.
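For Jobs started outside the Snap Creator Scheduler (for example from Windows Scheduled Tasks), concurrent starts against the single-threaded Agent can be avoided with a simple lock file around the Job call. This is a generic sketch and not a Snap Creator feature: the name `single_job_lock` and the lock path are assumptions, and a lock file left behind by a crashed wrapper would itself have to be removed by hand.

```python
import os
from contextlib import contextmanager


@contextmanager
def single_job_lock(path):
    """Hold an exclusive lock file for the duration of one Job."""
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeError(f"another Job already holds {path}") from None
    try:
        os.write(fd, str(os.getpid()).encode())  # record the holder for debugging
        yield
    finally:
        os.close(fd)
        os.remove(path)
```

A wrapper script would start the Job inside `with single_job_lock(lock_path):` and either retry later or log a clear "Agent busy" message when the `RuntimeError` is raised, instead of running into "No valid response".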
Kind Regards
Stefan Vollrath
T-Systems Schweiz AG
Storage & Backup Operations
OK. Thanks very much for the explanation, Stefan.
The 4.1 agent is being reworked significantly, and I will pass this on to the Dev team as a concern.
Have a great day.
Regards,
Siva Ramanathan
SnapCreator Community Moderator