I've configured a SnapCreator Job using the Domino Plugin in an iSCSI Environment.
On the nearly empty System (only a couple of Mailboxes) we had no issues taking the Snapshots.
Now that the Customer has added some more Data (around 570GB), the Quiesce Operation fails with:
ERROR: No valid response
ERROR: [scf-00054] Application unquiesce for plugin domino failed with exit code 1, Exiting!
When only selecting BOX-Files it runs through smoothly but there are only a handful of them.
With NTF or NSF Files it always comes back with the error above, leaving the watchdog hanging and the Service unresponsive.
During the Job we also see a massive number of Messages in the Domino Console like:
ti="0049C58D-C1257B34" sq="000012DE" THREAD [11D0:0006-0FA0] WAITING FOR WRITE LOCK ON RWSEM 0x4116 (@07F10D64) (R=0,W=3,WRITER=0FE0:11E4,OWNER=0FE0:11E4) FOR 30000 ms
ti="0049D146-C1257B34" sq="000012DF" THREAD [11D0:0006-0FA0] WAITING FOR WRITE LOCK ON RWSEM 0x4116 (@07F10D64) (R=0,W=3,WRITER=0FE0:11E4,OWNER=0FE0:11E4) FOR 30000 ms
ti="0049DD00-C1257B34" sq="000012E0" THREAD [11D0:0006-0FA0] WAITING FOR WRITE LOCK ON RWSEM 0x4116 (@07F10D64) (R=0,W=3,WRITER=0FE0:11E4,OWNER=0FE0:11E4) FOR 30000 ms
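In case it helps anyone digging through a console log full of these, here is a rough Python sketch that totals the wait time per semaphore and owning thread. The regular expression is written only against the three lines quoted above, so treat the field layout as an assumption; other Domino releases may format these messages differently. The names `WAIT_RE` and `summarize_waits` are made up for this example.

```python
import re
from collections import Counter

# Pattern derived from the console lines quoted in this thread only.
# Assumption: other lock messages follow the same field layout.
WAIT_RE = re.compile(
    r"THREAD \[(?P<waiter>[0-9A-F]+:[0-9A-F]+-[0-9A-F]+)\] "
    r"WAITING FOR (?P<mode>\w+) LOCK ON RWSEM (?P<sem>0x[0-9A-F]+) "
    r"\(@[0-9A-F]+\) "
    r"\(R=\d+,W=\d+,WRITER=[0-9A-F]+:[0-9A-F]+,"
    r"OWNER=(?P<owner>[0-9A-F]+:[0-9A-F]+)\) "
    r"FOR (?P<ms>\d+) ms"
)


def summarize_waits(console_text):
    """Return total reported wait in ms, keyed by (semaphore, owner)."""
    totals = Counter()
    # finditer also copes with several messages run together on one line.
    for match in WAIT_RE.finditer(console_text):
        totals[(match["sem"], match["owner"])] += int(match["ms"])
    return totals


if __name__ == "__main__":
    sample = (
        'ti="0049C58D-C1257B34" sq="000012DE" THREAD [11D0:0006-0FA0] '
        "WAITING FOR WRITE LOCK ON RWSEM 0x4116 (@07F10D64) "
        "(R=0,W=3,WRITER=0FE0:11E4,OWNER=0FE0:11E4) FOR 30000 ms"
    )
    for (sem, owner), total_ms in summarize_waits(sample).items():
        print(f"RWSEM {sem} held by {owner}: {total_ms} ms of reported waits")
```

If a single owner accounts for nearly all of the wait time, as in the lines above where every wait points at `0FE0:11E4`, that thread is the one worth identifying on the Domino side.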
Neither the Customer nor I have a clue what is going wrong here.
Stefan was able to start the Snap Creator Agent manually and redirect the output to a log file so that he got real time feedback on what the Domino APIs are doing.
Hopefully he can share the exact command that was used.
In this instance they were replicating all of their data to new servers and Domino spent much of the time writing out messages like
“Clearing DBIID” and “Assigning new DBIID”
This went on for the entire time that the agent was scheduled to run, because in this case it had to be done for each database.
Typically these actions happen very quickly, but when performed in conjunction with the other actions on the Domino server, they slowed things down to the point where Snap Creator couldn't complete within the allotted time.
We decided to let all major Replications and other intensive Tasks finish before integrating the Machine into Backup.
With that the Error was avoided.
Now we "just" have one Issue left: the Domino Job sometimes spontaneously leaves a snapcreator.exe process behind, which blocks the Agent. On the Server side the Job is reported as completed successfully, but every following Job fails to quiesce and unquiesce anything. This is quite annoying when it happens with an Archive Job and the Disk runs out of Space, especially on Systems where you don't have logon rights to kill the process and restart the Agent.
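For Systems where some monitoring account is allowed to run commands, a small check like the following might at least flag the leftover process before the next Job starts. This is only a sketch in Python: it parses the CSV output of the Windows `tasklist /FO CSV /NH` command and reports the PIDs of any `snapcreator.exe` it finds. The function names are invented for this example, and it only detects the process; killing it and restarting the Agent still needs the appropriate rights.

```python
import csv
import io
import subprocess


def find_pids(tasklist_csv, image_name="snapcreator.exe"):
    """Return PIDs of processes named image_name in tasklist CSV output."""
    pids = []
    for row in csv.reader(io.StringIO(tasklist_csv)):
        # Columns: image name, PID, session name, session number, memory.
        if len(row) >= 2 and row[0].lower() == image_name.lower():
            pids.append(int(row[1]))
    return pids


def current_tasklist():
    """Run tasklist on the local Windows machine and return its CSV output."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    leftovers = find_pids(current_tasklist())
    if leftovers:
        print("snapcreator.exe still running, PIDs:", leftovers)
    else:
        print("no leftover snapcreator.exe found")
```

Run between Jobs, any PID reported here would be a candidate for the stuck process described above. It cannot tell a leftover from a Job that is still legitimately running, so it should only be scheduled when no Job is expected to be active.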