Data Backup and Recovery

Domino Plugin "No valid response"

SVOLLRAT1
9,240 Views

Hi Everybody

I've configured a SnapCreator Job using the Domino Plugin in an iSCSI Environment.

On the empty (only a couple of Mailboxes) System we had no issues doing the Snapshots.

Now that the Customer has added more Data (around 570GB), the Quiesce Operation fails with:

ERROR: No valid response

ERROR: [scf-00054] Application unquiesce for plugin domino failed with exit code 1, Exiting!

When only selecting BOX-Files it runs through smoothly but there are only a handful of them.

With NTF or NSF Files it always comes back with the error above, keeps the watchdog hanging and the Service unresponsive.

During the Job we also see a massive number of Messages in the Domino Console like:

ti="0049C58D-C1257B34" sq="000012DE" THREAD [11D0:0006-0FA0] WAITING FOR WRITE LOCK ON RWSEM 0x4116  (@07F10D64) (R=0,W=3,WRITER=0FE0:11E4,OWNER=0FE0:11E4) FOR 30000 ms
ti="0049D146-C1257B34" sq="000012DF" THREAD [11D0:0006-0FA0] WAITING FOR WRITE LOCK ON RWSEM 0x4116  (@07F10D64) (R=0,W=3,WRITER=0FE0:11E4,OWNER=0FE0:11E4) FOR 30000 ms

ti="0049DD00-C1257B34" sq="000012E0" THREAD [11D0:0006-0FA0] WAITING FOR WRITE LOCK ON RWSEM 0x4116  (@07F10D64) (R=0,W=3,WRITER=0FE0:11E4,OWNER=0FE0:11E4) FOR 30000 ms
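Console lines like these can be tallied to see which owner keeps holding the semaphore. The following is a minimal Python sketch, assuming the line layout is exactly as quoted above; the READ alternative in the pattern and the meaning attached to the OWNER field are assumptions, not documented Domino behaviour.

```python
import re
from collections import Counter

# Pattern modelled on the console lines quoted in this thread.
# Assumption: READ waits use the same layout as the WRITE waits shown.
WAIT_RE = re.compile(
    r"THREAD \[(?P<thread>[^\]]+)\] WAITING FOR (?P<mode>READ|WRITE) LOCK ON RWSEM "
    r"(?P<sem>0x[0-9A-Fa-f]+)\s+\(@[0-9A-Fa-f]+\) "
    r"\((?P<state>[^)]*)\) FOR (?P<ms>\d+) ms"
)

def summarize_waits(lines):
    """Count lock waits per (semaphore, owner) pair."""
    counts = Counter()
    for line in lines:
        match = WAIT_RE.search(line)
        if match is None:
            continue  # not a semaphore wait message
        # Turn "R=0,W=3,WRITER=...,OWNER=..." into a dict.
        state = dict(item.split("=", 1) for item in match.group("state").split(","))
        counts[(match.group("sem"), state.get("OWNER", "?"))] += 1
    return counts

sample = [
    'ti="0049C58D-C1257B34" sq="000012DE" THREAD [11D0:0006-0FA0] WAITING FOR WRITE LOCK ON RWSEM 0x4116  (@07F10D64) (R=0,W=3,WRITER=0FE0:11E4,OWNER=0FE0:11E4) FOR 30000 ms',
    'ti="0049D146-C1257B34" sq="000012DF" THREAD [11D0:0006-0FA0] WAITING FOR WRITE LOCK ON RWSEM 0x4116  (@07F10D64) (R=0,W=3,WRITER=0FE0:11E4,OWNER=0FE0:11E4) FOR 30000 ms',
]
counts = summarize_waits(sample)
for (sem, owner), n in counts.items():
    print(f"{sem} held by {owner}: {n} waits")
```

If a single owner dominates the tally, that process or thread ID is the one to look for on the Domino server while the job runs.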


Neither the Customer nor I have a clue what is going wrong here.

Any suggestions are welcome.

BR


Stefan Vollrath

ClientLogs, DebugLog and Config

12 REPLIES

spinks
9,186 Views

Stefan,

What version of Snap Creator are you using?

Is it possible to get a copy of a debug log file so we can see additional information?

If agent logging is enabled the debug log from the agent would also be fantastic to have.

My best guess with the information that you have provided is we are hitting the agent timeout period and the Domino plug-in is exiting before the quiesce operation is completed.

I would need to see the log files to be sure.

Regardless this should not cause write locks on the Domino environment...

Thanks!

John

SVOLLRAT1
9,186 Views

Hi John

We use SC 3.6.

The Agent Timeout is already set very high, to 3600 seconds.

Since the Job fails well before that, I assume something else is going wrong.

Last time it failed after roughly 10min.

spinks
9,186 Views

Stefan,

Definitely need to see the logs...

An scdump would be great if you don't mind sending one, but at minimum I need to see a debug log from a failed instance.

Even the scdump doesn't contain the information from the agent, so if you can send an agent debug log as well it would be very helpful.

Can you email them to: spinks at netapp dot com?

Thanks!

John

sivar
9,186 Views

DEBUG: Calling dominocore::quiesce with arguments(D:\Lotus\Domino\notes.ini,E:\Lotus\Domino\Data,D:\ChangeInfo,2 )

Your Changeinfo is also on D:\ drive?

I believe we need to keep the Changeinfo outside of Domino install location.

John Spinks can validate this.

What happens if you designate a separate drive for your ChangeInfo location?

Are you able to take a snapshot of the D:\ and E:\ drives from SnapDrive (independent of Snap Creator)? How long does that take?
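The drive check above can be done straight from the debug line. This is a small Python sketch, assuming the first three arguments are the notes.ini path, the data directory and the ChangeInfo directory, in that order; that order is inferred from the path names in the quoted line and is not confirmed by the plug-in documentation. The function name is made up for illustration.

```python
import ntpath
import re

# Matches the argument list of the quoted DEBUG line.
CALL_RE = re.compile(r"dominocore::quiesce with arguments\((?P<args>.*?)\)")

def changeinfo_drive_check(debug_line):
    """Report whether ChangeInfo shares a drive with the install or data path."""
    match = CALL_RE.search(debug_line)
    if match is None:
        raise ValueError("no dominocore::quiesce call found in line")
    # Assumed argument order: notes.ini, data directory, ChangeInfo directory.
    # Paths that themselves contain commas would break this simple split.
    parts = [arg.strip() for arg in match.group("args").split(",")]
    notes_ini, data_dir, changeinfo = parts[:3]

    def drive(path):
        # ntpath handles Windows paths on any platform.
        return ntpath.splitdrive(path)[0].upper()

    return {
        "changeinfo_drive": drive(changeinfo),
        "same_as_install": drive(changeinfo) == drive(notes_ini),
        "same_as_data": drive(changeinfo) == drive(data_dir),
    }

line = r"DEBUG: Calling dominocore::quiesce with arguments(D:\Lotus\Domino\notes.ini,E:\Lotus\Domino\Data,D:\ChangeInfo,2 )"
result = changeinfo_drive_check(line)
print(result)
```

For the line quoted in this thread, ChangeInfo shares the D: drive with the Domino install but not with the data directory.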

SVOLLRAT1
9,186 Views

Hi

Drive D only contains the DominoBinaries and the ChangeInfo.

All DominoData is on Drive E and the Logs are located on Drive G.

Additionally there is a DominoDAOS on O Drive.

SnapDrive Snapshot of Data, Logs and DAOS is done within seconds.

Kind Regards

Stefan Vollrath


gert_leunis
9,186 Views

Hi Stefan,

Any news on this?

Kind regards,

Gert

spinks
9,186 Views

Stefan was able to start the Snap Creator Agent manually and redirect the output to a log file so that he got real time feedback on what the Domino APIs are doing.

Hopefully he can share the exact command that was used.

In this instance they were replicating all of their data to new servers and Domino spent much of the time writing out messages like

“Clearing DBIID” and “Assigning new DBIID”

This went on for the entire time that the agent was scheduled to run - In this case this had to be done for each database.

Typically these actions happen very quickly, but when performed in conjunction with the other actions on the Domino server it slowed things down to where Snap Creator couldn't complete within the allotted time.

Hopefully this helps.

John


gert_leunis
9,186 Views

Thanks.

Gert

SVOLLRAT1
9,186 Views

Hi Gert

We decided to let all major Replications and other intensive Tasks run before integrating the Machine into Backup.

With that the Error was avoided.

Now we "just" have one Issue left: the Domino Job sometimes spontaneously leaves a snapcreator.exe process behind, blocking the Agent. On the Server side the Job is reported as completed successfully, but every following Job fails to quiesce and unquiesce anything. Quite annoying when this happens with an Archive Job and the Disk runs out of Space, especially on Systems where you don't have logon rights to kill the process and restart the Agent.

BR Stefan

sivar
8,439 Views

Hello Stefan,

I can take it to the engineering and report this issue to investigate further.

Could you please send a small write up with your config / logs to sivar @ netapp.com

I can submit a defect request and troubleshoot this further with our dev/qa team.

Thanks,
Siva Ramanathan

SnapCreator Community Moderator

SVOLLRAT1
8,439 Views

Hi Siva

The "No valid response" is in fact a "cannot connect to client", as I had to find out the hard way.

It very often happens when either Scheduled Tasks try to start concurrently or when an old Job hangs, leaving a snapcreator.exe running and the Agent blocked (the Windows Agent is single-threaded).

The SC Server does not distinguish between these conditions. The Message would need to be adjusted to fit the actual Situation.

The Situation got a bit better in 4.0p1: at least Jobs started via the Scheduler now wait for a previous one to free the Agent. Blocked Agents can still cause that misleading response.
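A pre-job check for a leftover snapcreator.exe could look roughly like the Python sketch below. It relies on the standard Windows `tasklist` command with CSV output; the function names and the sample text are made up for illustration, and nothing here is part of Snap Creator itself.

```python
import csv
import io
import subprocess

def parse_tasklist_csv(text, image_name="snapcreator.exe"):
    """Return the PIDs of matching processes from `tasklist /FO CSV /NH` output."""
    pids = []
    for row in csv.reader(io.StringIO(text)):
        # Data rows are: image name, PID, session name, session number, memory.
        # The "INFO: No tasks are running..." message has one field and is skipped.
        if len(row) >= 2 and row[0].lower() == image_name.lower():
            pids.append(int(row[1]))
    return pids

def find_leftover_processes(image_name="snapcreator.exe"):
    """Windows only: query tasklist for processes with the given image name."""
    completed = subprocess.run(
        ["tasklist", "/FI", f"IMAGENAME eq {image_name}", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    )
    return parse_tasklist_csv(completed.stdout, image_name)

# Made-up sample in tasklist's CSV layout, so the parser can be tried anywhere.
sample = '"snapcreator.exe","4321","Services","0","12,345 K"\n'
print(parse_tasklist_csv(sample))
```

On the Agent host, `find_leftover_processes()` returning a non-empty list before a scheduled job starts would point to the blocked-Agent condition described above rather than a genuine connection problem.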

Kind Regards

Stefan Vollrath

T-Systems Schweiz AG

Storage & Backup Operations

Stefan Vollrath

Storage Engineer

Murgenthalstrasse 12, CH-4901 Langenthal

+41 (0) 78 645 1076 (phone)

E-Mail: stefan.vollrath@t-systems.com

http://www.t-systems.ch <http://www.t-systems.ch/>


sivar
8,439 Views

Ok. Thanks much Stefan for the explanation.

The 4.1 agent is being reworked significantly, and I will pass this on to the Dev team as a concern.

Have a great day.


Regards,

Siva Ramanathan

SnapCreator Community Moderator
