Data Backup and Recovery

snap creator 4.1.1 failing to quiesce domino

stuartwalton82
11,104 Views

Hi all, looking for a little help wih getting snap creator working on one of our domino servers.


We're using the exact same configuration, snap creator version, java version, snap drive version as on other domino servers where snap creator is working.


Software versions on server

Java 7 Update 79(64 bit)
Snap Creator Framework 4.1.1
Snapdrive 7.1.1.6640

Domino Product Version 9.0.14.15158


Attached is a log file of the backup I manually started last night. Any help appreciated.


Thanks

13 REPLIES 13

stuartwalton82
11,054 Views

I have already tried upgrading to snapcreator 4.1.2 to no avail.

 

sivar
11,024 Views

It looks like your domino backup (quiesce) takes longer than 5 minutes as observed from the logs.

 

You may edit your config file and set 

SC_AGENT_TIMEOUT=3600

 

The default value is 5 minutes or 300 seconds.

Here we can set to an hour (3600 seconds) and try a backup again

 

 

stuartwalton82
11,022 Views

Hi Sivar, thanks for the reply however I have already adjusted that threshold and this made no difference.

 

Here's what I have it set at

 

#     Agent Options                                                                                                    #
########################################################################################################################
SC_AGENT_UNQUIESCE_TIMEOUT=305
SC_AGENT_WATCHDOG_ENABLE=N
SC_AGENT=domino01.domain.co.uk:9090
SC_AGENT_LOG_ENABLE=Y
SC_AGENT_TIMEOUT=3600

sivar
11,014 Views

I reviewed the logs again.

########## Application Quiesce ##########
[2015-12-15 21:00:26,733] INFO: Application Quiesce for plugin : domino
[2015-12-15 21:05:45,973] ERROR: [domino01.mearsgroup.co.uk:9090 (4.1.1.1)] Operation failed. Reason: The watchdog killed the running operation. Verify that plugins/wrapper/wrapper binary has execute permission set. Increase the WRAPPER_TIMEOUT_IN_MSEC value in agent.properties if operations require more time to complete.
[2015-12-15 21:05:45,973] ERROR: [domino01.mearsgroup.co.uk:9090(4.1.1.1)] SCF-00038: Application quiesce for plug-in [domino] failed with exit code [102], continuing with backup.
[2015-12-15 21:05:46,093] INFO: Application quiesce result is empty  from plugin domino. skipping config update operation
[2015-12-15 21:05:46,093] INFO: Application Quiesce for plugin : domino finished successfully

 

Looks like exactly after 5 minutes 19 seconds a watchdog operation has killed the running operation. (backup)

 

But, interestingly your config file does not even have watchdog enabled.

 

I am thinking we can try the below

SC_AGENT_WATCHDOG_ENABLE=Y

SC_AGENT_UNQUIESCE_TIMEOUT=3605

 

Thanks,
Siva Ramanathan

 

 

sivar
11,013 Views

correcting my own statement

 

as per

https://library.netapp.com/ecmdocs/ECMP1650312/html/GUID-382C4683-8746-406C-82C3-DF3DA782DCD2.html

 

Note: SC_AGENT_WATCHDOG_ENABLE is deprecated for use with Snap Creator Agent 4.1, and applicable only for use with Agent 4.0.

 

Hence, you may just set

SC_AGENT_UNQUIESCE_TIMEOUT=3605

stuartwalton82
11,019 Views

Not sure how relevant this is but I noticed the wrapper.exe process does not run during the backup process

stuartwalton82
11,005 Views

Hi Sava, unquiesce is not the issue unfortunately it is the quiesce stage.

 

Here's an extract from another of our domino servers having the same issue

 

2015-12-16 13:46:38,332] INFO: STORAGE-02091: Listing Snapshot copies on volume [domino02_maildb] finished successfully.
########## Storage discovery finished successfully ! ##########
########## Pre Application Quiesce commands ##########
[2015-12-16 13:46:38,348] INFO: Pre application quiesce commands are not defined
[2015-12-16 13:46:38,349] INFO: Pre Application Quiesce commands finished successfully
[2015-12-16 13:46:38,356] INFO: Validate volume is not enabled skipping validate volume task.
########## Application Quiesce ##########
[2015-12-16 13:46:38,361] INFO: Application Quiesce for plugin : domino
[2015-12-16 13:46:44,243] ERROR: [domino02.domain.co.uk:9090 (4.1.1.1)] Operation failed. Reason: The watchdog killed the running operation. Verify that plugins/wrapper/wrapper binary has execute permission set. Increase the WRAPPER_TIMEOUT_IN_MSEC value in agent.properties if operations require more time to complete.
[2015-12-16 13:46:44,243] ERROR: [domino02.domain.co.uk:9090(4.1.1.1)] SCF-00038: Application quiesce for plug-in [domino] failed with exit code [102], continuing with backup.
[2015-12-16 13:46:44,252] INFO: Application quiesce result is empty  from plugin domino. skipping config update operation
[2015-12-16 13:46:44,252] INFO: Application Quiesce for plugin : domino finished successfully
########## Application Quiesce finished successfully ##########
########## POST APPLICATION QUIESCE COMMANDS ##########
[2015-12-16 13:46:44,254] INFO: Post application quiesce commands are not defined
########## POST APPLICATION QUIESCE COMMANDS FINISHED SUCCESSFULLY  ##########
########## Pre NTAP commands ##########
[2015-12-16 13:46:44,261] INFO: Pre NTAP commands are not defined
[2015-12-16 13:46:44,262] INFO: Pre NTAP commands finished successfully
########## Generating Info ASUP on fernsan04 ##########
[2015-12-16 13:46:44,375] INFO: STORAGE-01001: Creating AutoSupport message with event id [0], category [Backup Started], description [INFO: Snap Creator Framework 4.1.1 Backup for domino02 ACTION: backup POLICY: daily Plugin: domino [Supported] Volumes: domino02_snapinfo,domino02_maildb, Started], level [INFO], hostname [domino02 [domino02.domain.co.uk:9090]]
[2015-12-16 13:46:44,375] DEBUG: <ems-autosupport-log>
    <computer-name>domino02 [domino02.domain.co.uk:9090]</computer-name>
    <event-id>0</event-id>
    <event-source>SNAPCREATOR</event-source>
    <app-version>Snap Creator Framework 4.1.1</app-version>
    <category>Backup Started</category>
    <event-description>INFO: Snap Creator Framework 4.1.1 Backup for domino02 ACTION: backup POLICY: daily Plugin: domino [Supported] Volumes: domino02_snapinfo,domino02_maildb, Started</event-description>
    <log-level>6</log-level>
    <auto-support>false</auto-support>
</ems-autosupport-log>

sivar
10,995 Views

During my internal bug search on this error, I did see a bug that seems to have been fixed with an upcoming release 5.0

 

Meanwhile, please share me the values below from your etc/agent.properties file (this file is in the agent install folder on the domino server)

OPERATION_TIMEOUT_IN_MSEC

WRAPPER_TIMEOUT_IN_MSEC

 

Thanks,
Siva Ramanathan

 

sivar
10,058 Views

SC_AGENT_UNQUIESCE_TIMEOUT is infact for the quiesce operation.

Basically telling the plugin (watchdog) to wait this long for the quiesce operation and if the quiesce is taking longer, then the watchdog takes the database out of hotbackup mode.

 

I see your current value is 5 minutes SC_AGENT_UNQUIESCE_TIMEOUT

So, pelase try increasing that value to 1 hour as suggested earlier, and let me know the results.

 

I am enquiring the developers/TME about 5.0 release date.

 

I shall update as soon as I learn more.

 

Thanks,

Siva Ramanathan

stuartwalton82
10,994 Views

Thanks Siva, when is 5.0 due for release ?

 

Sure, here's the whole log file

 

DEFAULT_PORT=9090
KEYSTORE_FILE=etc/keystore.jks
KEYSTORE_PASS=snapcreator
CONTEXT_LIFETIME_IN_MSEC=1800000
CONTEXT_CLEANUP_FREQUENCY_IN_MSEC=30000
STORAGE_RECEIVE_TIMEOUT_IN_MSEC=600000
STORAGE_CONNECTION_TIMEOUT_IN_MSEC=60000
CATALOG_RECEIVE_TIMEOUT_IN_MSEC=60000
CATALOG_CONNECTION_TIMEOUT_IN_MSEC=30000
WATCHDOG_LISTENER_MAX_DELAY_IN_MSEC=10000
OPERATION_TIMEOUT_IN_MSEC=3600000
AUTHORIZED_HOSTS=*
THREAD_POOL_SIZE=8
RESULT_MESSAGES_LOG_LEVEL=INFO
WRAPPER_TIMEOUT_IN_MSEC=3600000
PRINT_CONFIG_VALUES_IN_LOG=false

stuartwalton82
10,056 Views

Hi Siva, we adjusted earlier and reran the backup. here is the log file

 

[2015-12-16 13:46:35,901] INFO: Validating policy: daily finished successfully
########## Detecting Data ONTAP mode for sourcefiler01 ##########
[2015-12-16 13:46:36,679] INFO: STORAGE-03031: Getting system version details of [destfiler01]
[2015-12-16 13:46:36,679] INFO: STORAGE-03032: Getting system version details of [destfiler01] finished successfully.
[2015-12-16 13:46:37,118] INFO: STORAGE-03031: Getting system version details of [sourcefiler01]
[2015-12-16 13:46:37,118] INFO: STORAGE-03032: Getting system version details of [sourcefiler01] finished successfully.
########## Agent validation ##########
[2015-12-16 13:46:37,300] INFO: Agent validation completed successfully for agent domino01.domain.co.uk:9090
########## Plugin validation ##########
[2015-12-16 13:46:37,337] INFO: Plugin validation completed successfully for plugin domino
[2015-12-16 13:46:37,349] INFO: Application auto discovery is not enabled skipping .
########## Running storage discovery ##########
[2015-12-16 13:46:37,735] INFO: STORAGE-02073: Retrieving SnapMirror relationships
[2015-12-16 13:46:37,735] INFO: STORAGE-02074: Retrieving SnapMirror relationships on controller [sourcefiler01] finished successfully
[2015-12-16 13:46:38,065] INFO: STORAGE-02070: Retrieving SnapMirror status
[2015-12-16 13:46:38,065] INFO: STORAGE-02071: Retrieving SnapMirror status on controller [destfiler01] successful
[2015-12-16 13:46:38,204] INFO: STORAGE-02090: Listing Snapshot copies on volume [domino01_snapinfo].
[2015-12-16 13:46:38,204] INFO: STORAGE-02091: Listing Snapshot copies on volume [domino01_snapinfo] finished successfully.
[2015-12-16 13:46:38,332] INFO: STORAGE-02090: Listing Snapshot copies on volume [domino01_maildb].
[2015-12-16 13:46:38,332] INFO: STORAGE-02091: Listing Snapshot copies on volume [domino01_maildb] finished successfully.
########## Storage discovery finished successfully ! ##########
########## Pre Application Quiesce commands ##########
[2015-12-16 13:46:38,348] INFO: Pre application quiesce commands are not defined
[2015-12-16 13:46:38,349] INFO: Pre Application Quiesce commands finished successfully
[2015-12-16 13:46:38,355] INFO: Validate volume is not enabled skipping validate volume task.
########## Application Quiesce ##########
[2015-12-16 13:46:38,361] INFO: Application Quiesce for plugin : domino
[2015-12-16 13:46:44,243] ERROR: [domino01.domain.co.uk:9090 (4.1.1.1)] Operation failed. Reason: The watchdog killed the running operation. Verify that plugins/wrapper/wrapper binary has execute permission set. Increase the WRAPPER_TIMEOUT_IN_MSEC value in agent.properties if operations require more time to complete.
[2015-12-16 13:46:44,243] ERROR: [domino01.domain.co.uk:9090(4.1.1.1)] SCF-00038: Application quiesce for plug-in [domino] failed with exit code [102], continuing with backup.
[2015-12-16 13:46:44,252] INFO: Application quiesce result is empty  from plugin domino. skipping config update operation
[2015-12-16 13:46:44,252] INFO: Application Quiesce for plugin : domino finished successfully
########## Application Quiesce finished successfully ##########
########## POST APPLICATION QUIESCE COMMANDS ##########
[2015-12-16 13:46:44,254] INFO: Post application quiesce commands are not defined
########## POST APPLICATION QUIESCE COMMANDS FINISHED SUCCESSFULLY  ##########
########## Pre NTAP commands ##########
[2015-12-16 13:46:44,261] INFO: Pre NTAP commands are not defined
[2015-12-16 13:46:44,262] INFO: Pre NTAP commands finished successfully
########## Generating Info ASUP on sourcefiler01 ##########
[2015-12-16 13:46:44,375] INFO: STORAGE-01001: Creating AutoSupport message with event id [0], category [Backup Started], description [INFO: Snap Creator Framework 4.1.1 Backup for domino01 ACTION: backup POLICY: daily Plugin: domino [Supported] Volumes: domino01_snapinfo,domino01_maildb, Started], level [INFO], hostname [domino01 [domino01.domain.co.uk:9090]]
[2015-12-16 13:46:44,375] INFO: STORAGE-01002: Creating AutoSupport message with event id [0], category [Backup Started], description [INFO: Snap Creator Framework 4.1.1 Backup for domino01 ACTION: backup POLICY: daily Plugin: domino [Supported] Volumes: domino01_snapinfo,domino01_maildb, Started], level [INFO], hostname [domino01 [domino01.domain.co.uk:9090]] successful
[2015-12-16 13:46:44,375] INFO: ASUP finished successfully on sourcefiler01
########## Running NetApp Snapshot copy Rename on Primary devices ##########
########## Running NetApp Snapshot copy Rename on Secondary devices ##########
########## File system plug-in not defined. Skipping file system quiesce ##########
########## SNAPSHOT CREATE COMMANDS ##########
[2015-12-16 13:47:04,743] INFO: [domino01.domain.co.uk:9090 (4.1.1.1)] Executing Snapshot create  command ["C:\Program Files\Netapp\SnapDrive\sdcli.exe"  snap create -s domino01-daily_20151216134635 -D D] on domino01.domain.co.uk
[2015-12-16 13:47:04,744] INFO: Snapshot create  completed successfully
########## Snapshot copy create commands finished successfully ##########
[2015-12-16 13:47:04,750] INFO: Skipping snapshot creation using ZAPI.
########## File system plug-in not defined. Skipping file system unquiesce ##########
########## Pre Application Unquiesce commands ##########
[2015-12-16 13:47:04,753] INFO: Pre Application Unquiesce commands are not defined
[2015-12-16 13:47:04,753] INFO: Pre Application Unquiesce commands finished successfully
########## Application Unquiesce ##########
[2015-12-16 13:47:04,755] INFO: Application Unquiesce for plugin: domino
[2015-12-16 13:47:10,158] ERROR: [domino01.domain.co.uk:9090 (4.1.1.1)] Operation failed. Verify that plugins/wrapper/wrapper binary has execute permission set. Increase the WRAPPER_TIMEOUT_IN_MSEC value in agent.properties if operations require more time to complete.
[2015-12-16 13:47:10,158] ERROR: SCF-00033: Application unquiesce for plug-in [domino] failed with exit code [102], proceeding with backup!
[2015-12-16 13:47:10,158] INFO: Application unquiesce result is empty from plugin domino. skipping config update operation
[2015-12-16 13:47:10,159] INFO: Application UnQuiesce for plugin: domino finished successfully
########## Application Unquiesce finished successfully ##########
########## POST APPLICATION UNQUIESCE COMMANDS ##########
[2015-12-16 13:47:10,172] INFO: Post application unquiesce commands are not defined
########## POST APPLICATION UNQUIESCE COMMANDS FINISHED SUCCESSFULLY  ##########
########## Generating Info ASUP on sourcefiler01 ##########
[2015-12-16 13:47:10,286] INFO: STORAGE-01001: Creating AutoSupport message with event id [0], category [Backup Completed], description [INFO: Snap Creator Framework 4.1.1 Backup for domino01 ACTION: backup POLICY: daily Plugin: domino [Supported] Volumes: domino01_snapinfo,domino01_maildb, finished], level [INFO], hostname [domino01 [domino01.domain.co.uk:9090]]
[2015-12-16 13:47:10,286] INFO: STORAGE-01002: Creating AutoSupport message with event id [0], category [Backup Completed], description [INFO: Snap Creator Framework 4.1.1 Backup for domino01 ACTION: backup POLICY: daily Plugin: domino [Supported] Volumes: domino01_snapinfo,domino01_maildb, finished], level [INFO], hostname [domino01 [domino01.domain.co.uk:9090]] successful
[2015-12-16 13:47:10,286] INFO: ASUP finished successfully on sourcefiler01
[2015-12-16 13:47:20,288] INFO: Generating Info ASUP on destfiler01
[2015-12-16 13:47:20,393] INFO: STORAGE-01001: Creating AutoSupport message with event id [0], category [SnapMirror Backup of destfiler01:dr_domino01_maildb Started], description [INFO: Snap Creator Framework 4.1.1 SnapMirror Backup of destfiler01:dr_domino01_maildb for domino01 ACTION: backup POLICY: daily Plugin: domino [Supported] Volumes: domino01_snapinfo,domino01_maildb, Started], level [INFO], hostname [domino01 [domino01.domain.co.uk:9090]]
[2015-12-16 13:47:20,394] INFO: STORAGE-01002: Creating AutoSupport message with event id [0], category [SnapMirror Backup of destfiler01:dr_domino01_maildb Started], description [INFO: Snap Creator Framework 4.1.1 SnapMirror Backup of destfiler01:dr_domino01_maildb for domino01 ACTION: backup POLICY: daily Plugin: domino [Supported] Volumes: domino01_snapinfo,domino01_maildb, Started], level [INFO], hostname [domino01 [domino01.domain.co.uk:9090]] successful
########## Running SnapMirror Update for source relationship sourcefiler01:domino01_maildb ##########
[2015-12-16 13:47:20,544] INFO: STORAGE-02083: Updating SnapMirror relationship [sourcefiler01:domino01_maildb -> destfiler01:dr_domino01_maildb].
[2015-12-16 13:47:20,544] ERROR: com.netapp.snapcreator.storage.executor.ZapiExecutorException: netapp.manage.NaAPIFailedException: Snapmirror error: transfer attempted for busy destination  (errno=13102)
    at com.netapp.snapcreator.storage.executor.ZapiExecutorImpl.run(ZapiExecutorImpl.java:54)
    at com.netapp.snapcreator.storage.api.ontap.Ontap7ModeApi.executeRequest(Ontap7ModeApi.java:2242)
    at com.netapp.snapcreator.storage.api.ontap.Ontap7ModeApi.snapMirrorUpdate(Ontap7ModeApi.java:1275)
    at com.netapp.snapcreator.storage.StorageCoreImpl.snapMirrorUpdate(StorageCoreImpl.java:883)
    at com.netapp.snapcreator.workflow.task.ZAPITask.snapMirrorUpdate(ZAPITask.java:383)
    at com.netapp.snapcreator.workflow.task.SnapMirrorTask.execute(SnapMirrorTask.java:53)
    at com.netapp.snapcreator.workflow.impl.SCTaskCallableBlocking.call(SCTaskCallableBlocking.java:49)
    at com.netapp.snapcreator.workflow.impl.SCTaskCallableBlocking.call(SCTaskCallableBlocking.java:18)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: netapp.manage.NaAPIFailedException: Snapmirror error: transfer attempted for busy destination  (errno=13102)
    at netapp.manage.NaServer.invokeElem(NaServer.java:687)
    at com.netapp.snapcreator.storage.executor.ZapiExecutorImpl.run(ZapiExecutorImpl.java:45)
    ... 11 more

[2015-12-16 13:47:20,545] ERROR: STORAGE-02085: Updating SnapMirror relationship [sourcefiler01:domino01_maildb -> destfiler01:dr_domino01_maildb] failed with error [netapp.manage.NaAPIFailedException: Snapmirror error: transfer attempted for busy destination  (errno=13102)].
[2015-12-16 13:47:20,545] ERROR: Running SnapMirror Update on destination destfiler01:dr_domino01_maildb using source sourcefiler01:domino01_maildbfailed !
########## Application cleanup ##########
[2015-12-16 13:47:20,673] INFO: Performing cleanup on : domino
[2015-12-16 13:47:25,871] ERROR: [domino01.domain.co.uk:9090 (4.1.1.1)] Operation failed. Verify that plugins/wrapper/wrapper binary has execute permission set. Increase the WRAPPER_TIMEOUT_IN_MSEC value in agent.properties if operations require more time to complete.
[2015-12-16 13:47:25,872] ERROR: [domino01.domain.co.uk:9090(4.1.1.1)] SCF-00074: Cleanup for plug-in [domino] failed with error [null] and exit code [102], Exiting!
########## Agent Workflow Finalization ##########
[2015-12-16 13:47:25,875] INFO: Agent Workflow Finalization started
[2015-12-16 13:47:31,176] INFO: [domino01.domain.co.uk:9090 (4.1.1.1)] Finalized workflow with id 1
[2015-12-16 13:47:31,177] INFO: Agent Workflow Finalization finished successfully
########## Snap Creator Framework 4.1.1 failed ##########
[2015-12-16 13:47:31,214] INFO: Pre Exit commands are not defined. Skipping !

sivar
10,046 Views

It fails immediately at the quiesce stage, as I could read from the logs.

 

[2015-12-16 13:46:38,361] INFO: Application Quiesce for plugin : domino
[2015-12-16 13:46:44,243] ERROR:

 

May I request you to register a case with NetApp support and work with the support engineer?

 

Also, I was advised by our TME that 5.0 is an internal version number and you need to setup a NDA call to discuss the roadmap items.

 

Sorry, I could not be of more help here.


Thanks,

Siva Ramanathan

 

stuartwalton82
10,028 Views

Thanks Siva I will log a call with NetApp

Public