
Snap Creator problem after changing the db2 path of a system

Ronny_Fiebig

Hello all.

 

Since Christmas we have had a problem with the Snap Creator tool. It affects a configuration that was already discussed here in 2012.
Since then the configuration has not changed much: new systems were added and others were switched off.
Before Christmas, the path to db2 on one system was changed from /db2/db2XXX/db2_software/bin/db2 to /db2/db2XXX/db2_software_v105/bin/db2, a very small change that should not affect how Snap Creator works.

We have several error messages, but they don't really help. From the scAgent's point of view the database quiesce succeeds, but the Snap Creator server reports an error. We were able to narrow the problem down: the first system in a run is always backed up properly and its backup is valid. The problems start with the second system, regardless of which one it is.

Our current suspicion is that the (un)quiesce takes longer than the agent's default timeout of 60 seconds allows. The agent returns the success message too late and Snap Creator runs into a timeout. Our approach is to increase the timeout to 180 seconds and see whether the backup then runs correctly. Perhaps someone in the community knows a better solution.

Error on Snap Creator on quiesce and unquiesce (system-specific log):

 

########## Application quiesce ##########
[Mon Jan  1 18:21:45 2018] ERROR: 500 read timeout at /<D:\Program Files\IBM_N_Series_Snap_Creator_Framework\scServer3.6.0\snapcreator.exe>SnapCreator/Agent/Remote.pm line 541

[Mon Jan  1 18:21:45 2018] [sapXXXci.lvs.CUSTOMER:9090(3.6.0.1)] ERROR: [scf-00053] Application quiesce for plugin db2 failed with exit code 1, Exiting!

########## Application unquiesce ##########
[Mon Jan  1 18:23:38 2018] ERROR: 500 read timeout at /<D:\Program Files\IBM_N_Series_Snap_Creator_Framework\scServer3.6.0\snapcreator.exe>SnapCreator/Agent/Remote.pm line 541

[Mon Jan  1 18:23:38 2018] ERROR: [scf-00054] Application unquiesce for plugin db2 failed with exit code 1, Exiting!
[Mon Jan  1 18:23:38 2018] DEBUG: Exiting with error code - 2

 

Error on Snap Creator on quiesce and unquiesce (general configuration log):

 

[Thu Jan  4 02:49:49 2018] ERROR: Command [snapcreator.exe -profile P-Landschaft -config XXX -action quiesce] failed with return code [512] and message []
[Thu Jan  4 02:49:49 2018] ERROR: [scf-00103] Running application quiesce command [snapcreator.exe -profile P-Landschaft -config XXX -action quiesce] failed with exit code [512] and message []

Sahana

Hi,

 

Please refer to the error list: https://library.netapp.com/ecmdocs/ECMP1133886/html/html/GUID-F72B896D-7857-48EF-8F32-812D66A6B7E6.html


Ronny_Fiebig

Hello Sahana,

 

thanks for your reply.

 

I already know this error list. I also know that error scf-00103 means the command was not found or was run on the wrong host.
But what does the return code 512 mean? And why are there no errors in the agent log files?
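
One possible reading of the 512, offered as a sketch and not a confirmed explanation: the Remote.pm path in the server log suggests the server is written in Perl, and Perl's system() returns the raw wait status, in which the child's exit code sits in the high byte. Assuming Snap Creator logs that raw status unchanged (an assumption, not something the error list states), 512 would decode to exit code 2, which matches the "Exiting with error code - 2" line in the system-specific log. The function name decode_wait_status is made up for this illustration.

```python
def decode_wait_status(status):
    """Split a 16-bit POSIX-style wait status into (exit_code, signal).

    Assumption: the logged "return code" is a raw wait status as returned
    by Perl's system(), not the plain exit code of the child process.
    """
    exit_code = (status >> 8) & 0xFF  # high byte: exit code of the child
    signal = status & 0x7F            # low 7 bits: terminating signal, 0 if none
    return exit_code, signal


exit_code, signal = decode_wait_status(512)
print(f"exit code {exit_code}, signal {signal}")
```

If that reading is right, the 512 carries no information beyond the exit code 2 that the called snapcreator.exe already reported after the read timeout.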

 

XXX.debug.20180104024656.agent.log
[Thu Jan  4 02:46:57 2018] INFO: Starting watchdog with [26445], forced unquiesce timeout [300] second(s)
[Thu Jan  4 02:46:57 2018] INFO: Removing log /opt/NetApp/scAgent3.6.0/logs/P-Landschaft/XXX.out.20171219024557.agent.log
[Thu Jan  4 02:46:57 2018] INFO: Removing log /opt/NetApp/scAgent3.6.0/logs/P-Landschaft/XXX.debug.20171219024557.agent.log
[Thu Jan  4 02:46:57 2018] INFO: Removing log /opt/NetApp/scAgent3.6.0/logs/P-Landschaft/XXX.stderr.20171219024557.agent.log
[Thu Jan  4 02:46:57 2018] INFO: Quiescing databases
[Thu Jan  4 02:46:57 2018] INFO: Quiescing database XXX
[Thu Jan  4 02:46:57 2018] DEBUG: Executing SQL sequence:
connect to XXX;
set write suspend for database;
connect reset;
[Thu Jan  4 02:48:49 2018] DEBUG: Command [su - db2XXX -c "/db2/db2XXX/db2_software_v105/bin/db2 -tvf /tmp/cOJLSon24O.sc"] finished with
exit code: [0]
stdout: [connect to XXX

  Database Connection Information

Database server        = DB2/LINUXX8664 10.5.7
SQL authorization ID   = DB2XXX
Local database alias   = XXX


set write suspend for database
DB20000I  The SET WRITE command completed successfully.

connect reset
DB20000I  The SQL command completed successfully.]
stderr: []
[Thu Jan  4 02:48:49 2018] INFO: Quiescing database XXX finished successfully
[Thu Jan  4 02:48:49 2018] INFO: Quiescing databases finished successfully
[Thu Jan  4 02:48:49 2018] INFO: Unquiescing databases
[Thu Jan  4 02:48:49 2018] INFO: Unquiescing database XXX
[Thu Jan  4 02:48:49 2018] DEBUG: Executing SQL sequence:
connect to XXX;
set write resume for database;
connect reset;
[Thu Jan  4 02:50:41 2018] DEBUG: Command [su - db2XXX -c "/db2/db2XXX/db2_software_v105/bin/db2 -tvf /tmp/p8MNb4yOww.sc"] finished with
exit code: [0]
stdout: [connect to XXX

  Database Connection Information

Database server        = DB2/LINUXX8664 10.5.7
SQL authorization ID   = DB2XXX
Local database alias   = XXX


set write resume for database
DB20000I  The SET WRITE command completed successfully.

connect reset
DB20000I  The SQL command completed successfully.]
stderr: []
[Thu Jan  4 02:50:41 2018] INFO: Unquiescing database XXX finished successfully
[Thu Jan  4 02:50:41 2018] INFO: Unquiescing databases finished successfully
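
To check the timeout suspicion against the agent log above, the start and finish timestamps of each step can be compared with the timeout in effect. The following is a minimal sketch that assumes the line format shown in this log; the names parse_line and step_durations are hypothetical helpers, and the timeout value is passed in by hand because the log does not show which agent timeout was active.

```python
import re
from datetime import datetime

MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Matches lines such as: [Thu Jan  4 02:46:57 2018] INFO: Quiescing database XXX
STAMP = re.compile(
    r"^\[\w{3} (\w{3})\s+(\d{1,2}) (\d{2}):(\d{2}):(\d{2}) (\d{4})\] \w+: (.*)$")
STEP = re.compile(r"(Quiescing|Unquiescing) database (\S+)( finished successfully)?$")


def parse_line(line):
    """Return (timestamp, message) for a log line, or None if it has no timestamp."""
    match = STAMP.match(line)
    if match is None:
        return None
    mon, day, hh, mm, ss, year, message = match.groups()
    when = datetime(int(year), MONTHS[mon], int(day), int(hh), int(mm), int(ss))
    return when, message


def step_durations(lines, timeout_s):
    """Return (step, database, seconds, exceeded_timeout) for each finished step."""
    started, results = {}, []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        when, message = parsed
        match = STEP.match(message)
        if match is None:
            continue
        key = (match.group(1), match.group(2))
        if match.group(3) is None:
            started[key] = when  # step started
        elif key in started:
            seconds = int((when - started.pop(key)).total_seconds())
            results.append((key[0], key[1], seconds, seconds > timeout_s))
    return results


# Four lines copied from the agent log above.
SAMPLE = [
    "[Thu Jan  4 02:46:57 2018] INFO: Quiescing database XXX",
    "[Thu Jan  4 02:48:49 2018] INFO: Quiescing database XXX finished successfully",
    "[Thu Jan  4 02:48:49 2018] INFO: Unquiescing database XXX",
    "[Thu Jan  4 02:50:41 2018] INFO: Unquiescing database XXX finished successfully",
]

for step, database, seconds, exceeded in step_durations(SAMPLE, timeout_s=60):
    print(step, database, seconds, "exceeds timeout" if exceeded else "within timeout")
```

For the log above, both the quiesce and the unquiesce take 112 seconds each. That is longer than the 60-second default but shorter than 180 seconds, so if the backup still aborts with the raised value, either the new timeout is not being applied or another timeout is involved.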

Increasing the timeout value did not help. Last night the backup aborted again.
