Data Backup and Recovery
After upgrading to SC 4.1P1, we are experiencing intermittent (about once a week) Snap Creator crashes: the service appears to be running but is unresponsive, the web GUI is unavailable, and running backup jobs are interrupted.
I can see the following records in the sc_server.log:
...
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Unknown Source) ~[na:1.7.0_09]
at java.lang.String.<init>(Unknown Source) ~[na:1.7.0_09]
at java.io.Win32FileSystem.resolve(Unknown Source) ~[na:1.7.0_09]
at java.io.File.<init>(Unknown Source) ~[na:1.7.0_09]
at java.io.File.listFiles(Unknown Source) ~[na:1.7.0_09]
at com.netapp.snapcreator.workflow.task.DeleteLogTask.getLogTimeList(DeleteLogTask.java:87) ~[workflow.jar:na]
at com.netapp.snapcreator.workflow.task.DeleteLogTask.execute(DeleteLogTask.java:45) ~[workflow.jar:na]
at com.netapp.snapcreator.workflow.impl.SCTaskCallable.call(SCTaskCallable.java:48) [workflow.jar:na]
at com.netapp.snapcreator.workflow.impl.SCTaskCallable.call(SCTaskCallable.java:20) [workflow.jar:na]
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source) [na:1.7.0_09]
at java.util.concurrent.FutureTask.run(Unknown Source) [na:1.7.0_09]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [na:1.7.0_09]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [na:1.7.0_09]
at java.lang.Thread.run(Unknown Source) [na:1.7.0_09]
2013-10-11 00:29:51,044 [pool-2-thread-6702] com.netapp.snapcreator.workflow.impl.TaskExecutor ERROR - Exception thrown by task:ossvSnapVault with config:TVN@TRRDSH01_OSSV_daily
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Unknown Source) ~[na:1.7.0_09]
at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source) ~[na:1.7.0_09]
at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source) ~[na:1.7.0_09]
at java.lang.AbstractStringBuilder.append(Unknown Source) ~[na:1.7.0_09]
at java.lang.StringBuilder.append(Unknown Source) ~[na:1.7.0_09]
at java.lang.StringBuilder.append(Unknown Source) ~[na:1.7.0_09]
at com.netapp.snapcreator.repository.config.util.DocUtils.createDoc(DocUtils.java:161) ~[repository.jar:na]
at com.netapp.snapcreator.repository.config.util.Parser.parseConfig(Parser.java:99) ~[repository.jar:na]
at com.netapp.snapcreator.workflow.task.OssvSnapVault.executeCommmand(OssvSnapVault.java:556) ~[workflow.jar:na]
at com.netapp.snapcreator.workflow.task.OssvSnapVault.updateSnapVaultStatus(OssvSnapVault.java:349) ~[workflow.jar:na]
at com.netapp.snapcreator.workflow.task.OssvSnapVault.snapVaultWait(OssvSnapVault.java:314) ~[workflow.jar:na]
at com.netapp.snapcreator.workflow.task.OssvSnapVault.execute(OssvSnapVault.java:84) ~[workflow.jar:na]
at com.netapp.snapcreator.workflow.impl.SCTaskCallable.call(SCTaskCallable.java:48) [workflow.jar:na]
at com.netapp.snapcreator.workflow.impl.SCTaskCallable.call(SCTaskCallable.java:20) [workflow.jar:na]
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source) [na:1.7.0_09]
at java.util.concurrent.FutureTask.run(Unknown Source) [na:1.7.0_09]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [na:1.7.0_09]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [na:1.7.0_09]
at java.lang.Thread.run(Unknown Source) [na:1.7.0_09]
...
Can you help?
Current workaround: Service restart.
BR Stephan
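The first trace fails inside File.listFiles(), which returns every directory entry as one array. The heap may simply have been full already when that call ran, but it can still be worth checking whether the log directory has grown very large. A minimal sketch for counting entries without building that array (the class name LogDirCount is made up for illustration, and the directory must be passed in, since the thread does not state the log location):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Snap Creator.
class LogDirCount {

    // Counts directory entries one at a time, so no large array is built
    // (unlike File.listFiles(), which returns all entries at once).
    static long countEntries(Path dir) throws IOException {
        long count = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path ignored : entries) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // The directory is an argument; the real log path depends on the installation.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(dir.toAbsolutePath() + ": " + countEntries(dir) + " entries");
    }
}
```

If the count is very high, old logs are a plausible contributor; if it is modest, the listFiles() call was more likely just the place where an already exhausted heap gave out.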
Hi,
These are the default memory options used when starting Snap Creator.
java -Xms128m -XX:MaxPermSize=256m -jar snapcreator.jar -port <port>
Instead of using the Windows service, you can start the server manually:
1) As administrator, change to the engine directory, e.g. C:\Program Files\NetApp\Snap_Creator_Framework\scServer4.0.0\engine
2) Run the following command with 256 MB:
java -Xms256m -XX:MaxPermSize=256m -jar snapcreator.jar -port <port>
If you still face memory issues, you can increase it to 512 MB.
Thanks,
Kapil
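A side note on the options above: in the HotSpot JVM, -Xms sets only the initial heap size, while the upper limit is set by -Xmx, and -XX:MaxPermSize sizes the permanent generation rather than the heap that "Java heap space" refers to. To see which limits a given option string actually produces, a small sketch like the following can be compiled and run with the same options (the class name HeapSettings is made up for illustration; it reports on its own JVM, not on the running Snap Creator server):

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper, not part of Snap Creator.
class HeapSettings {

    // Converts a byte count to whole mebibytes.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Options the JVM was started with, e.g. -Xms256m.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments:               " + jvmArgs);
        // Upper limit the heap may grow to (controlled by -Xmx, or a default).
        System.out.println("Max heap (MiB):              " + toMiB(rt.maxMemory()));
        // Heap currently reserved by the JVM (starts near the -Xms value).
        System.out.println("Committed heap (MiB):        " + toMiB(rt.totalMemory()));
        System.out.println("Free within committed (MiB): " + toMiB(rt.freeMemory()));
    }
}
```

For example, after compiling with javac, running `java -Xms256m HeapSettings` and then `java -Xms256m -Xmx512m HeapSettings` shows whether the maximum changes between the two invocations.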
Hello Kapil,
Unfortunately, we are still experiencing the java.lang.OutOfMemoryError, even after setting the memory size to 512.
Our startup string looks like this: java -Xms256m -Xms1024m -XX:MaxPermSize=512m -jar snapcreator.jar -port 8081
After increasing it to 512, it took 3 weeks before the process crashed again.
Could there be a memory leak?
I attached the sc_server.log for further investigation.
Hi Stephan,
From my initial analysis, I see many statements like the following in the logs:
2013-12-18 19:20:40,511 [pool-2-thread-11840] org.apache.cxf.phase.PhaseInterceptorChain WARN - Interceptor for {http://www.netapp.com/SnapCreator/Daemon/Agent}AgentService#{http://www.netapp.com/SnapCreator/Daemon/Agent}version has thrown exception, unwinding now
org.apache.cxf.interceptor.Fault: Connection timed out: connect
I think one or more of your agents are not reachable, or you have stale entries.
Could you please delete those config/agent entries and make sure all agents referenced in the SC configs on the server are reachable and working?
Once this is done, restart the Snap Creator server and let us know if this fixes the problem for you.
We will definitely look deeper into this and figure out if there is a memory leak in the code.
Thanks,
Kapil
Hello Kapil,
There are no items in Agent Monitor anymore (even after a refresh), although I know that there were items at the beginning.
Backups are working fine, so I assume that connectivity to the agents works.
I even did a telnet to the SC agents on the machines to see if there are connectivity issues.
BR Stephan
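The manual telnet test can be repeated for many agents at once with a small TCP connect check. This is only a sketch: the class name AgentPortCheck and the 3-second timeout are assumptions, and the host:port pairs have to come from your own configs. A successful TCP connect also says nothing about whether the agent service behind the port answers correctly.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Snap Creator.
class AgentPortCheck {

    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Each argument is expected in the form host:port.
        for (String arg : args) {
            int sep = arg.lastIndexOf(':');
            if (sep <= 0 || sep == arg.length() - 1) {
                System.out.println(arg + " -> skipped (expected host:port)");
                continue;
            }
            String host = arg.substring(0, sep);
            int port = Integer.parseInt(arg.substring(sep + 1));
            boolean ok = isReachable(host, port, 3000);
            System.out.println(arg + " -> " + (ok ? "reachable" : "NOT reachable"));
        }
    }
}
```

Any entry reported as not reachable is a candidate for the "Connection timed out: connect" warnings in the server log.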
It looks like you specified -Xms twice (not sure which one Java picks, but this most likely isn't the issue):
java -Xms256m -Xms1024m -XX:MaxPermSize=512m -jar snapcreator.jar -port 8081
Better:
java -Xms1024m -XX:MaxPermSize=512m -jar snapcreator.jar -port 8081
Can you tell us whether the memory consumption of the server grows over time? I would assume so, correct?
Thanks,
Clemens
I've adjusted the parameter as proposed and restarted the SC process.
Memory consumption grows over time, as you said.
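To put a number on that growth, a few readings of the server process's memory use, noted by hand over several days, can be turned into an average growth rate with a least-squares fit. The following is a rough sketch under stated assumptions: the class name HeapGrowth is made up, and the readings in main are placeholder values for illustration, not measurements from this system.

```java
// Hypothetical helper, not part of Snap Creator.
class HeapGrowth {

    // Least-squares slope of usedMb over hours: average growth in MB per hour.
    static double slopeMbPerHour(double[] hours, double[] usedMb) {
        if (hours.length != usedMb.length || hours.length < 2) {
            throw new IllegalArgumentException("need at least two paired samples");
        }
        int n = hours.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += hours[i];
            meanY += usedMb[i];
        }
        meanX /= n;
        meanY /= n;
        double num = 0, den = 0;
        for (int i = 0; i < n; i++) {
            double dx = hours[i] - meanX;
            num += dx * (usedMb[i] - meanY);
            den += dx * dx;
        }
        if (den == 0) {
            throw new IllegalArgumentException("samples must differ in time");
        }
        return num / den;
    }

    public static void main(String[] args) {
        // Placeholder readings (hours since restart, MB in use); replace with real ones.
        double[] hours = {0, 24, 48};
        double[] usedMb = {300, 400, 500};
        System.out.println("Growth in MB per hour: " + slopeMbPerHour(hours, usedMb));
    }
}
```

A steady positive slope that persists across several days of similar workload supports the memory-leak suspicion, while a curve that flattens out points more toward a heap that is simply sized too small.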
SC 4.1 GA has been released. I think it would be worth trying it to check whether the problem is gone in 4.1.
Thanks,
Clemens