We are experiencing intermittent SC crashes (Error: java.lang.OutOfMemoryError: Java heap space)

REISTTELECOM
6,574 Views

After upgrading to SC 4.1P1, we are experiencing intermittent (once a week) Snap Creator crashes: the service seems to run but is unresponsive, the web GUI is not available, and running backup jobs are interrupted.

I can see the following records in the sc_server.log:

...

java.lang.OutOfMemoryError: Java heap space

    at java.util.Arrays.copyOf(Unknown Source) ~[na:1.7.0_09]

    at java.lang.String.<init>(Unknown Source) ~[na:1.7.0_09]

    at java.io.Win32FileSystem.resolve(Unknown Source) ~[na:1.7.0_09]

    at java.io.File.<init>(Unknown Source) ~[na:1.7.0_09]

    at java.io.File.listFiles(Unknown Source) ~[na:1.7.0_09]

    at com.netapp.snapcreator.workflow.task.DeleteLogTask.getLogTimeList(DeleteLogTask.java:87) ~[workflow.jar:na]

    at com.netapp.snapcreator.workflow.task.DeleteLogTask.execute(DeleteLogTask.java:45) ~[workflow.jar:na]

    at com.netapp.snapcreator.workflow.impl.SCTaskCallable.call(SCTaskCallable.java:48) [workflow.jar:na]

    at com.netapp.snapcreator.workflow.impl.SCTaskCallable.call(SCTaskCallable.java:20) [workflow.jar:na]

    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source) [na:1.7.0_09]

    at java.util.concurrent.FutureTask.run(Unknown Source) [na:1.7.0_09]

    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [na:1.7.0_09]

    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [na:1.7.0_09]

    at java.lang.Thread.run(Unknown Source) [na:1.7.0_09]

2013-10-11 00:29:51,044 [pool-2-thread-6702] com.netapp.snapcreator.workflow.impl.TaskExecutor ERROR - Exception thrown by task:ossvSnapVault with config:TVN@TRRDSH01_OSSV_daily

java.lang.OutOfMemoryError: Java heap space

    at java.util.Arrays.copyOf(Unknown Source) ~[na:1.7.0_09]

    at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source) ~[na:1.7.0_09]

    at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source) ~[na:1.7.0_09]

    at java.lang.AbstractStringBuilder.append(Unknown Source) ~[na:1.7.0_09]

    at java.lang.StringBuilder.append(Unknown Source) ~[na:1.7.0_09]

    at java.lang.StringBuilder.append(Unknown Source) ~[na:1.7.0_09]

    at com.netapp.snapcreator.repository.config.util.DocUtils.createDoc(DocUtils.java:161) ~[repository.jar:na]

    at com.netapp.snapcreator.repository.config.util.Parser.parseConfig(Parser.java:99) ~[repository.jar:na]

    at com.netapp.snapcreator.workflow.task.OssvSnapVault.executeCommmand(OssvSnapVault.java:556) ~[workflow.jar:na]

    at com.netapp.snapcreator.workflow.task.OssvSnapVault.updateSnapVaultStatus(OssvSnapVault.java:349) ~[workflow.jar:na]

    at com.netapp.snapcreator.workflow.task.OssvSnapVault.snapVaultWait(OssvSnapVault.java:314) ~[workflow.jar:na]

    at com.netapp.snapcreator.workflow.task.OssvSnapVault.execute(OssvSnapVault.java:84) ~[workflow.jar:na]

    at com.netapp.snapcreator.workflow.impl.SCTaskCallable.call(SCTaskCallable.java:48) [workflow.jar:na]

    at com.netapp.snapcreator.workflow.impl.SCTaskCallable.call(SCTaskCallable.java:20) [workflow.jar:na]

    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source) [na:1.7.0_09]

    at java.util.concurrent.FutureTask.run(Unknown Source) [na:1.7.0_09]

    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [na:1.7.0_09]

    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [na:1.7.0_09]

    at java.lang.Thread.run(Unknown Source) [na:1.7.0_09]

...

Can you help?

Current workaround: Service restart.

BR Stephan

7 REPLIES

Arora_Kapil

Hi,

These are the default memory options used when starting Snap Creator.

java -Xms128m -XX:MaxPermSize=256m -jar snapcreator.jar -port <port>

Instead of using the Windows service you can start the Server manually:

1) As admin, change to the engine directory, e.g. C:\Program Files\NetApp\Snap_Creator_Framework\scServer4.0.0\engine

2) Run the following command, which sets the initial heap size (-Xms) to 256 MB:

java -Xms256m -XX:MaxPermSize=256m -jar snapcreator.jar -port <port>

If you still face memory issues, you can increase it to 512 MB.
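
To confirm which heap limits a JVM actually picks up from options like these, a small check can help. This is a minimal sketch and not part of Snap Creator: the class name HeapReport is made up for illustration, and it only reports on the JVM it runs in, so it has to be started with the same memory options as the server. Keep in mind that -Xms sets the initial heap size only; the upper limit is controlled by -Xmx.

```java
public class HeapReport {
    // Convert a byte count to whole mebibytes.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Build a one-line summary of the heap figures of the current JVM.
    static String summary() {
        Runtime rt = Runtime.getRuntime();
        return "max=" + toMiB(rt.maxMemory()) + "MiB"
             + " committed=" + toMiB(rt.totalMemory()) + "MiB"
             + " free=" + toMiB(rt.freeMemory()) + "MiB";
    }

    public static void main(String[] args) {
        // Prints the limits this JVM was started with.
        System.out.println(summary());
    }
}
```

After compiling, it could be run as, for example, java -Xms256m -Xmx512m HeapReport, and the "max" value compared with what was intended.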

Thanks,

Kapil

DARTHVADERNW

Hello Kapil,

Unfortunately, we are still experiencing the java.lang.OutOfMemoryError, even after setting the memory size to 512 MB.

Our startup string looks like this:

java -Xms256m -Xms1024m -XX:MaxPermSize=512m -jar snapcreator.jar -port 8081

After increasing it to 512 MB, it took 3 weeks before the process crashed again.

Could there be a memory leak?

I have attached the sc_server.log for further investigation.

Arora_Kapil

Hi Stephan,

From my initial analysis, I see many statements like the following in the logs:

2013-12-18 19:20:40,511 [pool-2-thread-11840] org.apache.cxf.phase.PhaseInterceptorChain WARN  - Interceptor for {http://www.netapp.com/SnapCreator/Daemon/Agent}AgentService#{http://www.netapp.com/SnapCreator/Daemon/Agent}version has thrown exception, unwinding now

org.apache.cxf.interceptor.Fault: Connection timed out: connect

I think one or more of your agents are not reachable or you have stale entries.

Could you please delete those configs/agent entries and make sure that all agents referenced in the SC configs on the server are reachable and working?

Once this is done, restart the Snap Creator server and let us know if this fixes the problem for you.

We will definitely look deeper into this and figure out if there is a memory leak in the code.

Thanks,

Kapil

DARTHVADERNW

Hello Kapil,

There are no items in Agent Monitor anymore (even after a refresh), although I know that there were items at the beginning.

Backups are working fine, so I assume that connectivity to the agents works.

I even did a telnet to the SC Agents on the machines to see if there are connectivity issues.
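
A telnet-style check like this can also be scripted so that it is easy to repeat for every agent. The following is only a sketch under assumptions: the class name AgentPing is invented for illustration, the host names and ports have to be supplied by the caller, and a successful TCP connect shows only that the port is open, not that the agent behind it is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class AgentPing {
    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolved host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Usage: java AgentPing <host> <port> [<host> <port> ...]
        for (int i = 0; i + 1 < args.length; i += 2) {
            String host = args[i];
            int port = Integer.parseInt(args[i + 1]);
            System.out.println(host + ":" + port
                    + " reachable=" + isReachable(host, port, 3000));
        }
    }
}
```

The 3-second timeout is an arbitrary choice; a short value keeps the check from hanging on hosts that silently drop packets.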

BR Stephan

Clemens_Siebler

It looks like you specified -Xms twice (I am not sure which value Java picks, but this most likely isn't the issue):

java -Xms256m -Xms1024m -XX:MaxPermSize=512m -jar snapcreator.jar -port 8081

Better:

java -Xms1024m -XX:MaxPermSize=512m -jar snapcreator.jar -port 8081
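
A repeated option like this is easy to miss in a long startup string. As a rough illustration, the sketch below scans a command line for memory-size options that appear more than once. The class DuplicateJvmFlags and its short list of checked flags are assumptions made for this example; it does not parse options the way the JVM launcher does.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DuplicateJvmFlags {
    // Memory-size options whose value is appended directly to the flag name.
    private static final String[] SIZE_FLAGS = {"-Xms", "-Xmx", "-Xss"};

    // Returns the size flags that occur more than once in the command line.
    static List<String> findDuplicates(String commandLine) {
        Set<String> seen = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (String token : commandLine.trim().split("\\s+")) {
            for (String flag : SIZE_FLAGS) {
                if (token.startsWith(flag)
                        && !seen.add(flag)          // add() is false if already seen
                        && !duplicates.contains(flag)) {
                    duplicates.add(flag);
                }
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        String cmd = "java -Xms256m -Xms1024m -XX:MaxPermSize=512m"
                + " -jar snapcreator.jar -port 8081";
        System.out.println(findDuplicates(cmd)); // prints [-Xms]
    }
}
```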

Can you tell us whether the memory consumption of the server grows over time? I would assume so, correct?

Thanks,

Clemens

DARTHVADERNW

I've adjusted the parameter as proposed and restarted the SC process.

Memory consumption grows over time, as you said.
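
To put numbers on "grows over time", heap usage can be sampled at intervals and the series checked for a steady rise. The sketch below shows the idea with the standard java.lang.management API. It is an illustration only: HeapSampler and growsSteadily are hypothetical names, the code measures the JVM it runs in rather than the Snap Creator server, and used heap naturally rises between garbage collections, so a short series proves little. For a leak, the values that matter are those measured after full collections over days or weeks.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class HeapSampler {
    // Returns the currently used heap of this JVM in bytes.
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getUsed();
    }

    // True if no sample is smaller than its predecessor and the last exceeds the first.
    static boolean growsSteadily(long[] samples) {
        if (samples.length < 2) {
            return false;
        }
        for (int i = 1; i < samples.length; i++) {
            if (samples[i] < samples[i - 1]) {
                return false;
            }
        }
        return samples[samples.length - 1] > samples[0];
    }

    public static void main(String[] args) throws InterruptedException {
        long[] samples = new long[5];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = usedHeapBytes();
            System.out.println("used heap bytes: " + samples[i]);
            Thread.sleep(1000); // one-second spacing, chosen only for the demo
        }
        System.out.println("grows steadily: " + growsSteadily(samples));
    }
}
```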

Clemens_Siebler

SC 4.1 GA has been released. I think it would be worth trying it to check whether the problem is gone in 4.1.

Thanks,

Clemens
