I'm running into a strange problem with OCUM 7.2. I seem to recall something similar with 7.1; the solution then was to split OPM into multiple instances, but that doesn't seem to be an option now.
We currently monitor 94 controllers across 28 clusters, and every once in a while OCUM seems to crash with a bunch of messages that look like this:
2017-10-15 03:08:39,475 WARN [com.arjuna.ats.jta] (Periodic Recovery) ARJUNA016038: No XAResource to recover < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff7f000101:-de8b499:59a475e9:e2b9f2, node_name=1, branch_uid=0:ffff7f000101:-de8b499:59a475e9:e2b9fa, subordinatenodename=null, eis_name=unknown eis name
I realize this isn't much to go on, but has anyone seen anything similar in their environment?
Thanks for your query. The error messages do not help much, but you could raise a ticket with NetApp Support with an appropriate Support bundle attached.
Taking a guess, it looks like a memory "resource" constraint. Starting with UM 7.2, there are Scale Monitor alerts that tell you if you have any issues with your current resources and suggest possible remediation steps. Please look through the Event logs for any event specific to "resource utilization" and originating from source "Management Station".
Also, it would help to know whether your system is on an independent platform (RHEL/Windows), and whether it is deployed as a VM or as a physical machine.
Hope this helps!
Re: OCUM 7.2 Errors
2017-10-18 09:52 AM
Thanks very much for your response, Dhiman. I haven't found support terribly useful when it comes to OCUM, which is why I've come to the forums to see if I can work through the issue in a more time-effective manner. We're running on CentOS, which has been a bit of a sore point, although I've asked our account team to get an fPVR to get us fully supported (which seems like a silly process for problems in the application).
There are no messages from Management Station, but I've increased the memory of the VM anyway to see if that helps. I was running 4 cores with 16 GB of RAM and I'm now running 4 cores with 24 GB of RAM.
If it becomes a big problem, I'll open a case, and I'll update this thread if it keeps happening so others can have the results.
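In case it helps anyone tracking the same thing, here is a minimal sketch for tallying the ARJUNA016038 warnings per hour, to see whether they cluster around the times OCUM goes down. It assumes the log lines follow the timestamp format shown in the message quoted earlier in this thread. The function names and the idea of passing in a log file path are my own; I'm not assuming any particular log location.

```python
import re
from collections import Counter

# Captures "YYYY-MM-DD HH" from WARN lines that carry the ARJUNA016038 code.
# The timestamp layout is taken from the sample message in this thread.
LINE_RE = re.compile(
    r"^(?P<hour>\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2},\d{3}\s+WARN\s+.*\bARJUNA016038\b"
)

def count_by_hour(lines):
    """Return a Counter mapping 'YYYY-MM-DD HH' to the number of matching warnings."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[match.group("hour")] += 1
    return counts

def count_file(path):
    """Hypothetical helper: tally warnings in a log file at the given path."""
    with open(path, errors="replace") as handle:
        return count_by_hour(handle)
```

Calling `count_file()` on a copy of the server log and printing the sorted items gives one count per hour, which can then be lined up against the crash times.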
Unfortunately Support will most likely decline to work the case as you run an unsupported OS.
We could argue whether or not CentOS and RHEL are the same, but from a support perspective they are certainly not.