Community

Subscribe
Highlighted

Monitoring Cache Status or Lag ?

Hi,

Is it possible to monitor the status of the datasoruce acquisition and then also to alert should the acquistion fail or fail behind a specific lag ??

Thanks
Rich

Re: Monitoring Cache Status or Lag ?

Wondering if you ever figured this out?  I ask because I just ran across a situation today where I found that my WFA server (version 1.1.0.53.2 - I know I need to upgrade) hadn't updated it's cache via DFM since July 29.  Here's a screenshot (the update from today was after I manually initiated and reset the interval back to 0 then back to 5):

I'm going to comb through the directory structure under the WFA install and see if I can find the log file that tracks this history.  I may be able to have a process check for new entries and send an alert if one doesn't appear after a defined amount of time.

-Dave

Re: Monitoring Cache Status or Lag ?

This is actually an important issue and I'm going to need some attention on this.  Timely cache updates are vital to proper WFA operation and so I need some help troubleshooting it.  I don't see any way to open a support case on this issue through typical channels because WFA isn't listed as a software option.

I need to know:

  • Why these cache updates just stopped happening
  • How to prevent it from happening again
  • How to monitor cache updates (log file that tracks this, the specific process that performs this, etc.)

Thank you.

Re: Monitoring Cache Status or Lag ?

I have not seen an issue where the scheduled updates just stop working.  I have seen issues where one cache update job is hung either due to the JVM on the WFA server (which will affect workflows executions as well) or because DFM 'went away' during a cache update (this recently happened).  In both of these cases, I would see a 'long' running job in the cache results.  The wfa_stats log contains information about the job execution for cache query updates.  Since you haven't had that many occur since the failure, you could likely look at that log and find the last execution.  See if there are any issues with 'attempts' after the last success in July.  I would also look at the wfa_error log and see if anything jumps out.  I would be happy to take a look as well.

Regarding, 'how do I ensure that cache updates are happening' question from the OP: I don't see any options to either schedule a cache update externally (web services) or to set up an SNMP trap on cache updates failing.  I would agree with both of you that this is something that we should look at getting added as it is vital to validation. 

Note: unless the second poster's issue actually had generated an error, SNMP would not have helped, per se.

Re: Monitoring Cache Status or Lag ?

Thank you, this is very helpful!

I'm looking at the wfa_stats logs and do see the class=CacheJobExecutorImpl entry occurring every 5 minutes which I believe is this process (I also see the 3 ETL processes logged too).  I don't have a wfa_error log file anywhere but I do have an err.txt file but there's only 10 entries in total there.  However, I did find WARN & ERROR entries in the server.log file that correspond exactly time wise to the last Cache job update in the wfa_stats log (2012-07-29 07:55:21)

Thanks for offering to look at these - how should I get the files to you?  I don't see an option for attaching files in a private or direct email...

Re: Monitoring Cache Status or Lag ?

Reply to this thread and use the Advanced editor option (look at the top right of the reply box and you will see it).  There is an option to attach files to your thread post.  I will try to take a quick look. 

Re: Monitoring Cache Status or Lag ?

Ah, that's it.  Thank you.  I've attached a zip file with the 3 logs:

wfa_stats.log = the current one that shows everything working well.

wfa_stats.log.2 = the log file that shows when the updates stopped.

server.log = Corresponding errors at the same time the updates stopped in the wfa_stats.log.2

Thanks!

Re: Monitoring Cache Status or Lag ?

It looks like the cache process actually hung.  During the step to select the data from DFM, there was an error and the whole process just hung.  I see multiple attempts to complete but never see it get past that point.

2012-07-29 07:55:21,443 PDT WARN  [com.arjuna.ats.jta.logging.loggerI18N] (WorkManager(2)-18) [com.arjuna.ats.internal.jta.transaction.arjunacore.lastResource.disallow] [com.arjuna.ats.internal.jta.transaction.arjunacore.lastResource.disallow] Adding multiple last resources is disallowed. Current resource is org.jboss.resource.connectionmanager.TxConnectionManager$LocalXAResource@43c709

2012-07-29 07:55:21,443 PDT WARN  [org.hibernate.util.JDBCExceptionReporter] (WorkManager(2)-18) SQL Error: 0, SQLState: null

2012-07-29 07:55:21,443 PDT ERROR [org.hibernate.util.JDBCExceptionReporter] (WorkManager(2)-18) Could not enlist in transaction on entering meta-aware object!; - nested throwable: (javax.transaction.SystemException: java.lang.Throwable: Unabled to enlist resource, see the previous warnings. tx=TransactionImple < ac, BasicAction: a838c15:f5cb:4fc391e8:b9a31d status: ActionStatus.ABORT_ONLY >); - nested throwable: (org.jboss.resource.JBossResourceException: Could not enlist in transaction on entering meta-aware object!; - nested throwable: (javax.transaction.SystemException: java.lang.Throwable: Unabled to enlist resource, see the previous warnings. tx=TransactionImple < ac, BasicAction: a838c15:f5cb:4fc391e8:b9a31d status: ActionStatus.ABORT_ONLY >))

2012-07-29 07:55:21,490 PDT ERROR [org.jboss.aspects.tx.TxPolicy] (WorkManager(2)-18) javax.ejb.EJBException: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: Cannot open connection

2012-07-29 07:55:21,490 PDT ERROR [org.jboss.resource.adapter.jms.inflow.JmsServerSession] (WorkManager(2)-18) Unexpected error delivering message delegator->JBossMessage[22013111733039424]Smiley TongueERSISTENT, deliveryId=11217

javax.ejb.EJBException: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: Cannot open connection

Re: Monitoring Cache Status or Lag ?

Hi Dave,

We have logged an internal customer case and are looking at this. We will reply with our finding as soon as possible.

I have also added a feature request for getting notification based on acquisition issues (Potentially it can be no-updates in a certain

window of time, or number of failures in a certain timeframe).

Thank you for your patience.

Yaron Haimsohn

WFA team

Re: Monitoring Cache Status or Lag ?

Thanks guys, much appreciated!