2012-10-11 01:38 PM
Did you ever figure this out? I ask because I ran across a situation today where I found that my WFA server (version 220.127.116.11.2 - I know I need to upgrade) hadn't updated its cache via DFM since July 29. Here's a screenshot (the update from today happened after I manually initiated one and reset the interval to 0 and then back to 5):
I'm going to comb through the directory structure under the WFA install and see if I can find the log file that tracks this history. I may be able to have a process check for new entries and send an alert if one doesn't appear after a defined amount of time.
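A minimal sketch of the kind of check described above, assuming the log lines start with a "YYYY-MM-DD HH:MM:SS" timestamp (as the server.log excerpt later in this thread does) and that a cache update can be recognized by a marker string such as the CacheJobExecutorImpl class name mentioned further down. The function names, the marker, and the threshold are all hypothetical, and the alerting itself (email, SNMP, etc.) is left out:

```python
import re
from datetime import datetime, timedelta

# Assumption: each relevant log line begins with "YYYY-MM-DD HH:MM:SS".
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def last_entry_time(lines, marker):
    """Return the newest timestamp among lines containing marker, or None."""
    latest = None
    for line in lines:
        if marker not in line:
            continue
        match = TIMESTAMP_RE.match(line)
        if not match:
            continue  # line has the marker but no leading timestamp
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if latest is None or stamp > latest:
            latest = stamp
    return latest


def is_stale(lines, marker, now, max_age):
    """True if no marked entry exists, or the newest one is older than max_age."""
    last = last_entry_time(lines, marker)
    return last is None or now - last > max_age
```

A scheduled task could read the log, call is_stale(lines, "CacheJobExecutorImpl", datetime.now(), timedelta(minutes=15)), and raise an alert when it returns True. The 15-minute threshold is only an example and should sit comfortably above the configured update interval.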
2012-10-12 10:21 AM
This is actually an important issue and I'm going to need some attention on this. Timely cache updates are vital to proper WFA operation and so I need some help troubleshooting it. I don't see any way to open a support case on this issue through typical channels because WFA isn't listed as a software option.
I need to know:
2012-10-12 11:06 AM
I have not seen an issue where the scheduled updates just stop working. I have seen issues where one cache update job hangs, either because of the JVM on the WFA server (which will affect workflow executions as well) or because DFM 'went away' during a cache update (this happened recently). In both of these cases, I would see a long-running job in the cache results. The wfa_stats log contains information about job execution for cache query updates. Since you haven't had many occur since the failure, you could likely look at that log and find the last execution. See if there are any issues with 'attempts' after the last success in July. I would also look at the wfa_error log and see if anything jumps out. I would be happy to take a look as well.
Regarding the OP's question, 'how do I ensure that cache updates are happening': I don't see any option to either schedule a cache update externally (web services) or set up an SNMP trap on cache update failures. I agree with both of you that this is something we should look at getting added, as it is vital to validation.
Note: unless the second poster's issue had actually generated an error, SNMP would not have helped, per se.
2012-10-12 11:48 AM
Thank you, this is very helpful!
I'm looking at the wfa_stats logs and do see the class=CacheJobExecutorImpl entry occurring every 5 minutes, which I believe is this process (I also see the 3 ETL processes logged). I don't have a wfa_error log file anywhere, but I do have an err.txt file; it has only 10 entries in total. However, I did find WARN & ERROR entries in the server.log file that correspond exactly, time-wise, to the last Cache job update in the wfa_stats log (2012-07-29 07:55:21).
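For anyone wanting to repeat this correlation, here is a rough sketch that pulls WARN and ERROR lines from server.log around a given moment. It assumes the line layout seen in this server.log (timestamp with milliseconds, a timezone token, then the level); the function name and the default window are hypothetical:

```python
import re
from datetime import datetime, timedelta

# Assumed layout: "2012-07-29 07:55:21,443 PDT WARN [logger] (thread) message"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} \S+ (WARN|ERROR)\b"
)


def problems_near(lines, when, window=timedelta(seconds=60)):
    """Return (timestamp, level, line) for WARN/ERROR lines within window of when."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # continuation lines and other levels are skipped
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if abs(stamp - when) <= window:
            hits.append((stamp, match.group(2), line.rstrip("\n")))
    return hits
```

Called with the lines of server.log and datetime(2012, 7, 29, 7, 55, 21), it returns only the warnings and errors logged within a minute of the last cache job. Stack-trace continuation lines that do not start with a timestamp are ignored in this sketch.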
Thanks for offering to look at these - how should I get the files to you? I don't see an option for attaching files in a private or direct email...
2012-10-12 11:54 AM
Reply to this thread and use the Advanced editor option (look at the top right of the reply box and you will see it). There is an option to attach files to your thread post. I will try to take a quick look.
2012-10-12 01:06 PM
Ah, that's it. Thank you. I've attached a zip file with the 3 logs:
wfa_stats.log = the current one that shows everything working well.
wfa_stats.log.2 = the log file that shows when the updates stopped.
server.log = Corresponding errors at the same time the updates stopped in the wfa_stats.log.2
2012-10-15 03:03 AM
It looks like the cache process actually hung. During the step to select the data from DFM, there was an error and the whole process just hung. I see multiple attempts to complete but never see it get past that point.
2012-07-29 07:55:21,443 PDT WARN [com.arjuna.ats.jta.logging.loggerI18N] (WorkManager(2)-18) [com.arjuna.ats.internal.jta.transaction.arjunacor
2012-07-29 07:55:21,443 PDT WARN [org.hibernate.util.JDBCExceptionReporter] (WorkManager(2)-18) SQL Error: 0, SQLState: null
2012-07-29 07:55:21,443 PDT ERROR [org.hibernate.util.JDBCExceptionReporter] (WorkManager(2)-18) Could not enlist in transaction on entering meta-aware object!; - nested throwable: (javax.transaction.SystemException: java.lang.Throwable: Unabled to enlist resource, see the previous warnings. tx=TransactionImple < ac, BasicAction: a838c15:f5cb:4fc391e8:b9a31d status: ActionStatus.ABORT_ONLY >); - nested throwable: (org.jboss.resource.JBossResourceException: Could not enlist in transaction on entering meta-aware object!; - nested throwable: (javax.transaction.SystemException: java.lang.Throwable: Unabled to enlist resource, see the previous warnings. tx=TransactionImple < ac, BasicAction: a838c15:f5cb:4fc391e8:b9a31d status: ActionStatus.ABORT_ONLY >))
2012-07-29 07:55:21,490 PDT ERROR [org.jboss.aspects.tx.TxPolicy] (WorkManager(2)-18) javax.ejb.EJBException: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: Cannot open connection
2012-07-29 07:55:21,490 PDT ERROR [org.jboss.resource.adapter.jms.inflow.JmsServerSe
javax.ejb.EJBException: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: Cannot open connection
2012-10-15 08:57 AM
We have logged an internal customer case and are looking into this. We will reply with our findings as soon as possible.
I have also added a feature request for notifications on acquisition issues (potentially triggered by no updates within a certain window of time, or by a number of failures within a certain timeframe).
Thank you for your patience.