Saturday, November 22, 2008

Transaction Timeouts and EJB3/JPA

I recently discovered somewhat odd behavior in the way transactions are timed out in my EJB3/JPA app. It was surprising to me how it worked, and though I guess it makes sense when thinking about it, something here still doesn't "feel right".

Consider an EJB3/JPA application deployed in JBossAS 4.2 that uses the Arjuna's Transaction Manager (aka JBossTM - although I still call it Arjuna :) and Hibernate as the JPA implementation.

Annotating an EJB3 stateless session bean (SLSB) method, is this:

@org.jboss.annotation.ejb.TransactionTimeout(60)

If my method takes longer than 60 seconds, the transaction it is running in will timeout, and thus rollback. But, what happens if my method is executing a very long running SQL update? Or, maybe my method is simply writing out a very long data file to a remote file system. Or, maybe I'm just stupid and my method is executing "Thread.sleep(30 * 60 * 1000)"? In other words, what happens if my method takes longer than this transaction timeout?

What I was assuming would happen is the thread running my method would get interrupted, and (assuming the JDBC driver or file IO subsystem or Thread.sleep or whatever my method was doing at the time can handle the interrupt) the method would abort with that interrupted exception and immediately begin the rollback procedures. It would do this because once the method exits, the EJB3 interceptor chain would begin to unroll and eventually hit the transaction manager interceptors whose job it is to abort the transaction.

That is close to what happens, but not exactly.

It turns out, the transaction manager detects that the transaction has timed out and aborts it from within an asynchronous thread (different from the thread running my method), but it never sends an interrupt to my method. So, my method continues on even after the timeout period and only when it returns will that interceptor chain finally get a chance to abort the transaction. But by this time, its already been aborted! The interceptor kindly tells me this in a log message, but it's late to the party - the transaction has already been rolled back. This doesn't actually cause any harm because the transaction manager simply says, "this has already been aborted and the rollback was performed earlier, I'll just log a warning and skip the abort procedure".

But, I question why my method was not given a chance to abort also. It is possible that my method could run for 60 minutes, do everything completely successfully, yet, because it exceeded the transaction timeout the entire transaction was rolled back and that 60 minutes worth of work now becomes wasted.

Of course, the answer to that would be, "just set your transaction timeout to a higher value". But that's not really my point. My point is, why is my thread even allowed to waste its time when the transaction manager knows the timeout has expired and the transaction has been rolled back? I think it should at least attempt to warn the thread about the situation by sending an interrupt to it (at that point, it would be my method's job to handle the thrown InterruptedException).

I've tested this to see if an interrupt is sent and I don't see it. See below for the log messages I see after running some test code. I had an SLSB method annotated with a transaction timeout of 2 seconds, and in my SLSB method, I block within a "Thread.sleep(10000)" call (so it pauses in my method for 10 seconds). Take note of the log timestamps and notice that my sleep is allowed to return normally after 10 seconds, but after the 2nd second, you see the transaction manager aborted my transaction! Therefore, my method was allowed to continue until it returned normally - at which time, the interceptor chain tried to abort the transaction a second time:


2008-11-23 01:09:33 INFO [STDOUT] !!!!!!!!!BEFORE SLEEP
2008-11-23 01:09:35 WARN [com.arjuna.ats.arjuna.logging.arjLoggerI18N]
[com.arjuna.ats.arjuna.coordinator.BasicAction_58] - Abort of action
id a0b0c21:ab6:4928f32d:1225 invoked while multiple threads active within it.
2008-11-23 01:09:35 WARN [com.arjuna.ats.arjuna.logging.arjLoggerI18N]
[com.arjuna.ats.arjuna.coordinator.CheckedAction_2] - CheckedAction::check
- atomic action a0b0c21:ab6:4928f32d:1225 aborting with 1 threads active!
2008-11-23 01:09:43 INFO [STDOUT] !!!!!!!!!AFTER SLEEP
2008-11-23 01:09:43 WARN [com.arjuna.ats.arjuna.logging.arjLoggerI18N]
[com.arjuna.ats.arjuna.coordinator.BasicAction_40] - Abort called on
already aborted atomic action a0b0c21:ab6:4928f32d:1225
2008-11-23 01:09:43 ERROR [test.slsb] Failed. Cause:
java.lang.IllegalStateException: [com.arjuna.ats.internal.jta.transaction.arjunacore.inactive]
[com.arjuna.ats.internal.jta.transaction.arjunacore.inactive] The transaction is not active!
java.lang.IllegalStateException: [com.arjuna.ats.internal.jta.transaction.arjunacore.inactive]
[com.arjuna.ats.internal.jta.transaction.arjunacore.inactive] The transaction is not active!
at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1372)
at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:135)
at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:87)
at org.jboss.aspects.tx.TxPolicy.endTransaction(TxPolicy.java:175)
at org.jboss.aspects.tx.TxPolicy.invokeInOurTx(TxPolicy.java:87)
at org.jboss.aspects.tx.TxInterceptor$RequiresNew.invoke(TxInterceptor.java:262)
at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
at org.jboss.aspects.tx.TxPropagationInterceptor.invoke(TxPropagationInterceptor.java:76)
at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
at org.jboss.ejb3.stateless.StatelessInstanceInterceptor.invoke(StatelessInstanceInterceptor.java:62)
at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
at org.jboss.aspects.security.AuthenticationInterceptor.invoke(AuthenticationInterceptor.java:77)
at org.jboss.ejb3.security.Ejb3AuthenticationInterceptor.invoke(Ejb3AuthenticationInterceptor.java:106)
at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
at org.jboss.ejb3.ENCPropagationInterceptor.invoke(ENCPropagationInterceptor.java:46)
at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
at org.jboss.ejb3.asynchronous.AsynchronousInterceptor.invoke(AsynchronousInterceptor.java:106)
at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
at org.jboss.ejb3.stateless.StatelessContainer.localInvoke(StatelessContainer.java:214)
at org.jboss.ejb3.stateless.StatelessContainer.localInvoke(StatelessContainer.java:184)
at org.jboss.ejb3.stateless.StatelessLocalProxy.invoke(StatelessLocalProxy.java:81)
...


It is also interesting to note that the transaction manager knows that a thread is still running my method - see its log message: "aborting with 1 threads active".

This is worrisome to me because I could run into a condition where my SLSB methods might waste a whole lot of time processing a query or doing other work well after my timeout has triggered the rollback (unless I set my timeouts to some really high value that I know will never expire but then, what's the point of the timeout?). I would rather have my method be interrupted so I don't waste the system resources doing work that is just going to be (or has already been) rolled back. But, for now, without any other options that I know of, I am just going to have to make sure I set my timeouts to some very large values for those methods that may (but most times may not) take a really long time to complete.

I wonder if there is some configuration property I can set on the transaction manager to tell it, "interrupt any active thread that may be running within a transaction that has timed out". I don't know of any such thing, but it just seems like this would be an obvious thing to want to do, so I would not be surprised at all if it turns out that I am just missing a piece of the puzzle. Feel free to comment on this blog if you know of a way to work around this issue or to let me know if there is something I am misunderstanding with respect to the way the transaction manager works (or can be configured to work).

As a side note, I was hoping I could prevent doing the extra, wasted work in the database by circumventing the JPA entity manager and using JDBC statements directly, thereby having access to the java.sql.Statement.setQueryTimeout API but, alas, the Postgres driver does not yet implement that.

15 comments:

  1. If you check arjuna documentation "Checked transaction behaviour" part at http://www.jboss.org/jbosstm/docs/4.2.3/manuals/html/jts/ProgrammersGuide.html#_Toc163466659
    you will see that they do not do anything special sue to multi-thread support. A little bit below they explain Jboss TS specifics:
    "Where support from the ORB is available, JBossTS supports X/Open checked transaction behaviour. However, unless the OTS_CHECKED_TRANSACTIONS property variable is set to YES this is disabled by default."
    and also
    "In a multi-threaded application, multiple threads may be associated with a transaction during its lifetime, i.e., the thread�s share the context. In addition, it is possible that if one thread terminates a transaction other threads may still be active within it. In a distributed environment, it can be difficult to guarantee that all threads have finished with a transaction when it is terminated. By default, JBossTS will issue a warning if a thread terminates a transaction when other threads are still active within it; however, it will allow the transaction termination to continue. Other solutions to this problem are possible, e.g., blocking the thread which is terminating the transaction until all other threads have disassociated themselves from the transaction context. Therefore, JBossTS provides the com.arjuna.ats.arjuna.coordinator.CheckedAction class, which allows the thread/transaction termination policy to be overridden. Each transaction has an instance of this class associated with it, and application programmers can provide their own implementations on a per transaction basis.

    public class CheckedAction
    {
    public CheckedAction ();

    public synchronized void check (boolean isCommit, Uid actUid,
    BasicList list);
    };

    When a thread attempts to terminate the transaction and there are active threads within it, the system will invoke the check method on the transaction�s CheckedAction object. The parameters to the check method are:

    � isCommit: indicates whether the transaction is in the process of committing or rolling back.

    � actUid: the transaction identifier.

    � list: a list of all of the threads currently marked as active within this transaction.

    When check returns, the transaction termination will continue. Obviously the state of the transaction at this point may be different from that when check was called, e.g., the transaction may subsequently have been committed.

    The CheckedAction instance associated with a given transaction is set using the setCheckedAction method of Current."

    ReplyDelete
  2. Thanks for the info! As promised, I really am not surprised that Arjuna has the ability to do this kind of thing. It's just hard to find it in the docs :)

    Alot of the TM docs I've found (the link you provided included) deal with the JBossTM/Arjuna as a separate product. For those people dealing with the embedded transaction manager co-located in the JBossAS app server, we need more docs related to that integration, especially with respect to configuration settings and extension points. Specifically, their docs here state, "The CheckedAction instance associated with a given transaction is set using the setCheckedAction method of Current." This probably makes sense to the person using the JBossTM API directly (e.g. the developers of JBossAS itself) - but I have no idea how this relates to my EJB3 app running inside of JBossAS. For example, how do I install my own implementation of CheckedAction inside of JBossAS such that it gets installed for those transactions whose threads I want to be interrupted (which may or may not be everywhere - there may only be one or a few SLSB methods in which I want this behavior)? I looked in conf/jbossjta-properties.xml but I see no mention of CheckedAction or related configuration.

    ReplyDelete
  3. Following up on Ramazanyich's info. I found what I want to do to be very easy. Just write a simple EJB3 interceptor - all it needs to do is call the static method:

    com.arjuna.ats.arjuna.coordinator.
    BasicAction.Current()

    to get the thread's current transaction. If Current() returns non-null, call setCheckedAction() on it, passing in an object of type "public class MyCheckedAction extends CheckedAction" where that object's check() method simply loops over the passed in list of threads and calls "interrupt()" on them before calling super.check().

    That's really all it is. Very small amount of code and it works like a charm - when a transaction is timed out, my SLSB method is notified via an InterruptedException. At that point, its up to my method to handle it, but at least I was notified that the transaction has timed out.

    ReplyDelete
  4. Here's the source code for an EJB3 interceptor that can set up a JBossTM CheckedAction class to interrupt threads on tx timeout. I added better logging, too - so you have the option to log the full stack trace of the active thread whose transaction timed out (this better logging occurs, even if you don't want to interrupt the thread).

    ReplyDelete
  5. Stumbled upon this today. Your TransactionInterruptInterceptor is interesting and helpful, but I am not sure how exactly calling interrupt on the transaction thread notifies your EJB. When I tried this (my EJB is plain, container-managed transaction by default since I am not specifying any transaction annotations), the interceptor called interrupt on the thread (since my EJB is annotated with @InterruptOnTransactionTimeout(value=true)), but nothing actually happened.

    Is there something else that needs to be done in your EJB to make it aware of thread-interrupts?

    (Using JBoss 4.2.2GA, EJB3)

    ReplyDelete
  6. kmudrick - you have to write your business method to handle the interrupt. You need to either check Thread.isInterrupted or if you are calling something that throws InterruptedException you must catch it and abort.

    Otherwise, the thread will get its interrupt flag set, but it just doesn't do anything with it. Its up to the thread to do something with it.

    ReplyDelete
  7. Ahh but what if my business method times out in the middle of a long running blocking call (say.. first call to ResultSet.next on an expensive sql query)?

    I can check Thread.interrupted() after that blocking call (and it will return true of course in this instance.) But I still had to sit there waiting for that call to complete, even if the transaction timed out long before. Since in this instance I am just doing your typical sql query, I'm not calling anything that currently explicitly throws InterruptedException.

    Hopefully that made sense. I'm mostly concerned with being able to give up on my potentially resource-intensive method as soon as possible.

    ReplyDelete
  8. > "I'm mostly concerned with being able to give up on my potentially resource-intensive method as soon as possible"

    You and me both! :) If you find a good solution, let me know!

    ReplyDelete
  9. In jboss-service.xml , for the mbean
    org.jboss.tm.TransactionManagerService , set the InterruptThreads attribute to true. Beware this is a global settings.

    ReplyDelete
  10. Vinit,

    That setting does not apply to recent versions of JBoss AS (e.g. 4.2.x, 5).

    ReplyDelete
  11. Very interesting. How should I go about installing this CheckecAction in all my SLSB? Is there another way other that adding the annotation to every single one? I am interested in getting a stack trace, to figure out what was method was taking too long.

    Thanks.

    ReplyDelete
  12. Alex - in ejb-jar.xml file, we have this:

    <interceptor-binding>
    <ejb-name>*</ejb-name>
    <interceptor-class>org.rhq.enterprise.server.common.TransactionInterruptInterceptor</interceptor-class>
    </interceptor-binding>

    Thus all our EJB SLSBs get this interceptor injected without needing to annotate all of them individually. The CheckedAction class is defined in that class TransactionInterruptInterceptor - click that link for its source.

    ReplyDelete
  13. Thank you very much John. Will give it a try now.

    ReplyDelete
  14. Thanks for this article and comments, very helpful.

    If your database driver supports it, you could use PreparedStatement.setQueryTimeout(seconds) before executing statement that can be slow, then the thread won't block for much longer than seconds (some overhead is to be expected according to http://forums.oracle.com/forums/thread.jspa?threadID=550257 )

    But this will only work on calls throught PreparedStatement (for me it's mainly plsql procedures, that are slow).

    ReplyDelete
  15. Lo bueno de que no se pare es que puedes determinar el tiempo que toma y configurar el timeout

    ReplyDelete