Thursday 26 November 2009

Transaction isolation levels, consenquences and configuration

1.     Introduction
 In this post I sum up the different isolation levels and how we can configure them in JBoss AS depending on the type of datasource defined for our JEE application.


2.     JDBC transaction isolation levels
Transaction isolation levels specify what data is visible to statements within a transaction. These levels directly impact the level of concurrent access by defining what interaction is possible between transactions against the same target data source.

READ UNCOMMITTED: allows a row changed by one transaction to be read by another transaction before any changes in that row have been committed (a "dirty read"). If any of the changes are rolled back, the second transaction will have retrieved an invalid row.

READ COMMITTED: prohibits a transaction from reading a row with uncommitted changes in it. This is the default level for most of databases.

REPEATABLE READ: prohibits a transaction from reading a row with uncommitted changes in it, and it also prohibits the situation where one transaction reads a row, a second transaction alters the row, and the first transaction rereads the row, getting different values the second time (a "non-repeatable read").

SERIALIZABLE: includes the prohibitions in TRANSACTION_REPEATABLE_READ and further prohibits the situation where one transaction reads all rows that satisfy a WHERE condition, a second transaction inserts a row that satisfies that WHERE condition, and the first transaction rereads for the same condition, retrieving the additional "phantom" row in the second read.

NONE: transactions are not supported.

Depending on your isolation level we can have three phenomena:

Dirty reads

1.     Transaction A inserts a row into a table.
2.     Transaction B reads the new row.
3.     Transaction A rolls back.
4.     Transaction B may have done work to the system based on the row inserted by transaction A, but that row never became a permanent part of the database.

Nonrepeatable reads

1.     Transaction A reads a row.
2.     Transaction B changes the row.
3.     Transaction A reads the same row a second time and gets the new results.

Phantom reads

1.     Transaction A reads all rows that satisfy a WHERE clause on an SQL query.
2.     Transaction B inserts an additional row that satisfies the WHERE clause.
3.     Transaction A re-evaluates the WHERE condition and picks up the additional row.


Isolation Level
Dirty Read
Nonrepeatable Read
Phantom Read
READ UNCOMMITTED
Permitted
Permitted
Permitted
READ COMMITTED
No Permitted
Permitted
Permitted
REPEATABLE READ
No Permitted
No Permitted
Permitted
SERIALIZABLE
No Permitted
No Permitted
No Permitted



3.     How to set transaction isolation level in JBoss AS 4.2.2
 As far as I know we can configure it within server/my_server/deploy/*-ds.xml (i.e oracle-ds.xml, mysql-ds.xml).

There are two types of datasources where we can define it:

local-tx-datasource: identifies a datasource that uses single phase commit transactions, therefore there is just one resource per JTA transaction (local transaction).

xa-tx-datasource: identifies a datasource that uses two phase commit transactions, therefore there are more than one resource per JTA transaction (global transaction).

Note: Use xa-tx-datasource when it is really needed. It is expensive in terms of performance.

The element to set is  

transaction-isolation: specifies the java.sql.Connection transaction isolation level to use. The constants defined in the Connection interface are the possible element content values and include:

TRANSACTION_READ_UNCOMMITTED
            TRANSACTION_READ_COMMITTED
TRANSACTION_REPEATABLE_READ
TRANSACTION_SERIALIZABLE
TRANSACTION_NONE

We can find further information in this link JBoss web site

12 comments:

  1. Depending on the database, there are some additions.

    Oracle supports another isolation level, called readonly. It provides the same semantics as the serialized isolation level but doesn't allow for updates or inserts. I guess the same goes for MySQL + InnoDB and postgresql since they all make use of MVCC (Multiverse Concurrency Control).

    And with the SERIALIZED isolation level, you would expect that you get a behavior that can only be explained by a serialized execution if transactions, in Oracle this doesn't have to be the case.

    If you look for 'Breaking Oracle SERIALIZABLE' you will find a post explaining how this can happen.

    And using Oracle (and Postgresql/MySQL + InnoDb) not all isolation levels are supported (for example dirty read and repeatable read) and are automatically upgraded to a higher one.

    In a lot of situations isolation levels need to be combined with custom locking like the 'select for update' to get the behaviour you really want to have.

    ReplyDelete
  2. Good point.
    The isolation levels depending on the database that we use some of them are or not supported.

    ReplyDelete
  3. In the next post, I want to comment the optimistic and pesimistic locking.

    ReplyDelete
  4. In most cases (all?) they are automatically upgraded, so nothing to worry about I think.

    ReplyDelete
  5. Cool, I'll make sure to read it.

    Concurrency control is one of my favourite subjects and how it is implemented and databases or how it should be used on databases on application level is very challenging.

    ReplyDelete
  6. I would really appreciate all your comments about that. I recognize that I have a lack on that.

    I believe that the 90% jee applications nowadays dont take into account this issues due to the ignorance of the architects.

    Do you think that it could be good idea talk about optimisic / pesimistic and how JPA face the issue with @Version annotation?.

    ReplyDelete
  7. Good post, I always forget what these mean :-) Regarding locking, make sure to mention row and table locking differences (e.g. MySQL MyISAM vs InnoDB engines)

    ReplyDelete
  8. The general approach a lot of architects take is:
    - my middleware is going to take care of it
    - if I can't see it, it doesn't exist.

    In the beginning I felt ashamed that I didn't understand how it works.. But after some time I realized that almost nobody understands it. It is very complex not only because concurrency is complex, but also because one needs to know the implementation specifics of the database; what works well in one, doesn't work well in another.

    Only when things go wrong, some ad hoc solation is added.

    ReplyDelete
  9. I agree that each database has its own isolation level and the way to work with it.

    But on the hand if we need to control somehow the concurrency we need to use some approach to avoid
    dirty reads, nonrepeatable reads and phantom reads. I am not expert but the middleware or whatever we use do 100% for us?.

    ReplyDelete
  10. Ping back
    http://www.sqlfundas.com/post/2009/11/26/Database-Lock-Block-Dead-Lock-e28093-What-is-it-Why-is-it-What-to-do-about-it-e28093-Part-1.aspx

    ReplyDelete
  11. I think oracle is the only database which has a concept of mutiversion read concurrency.This means oracle doesn't lock writers while reading and viceversa. A select * in oracle will not place any locks to other transactions updating or reading from same table in READ_COMMITED isolation level but all other db's will place locks in this situation.

    ReplyDelete
  12. How do i set Transaction Isolation level in JBoss EJB CMP and BMP ?

    ReplyDelete