Small data: workflow, long transactions and DB2

Gerbrand van Dieijen

With all the big-data postings, now something about traditional SQL, running on DB2 9. We had created a web application that was basically a view on a database. The application displayed the contents of the database after the user entered search parameters, or everything when no parameters were entered. The database contained a few hundred thousand records. All in all very simple.
The software worked fine in our test environment. But when we deployed it to production, it would hardly ever display data. In the logs we found frequent locking errors like the following:

15:23 ERROR (org.hibernate.engine.jdbc.spi.SqlExceptionHelper) - DB2 SQL Error: SQLCODE=-911, SQLSTATE=40001, SQLERRMC=68, DRIVER=4.1

Searching the web revealed what that error meant: a deadlock or timeout had occurred. We knew the same database was also accessed by another application: Activiti, running processes written in BPMN. The processes in the Activiti engine only inserted a few records per minute, so we didn't expect that application to have much influence. Yet how could a few inserts per minute have such a big impact?
After some talking with a DBA and doing our own experimentation we found the problem, which we'll describe here.

As said, the other application was an Activiti engine that executed processes (to be exact: process definitions in BPMN). At several steps in a process, Java code was executed via Activiti's delegate mechanism. That Java code would insert data into our database.


A DB2 DBA told us that when you insert a new record during a database transaction, that record is locked until you release the transaction by issuing a commit or rollback. Of course we already knew that. However, he also told us that other transactions could still be affected by that lock: a SELECT with no WHERE clause, or a broad one, would have to wait until the first transaction finished. The reason: DB2 locks not just the record, but part of the tablespace the record happens to be inserted into.

While inserting the data itself took very little time, the execution of the whole process could take 5 to 20 seconds. The reason for the long duration: some of the steps involved invoking other webservices, which were not always quick to respond. Since a commit was only performed at the end of the process, a lock on part of the database existed for up to 20 seconds. This meant our console couldn't return any data while the other process was still busy, as shown below:

Time | Process a(ctiviti)                 | Process c(onsole)     | Remark
1    | INSERT something INTO mytable      |                       | process a is good to go
2    |                                    | SELECT * FROM mytable | process c has to wait for process a
3    | INSERT something INTO myothertable |                       | process c still has to wait
4    | commit                             |                       | process a commits, finally process c can continue
5    |                                    | commit                | process c is now also finished
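The blocking in this timeline can be mimicked with plain Java threads, purely as an analogy: a ReentrantLock stands in for DB2's lock on the inserted range, held until the writing "transaction" commits. The class and event names below are invented for illustration; this is not DB2 or Activiti code.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.locks.ReentrantLock;

public class LockTimeline {

    // Runs the two "transactions" and returns the order in which events happened.
    public static List<String> run() throws InterruptedException {
        ReentrantLock tableLock = new ReentrantLock();      // stands in for DB2's lock on the inserted range
        List<String> events = new CopyOnWriteArrayList<>();

        Thread activiti = new Thread(() -> {
            tableLock.lock();                               // INSERT: lock taken, held until commit
            events.add("a: INSERT");
            sleep(500);                                     // slow webservice calls inside the process
            events.add("a: commit");
            tableLock.unlock();                             // commit releases the lock
        });

        Thread console = new Thread(() -> {
            sleep(100);                                     // console query arrives while a's transaction is open
            events.add("c: SELECT issued");
            tableLock.lock();                               // broad SELECT must wait for the open transaction
            events.add("c: SELECT returns");
            tableLock.unlock();
        });

        activiti.start();
        console.start();
        activiti.join();
        console.join();
        return events;
    }

    private static void sleep(long millis) {
        try { Thread.sleep(millis); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());
    }
}
```

Running this shows the console's SELECT only returning after the slow "process" has committed, just as in the table above.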

We verified this by searching in our console on primary keys. Then the console responded immediately, as expected: in that case only a lock on a small part of the table is involved.

To get around the problem, we decided to commit from within each delegate, basically after each step in the process. Of course we checked with the product owner first: for consistency, having some data visible during the execution of the process wasn't a problem.
DB2 doesn't support nested transactions, so in our delegate a new database session had to be opened (and closed):

EntityManagerFactoryImpl entityManagerFactory = (EntityManagerFactoryImpl) Persistence
    .createEntityManagerFactory(properties.getProperty("myunitname"));
SessionFactory sessionFactory = entityManagerFactory.getSessionFactory();
Session session = sessionFactory.openSession();
session.getTransaction().begin();
...
session.getTransaction().commit();
session.close();

Note: if you use Spring transactions, Spring will handle the above boilerplate for you.

Now locks were held for a much shorter time. After this change, our console behaved normally.
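The effect of committing per step can be illustrated with a small toy model (not Hibernate or DB2; all names are hypothetical): uncommitted inserts stay in a pending buffer, and only commit() makes them visible to a reader such as the console. This also shows why a flush alone doesn't help, since flushed-but-uncommitted rows remain invisible to others.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of commit-per-step: each delegate writes into a pending buffer,
// and only commit() makes the rows visible to other readers (like the console).
public class StepwiseCommit {

    static class ToyTable {
        private final List<String> visible = new ArrayList<>();  // what a SELECT sees
        private final List<String> pending = new ArrayList<>();  // uncommitted inserts

        void insert(String row) { pending.add(row); }
        void commit()           { visible.addAll(pending); pending.clear(); }
        List<String> select()   { return List.copyOf(visible); }
    }

    public static void main(String[] args) {
        ToyTable table = new ToyTable();

        // Step 1 of the process: insert and commit immediately.
        table.insert("order-1");
        table.commit();

        // While step 2 is still running (e.g. waiting on a slow webservice),
        // the console already sees step 1's data.
        table.insert("order-2");                 // not yet committed
        System.out.println(table.select());      // prints [order-1]

        table.commit();
        System.out.println(table.select());      // prints [order-1, order-2]
    }
}
```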
(Diagram: Activiti with commit)

The behavior of the process shown above was in fact what we wanted in the first place. After each write, a flush statement already existed in the original code. On DB2 that statement would indeed cause the data to be written to disk, but the data wouldn't be visible to other transactions until the commit was performed.

Comments (3)

  1. Francisco Ripoli

    March 6, 2013 at 4:53 pm

    Couldn't that be achieved changing the isolation level setting on DB2?

    • Gerbrand van Dieijen

      March 6, 2013 at 10:02 pm

      Changing the isolation level was mentioned by the DBA, but we decided not to use that: then all changes to the database would be visible immediately, and would disappear in case a rollback was done. That would destroy the whole idea of using database transactions in the first place.

  2. ilx

    March 12, 2013 at 12:10 am

    What version of db2 do you use? It really makes a difference.

    What was the default isolation level (CS or RS)?
    Usually the default is RS (REPEATABLE_READ), but for most purposes you want CS (READ_COMMITTED). Other than that, v9.7 has improved cur_commit for the CS isolation level. See http://pic.dhe.ibm.com/infocenter/db2luw/v9r7/topic/com.ibm.db2.luw.admin.perf.doc/doc/c0053760.html for more details. With this feature the concurrency behaviour is the same as in Oracle.
    Here's the link to the article that explains cur_commit: http://www.ibm.com/developerworks/data/library/techarticle/dm-0907oracleappsondb2/

    Also, there is the DB2_SKIPINSERTED variable that is usually used to improve database concurrency.
