For our project we use Hibernate. The application we are building reads work items from the database, processes them (validation) and writes the results back to the database; a typical data processing application. Optimally the process would be streaming, a gigantic select would be used to fetch millions of rows and process each row in a transaction (The processing of a row results in several DML statements). Now there are technical obstacles to implementing the application in this fashion. The RDBMS should be able to process millions of short transactions, while keeping the long transaction that reads the rows alive. Oracle cannot handle this, due to its read consistency functionality. After quite some time a ORA-01555: snapshot too old (rollback segment too small) will inevitably crash the long running transaction. Our implementation divides the gigantic select in smaller chunks, to prevent the "snapshots" from getting "too old".
Pfff, the first obstacle was out of the way. Next problem: Hibernate. We chose Hibernate as our ORM solution, because... because... we were all already familiar with it. Which is the lamest excuse in the world and will mostly lead to solving the wrong problem with the wrong tool. The problem with Hibernate in this situation is at the same time one of its main features. To support transactional write-behind Hibernate keeps track of all objects loaded in its Session. During the batch processing all work items get loaded in the session. The memory associated with this isn't the biggest problem. Whenever a Session is flushed, Hibernate will inspect each associated object to look for changes and write these to the database. If the session gets big, flushing it will take more and more time, even if there are no changes, as is the case with this long read transaction. The solution for this is to evict the objects from the session as soon as possible. Since we read the objects from a ScrollableResults each result is loaded separately. We wrap this results object in an EvictingIterator that will evict every work item from the session. When using this approach, one has to be very careful to evict all objects, also the objects that are loaded by cascading. Luckily, 'evict' is a cascade option of Hibernate, so in the mapping files specify cascade='evict' on all associations that are loaded and... presto!
Now let's take a step back: what problem have we solved here? Using Hibernate to do the gigantic select to fetch the objects from the database has not helped us a single bit. Quite the opposite; we have to work around Hibernate's Session to make it work! This is exactly how a Golden Hammer can lead you astray. Instead of solving your problems, it leads to more problems that are solution specific. So after making it all work with Hibernate, we decided to invest some time in trying to find another solution. Our first attempt was Ibatis.
We decided to invest a week on the Ibatis PoC. We ended up spending most of the time cleaning up our configuration, code and tests, but finally we got to the actual Ibatis code. This is what we found:
The ProofOfConcept... disproven!
We expected some improvements from using Ibatis:
- Less code - No more working around Hibernate's session.
- Less SQL - Most of the queries are SQL queries of which the results are interpreted by Hibernate. This is because they contain query hints and constructs that are not easily reproduced by HQL or criteria. This leads to a lot of duplication, since some logical constructs (joins, restrictions) are used in several queries. Ibatis allows for statement-fragment-sharing.
We started out defining an insert statement for one of the objects. We use sequences to fill in the primary key in the database. With Hibernate we use the HiLoSequenceGenerator, to decrease the load on the database when inserting a lot of data. How would we implement this in Ibatis? Well, Ibatis has selectKey properties to generate the primary key of a newly inserted object. The hi/lo mechanism is a very smple extension that can be applied to any key generation mechanism. It simply applies a lineair transformation to the generated key, resulting in a whole bunch of keys. How can we implement this in Ibatis. Well, we could simply extend the SelectKeyStatement. Oke, but then we need Ibatis to use our own implementation, rather than the normal SelectKeyStatement implementation. So where is the object created? Is the factory pattern used? Is there a plug-in or extension point? Nope! The SelectKeyStatement is directly instantiated when reading the configuration file.
That's a bit disappointing! Oh well, maybe we're thinking too 'Hibernate-y' here. We could just forget about hi/lo, reset our sequences to the exact next value we need and then use them at every new insert. While we're at it, what extension points does Ibatis have? Well, browsing the sources we find... three extensions in the client! The engine does not have any extensions. This looks like a serious drawback. In my experience, ORM solutions are never a 100% fit. There's always this one table/object mapping that needs a few tweaks to get it to work correctly. A framework that offers such limited possibilties for extensions renders itself useless.
Oke, let's not give up that easily. We could still try and see how far we can get with Ibatis as it is. We only need to execute a handful of different SQL queries on the database and the result of each query should be mapped to a few objects. That should be possible. Let's write a test that will use a sequence and then insert an object. We use HSQLDB for our testing. It's fast, easy to use and can be run on all developer and testing platforms. So now we need to create an "insert"-statement in our SQL map configuration. The insert statment is simple; it's standard SQL:
INSERT INTO TABLE (COL1, COL2, COL3, ...) VALUES (?, ?, ?, ...)
This will work on any law-abiding database system. Now for the "selectKey"-statement... On Oracle it's something like
SELECT SEQUENCE.NEXTVAL FROM DUAL
On HSQLDB it's quite vague. A value from a sequence can only be selected in a normal "SELECT" statement. The statement should be executed on a table (or table-like structure) as specified in the "FROM"-clause. The statement can have a "WHERE"-clause with restrictions as well. A sequence number will then be selected for each row in the resultset that results from the statement. The following statement will thus result in a sequence number for each row in TABLE.
SELECT NEXTVALUE FOR SEQUENCE FROM TABLE
To fetch just a single sequence number, the statement should result in exactly one result. Thus, the table should hold only one row. This is exactly how Hibernate's HSQLDialect works with sequences:
- It creates a sequence
- It creates a table
- It inserts a single record in the table
- It uses the table to select next values for the sequence
That's a lot of work. Anyway, the main problem is not the work, but the fact that HSQLDB, the test RDBMS, handles sequences in a completely different manner than Oracle, the production RDBMS. Now, how can we configure Ibatis to run on a different RDBMS? Oh, you can't. Ibatis only knows about statements and only a single version of a statement, no database specific versions. Umpf, another big disappointment. So we'd have to make a separate version of the SQL Map configuration for testing, in the process defying the whole purpose of testing the DAO layer of our application. So the fact that Hibernate has Dialects that can be swapped in and out might be clouding our judgement, but we think this seriously compromises the testability of an application that uses Ibatis for its DAO layer. Testing against Oracle is very slow and hard to implement on all platforms (we'd have to install Oracle everywhere).
So after a few hours of trying to convert from Hibernate to Ibatis we have found two major disappointments: low extendability and low testability. We were unable to check whether Ibatis would solve the problems that we currently experience. We also couldn't check the expected improvements. So it's back to our good old golden hammer: Hibernate! There might be other ORM frameworks that we could investigate. There is JDO and TopLink, but these advanced solution will probably have the same problem as Hibernate: unused features that we have to work around to make it all work. We could of course invent our own framework, with exactly the features we need for this application, but... err... would that still be a framework? Perhaps our experience shows that Hibernate is just as good as it gets