Golden Hibernate

Maarten Winkels

For our project we use Hibernate. The application we are building reads work items from the database, processes them (validation) and writes the results back to the database; a typical data processing application. Optimally the process would be streaming, a gigantic select would be used to fetch millions of rows and process each row in a transaction (The processing of a row results in several DML statements). Now there are technical obstacles to implementing the application in this fashion. The RDBMS should be able to process millions of short transactions, while keeping the long transaction that reads the rows alive. Oracle cannot handle this, due to its read consistency functionality. After quite some time a ORA-01555: snapshot too old (rollback segment too small) will inevitably crash the long running transaction. Our implementation divides the gigantic select in smaller chunks, to prevent the "snapshots" from getting "too old".

Pfff, the first obstacle was out of the way. Next problem: Hibernate. We chose Hibernate as our ORM solution, because... because... we were all already familiar with it. Which is the lamest excuse in the world and will mostly lead to solving the wrong problem with the wrong tool. The problem with Hibernate in this situation is at the same time one of its main features. To support transactional write-behind Hibernate keeps track of all objects loaded in its Session. During the batch processing all work items get loaded in the session. The memory associated with this isn't the biggest problem. Whenever a Session is flushed, Hibernate will inspect each associated object to look for changes and write these to the database. If the session gets big, flushing it will take more and more time, even if there are no changes, as is the case with this long read transaction. The solution for this is to evict the objects from the session as soon as possible. Since we read the objects from a ScrollableResults each result is loaded separately. We wrap this results object in an EvictingIterator that will evict every work item from the session. When using this approach, one has to be very careful to evict all objects, also the objects that are loaded by cascading. Luckily, 'evict' is a cascade option of Hibernate, so in the mapping files specify cascade='evict' on all associations that are loaded and... presto!

Now let's take a step back: what problem have we solved here? Using Hibernate to do the gigantic select to fetch the objects from the database has not helped us a single bit. Quite the opposite; we have to work around Hibernate's Session to make it work! This is exactly how a Golden Hammer can lead you astray. Instead of solving your problems, it leads to more problems that are solution specific. So after making it all work with Hibernate, we decided to invest some time in trying to find another solution. Our first attempt was Ibatis.

We decided to invest a week on the Ibatis PoC. We ended up spending most of the time cleaning up our configuration, code and tests, but finally we got to the actual Ibatis code. This is what we found:

The ProofOfConcept... disproven!

We expected some improvements from using Ibatis:

  1. Less code - No more working around Hibernate's session.
  2. Less SQL - Most of the queries are SQL queries of which the results are interpreted by Hibernate. This is because they contain query hints and constructs that are not easily reproduced by HQL or criteria. This leads to a lot of duplication, since some logical constructs (joins, restrictions) are used in several queries. Ibatis allows for statement-fragment-sharing.

We started out defining an insert statement for one of the objects. We use sequences to fill in the primary key in the database. With Hibernate we use the HiLoSequenceGenerator, to decrease the load on the database when inserting a lot of data. How would we implement this in Ibatis? Well, Ibatis has selectKey properties to generate the primary key of a newly inserted object. The hi/lo mechanism is a very smple extension that can be applied to any key generation mechanism. It simply applies a lineair transformation to the generated key, resulting in a whole bunch of keys. How can we implement this in Ibatis. Well, we could simply extend the SelectKeyStatement. Oke, but then we need Ibatis to use our own implementation, rather than the normal SelectKeyStatement implementation. So where is the object created? Is the factory pattern used? Is there a plug-in or extension point? Nope! The SelectKeyStatement is directly instantiated when reading the configuration file.

That's a bit disappointing! Oh well, maybe we're thinking too 'Hibernate-y' here. We could just forget about hi/lo, reset our sequences to the exact next value we need and then use them at every new insert. While we're at it, what extension points does Ibatis have? Well, browsing the sources we find... three extensions in the client! The engine does not have any extensions. This looks like a serious drawback. In my experience, ORM solutions are never a 100% fit. There's always this one table/object mapping that needs a few tweaks to get it to work correctly. A framework that offers such limited possibilties for extensions renders itself useless.

Oke, let's not give up that easily. We could still try and see how far we can get with Ibatis as it is. We only need to execute a handful of different SQL queries on the database and the result of each query should be mapped to a few objects. That should be possible. Let's write a test that will use a sequence and then insert an object. We use HSQLDB for our testing. It's fast, easy to use and can be run on all developer and testing platforms. So now we need to create an "insert"-statement in our SQL map configuration. The insert statment is simple; it's standard SQL:

INSERT INTO TABLE (COL1, COL2, COL3, ...) VALUES (?, ?, ?, ...)

This will work on any law-abiding database system. Now for the "selectKey"-statement... On Oracle it's something like

SELECT SEQUENCE.NEXTVAL FROM DUAL

On HSQLDB it's quite vague. A value from a sequence can only be selected in a normal "SELECT" statement. The statement should be executed on a table (or table-like structure) as specified in the "FROM"-clause. The statement can have a "WHERE"-clause with restrictions as well. A sequence number will then be selected for each row in the resultset that results from the statement. The following statement will thus result in a sequence number for each row in TABLE.

SELECT NEXTVALUE FOR SEQUENCE FROM TABLE

To fetch just a single sequence number, the statement should result in exactly one result. Thus, the table should hold only one row. This is exactly how Hibernate's HSQLDialect works with sequences:

  • It creates a sequence
  • It creates a table
  • It inserts a single record in the table
  • It uses the table to select next values for the sequence

That's a lot of work. Anyway, the main problem is not the work, but the fact that HSQLDB, the test RDBMS, handles sequences in a completely different manner than Oracle, the production RDBMS. Now, how can we configure Ibatis to run on a different RDBMS? Oh, you can't. Ibatis only knows about statements and only a single version of a statement, no database specific versions. Umpf, another big disappointment. So we'd have to make a separate version of the SQL Map configuration for testing, in the process defying the whole purpose of testing the DAO layer of our application. So the fact that Hibernate has Dialects that can be swapped in and out might be clouding our judgement, but we think this seriously compromises the testability of an application that uses Ibatis for its DAO layer. Testing against Oracle is very slow and hard to implement on all platforms (we'd have to install Oracle everywhere).

Conclusion

So after a few hours of trying to convert from Hibernate to Ibatis we have found two major disappointments: low extendability and low testability. We were unable to check whether Ibatis would solve the problems that we currently experience. We also couldn't check the expected improvements. So it's back to our good old golden hammer: Hibernate! There might be other ORM frameworks that we could investigate. There is JDO and TopLink, but these advanced solution will probably have the same problem as Hibernate: unused features that we have to work around to make it all work. We could of course invent our own framework, with exactly the features we need for this application, but... err... would that still be a framework? Perhaps our experience shows that Hibernate is just as good as it gets

Maarten Winkels
Machiel Groeneveld

Comments (9)

  1. Okke Harsta - Reply

    October 2, 2006 at 11:27 pm

    Did you consider using the StatelessSession? 'http://www.hibernate.org/hib_docs/v3/api/org/hibernate/StatelessSession.html'

  2. Maarten Winkels - Reply

    October 3, 2006 at 10:25 am

    Good suggestion! We did consider Stateless sessions before, but discarded the idea, because the mapping capabilities are quite limited (no collections). We never realy tried to implement it though. There might be room for another PoC yet...

  3. Jan - Reply

    October 5, 2006 at 2:26 am

    "The engine does not have any extensions. This looks like a serious drawback. In my experience, ORM solutions are never a 100% fit. There's always this one table/object mapping that needs a few tweaks to get it to work correctly."

    I do not quite understand. The "TypeHandlerCallback" interface provides you with a means to make just those tweaks - or am I missing something?

    On a further note with respect to:

    "Oracle cannot handle this, due to its read consistency functionality. After quite some time a ORA-01555: snapshot too old (rollback segment too small) will inevitably crash the long running transaction."

    This might look like an obstacle to you, but this "read consistency" is functionality you cannot do without in a highly concurrent and highly transactional environment.

    I haven't done any testing from a single iBatis mapping file and multiple, different type databases. Maybe BerkeleyDB behaves 'better' with respect to sequences, or maybe there is a way to use one of the ways to get the sequence nextvalue using a Dynamic Mapped Statements.

    IMO you can only compare Hibernate to iBatis to a certain degree. iBatis is typically well suited for environments where database/SQL savvy developers want to map their records onto objects (ROM) whereas Hibernate is typically used in top-down scenarios - the classmodel taking precedence over the data model (ORM).

  4. Maarten Winkels - Reply

    October 5, 2006 at 8:48 am

    Jan, thanks for your remarks.

    I do agree that comparing Ibatis and Hibernate is a bit like comparing apples and oranges. The blog is by no means a comparision. We felt that we had applied our (golden) hammer Hibernate to the wrong problem and expected (or hoped?) that Ibatis would do a better job. In the end we were not even able to decide whether Ibatis would do a better job, because we saw serious issues with the framework itself (not in comparison with Hibernate) with, as stated above, testability and extendability, important features to any framework.

    The "TypeHandlerCallback" interface is indeed a means to tweak reading and writing objects from a ResultSet. This is one of the three client-extensions. I think they're rightly named "client", since they allow adaption of the outward behaviour of the framework. What I miss though is "enigine"-extensions that allow adapting the behaviour of Ibatis itself, like the possibility to tweak the behaviour and usage of the SelectKeyStatement, as suggested in the blog.

    With regard to the read consistency; this is indeed very important in many situations, but not in ours. It would be helpful if we could somehow toggle this behaviour, but regretfully Oracle does not have this feature and our only option is to read in small chunks.

  5. Robin Bygrave - Reply

    November 13, 2006 at 10:14 am

    - If you have Oracle10 then you can use getGeneratedKeys rather than selecting the nextval.

    - If you are not going to use the Objects for update afterwards you can include the seq.nextval in the insert statement in Oracle9 as well.

    I have built a persistence layer called "Ebean" which you can check out at http://www.avaje.org. There I also raise the Architectural and API issues of EJB3.

    I certainly think Ebean would be a good fit for your problem but I'm a little biased 🙂

    Cheers, Rob.

  6. S12o - Reply

    March 12, 2007 at 12:53 pm

    About HSQL.

    try to use
    SELECT TOP1 NEXTVALUE FOR SEQUENCE FROM TABLE
    instead of
    SELECT NEXTVALUE FOR SEQUENCE FROM TABLE

    where TABLE - name of any table in the DB (for example it can be any system table like INFORMATION_SCHEMA.SYSTEM_COLUMNS) and SEQUENCE - the name of your sequance.

    it's worked

  7. Amir Pashazadeh - Reply

    May 1, 2007 at 10:20 pm

    Why don't you user StatelessSession in Hibernate?
    It just don't keep references to other objects, so it can not determine whether an object is modified or not. You must explicitly call update methods.

  8. eceppda - Reply

    June 12, 2007 at 9:33 pm

    I take issue with the basic assumption that a giant select grabbing millions of rows is a good idea.
    Batch processing is a common paradigm for updating millions of rows in a database, and would alleviate the issue with stale data.
    There are a multitude of ways to handle this, even the basic ResultSet was designed with this in mind (caching a set number of rows at a time as opposed to pulling millions of rows into memory).

    I agree that the HSQLDB method of using sequences is vague, but an easy solution is not terrible difficult- override the HSQLDialect. getSequenceNextValString() method to return an SQL statement of your choice.

Add a Comment