Why did Hibernate update my database?
Hibernate is a sophisticated ORM framework that will manage the state of your persistent data for you. Handing over the important but difficult task of managing the persistent state of your application to a framework has numerous advantages, but one of the disadvantages is that you lose some control over what happens where and when. One example of this is the dirty checking feature that Hibernate provides. By doing dirty checking, Hibernate determines what data needs to be updated in your database. In many cases, this feature is quite useful and will work without any issues, but sometimes you might find that Hibernate decides to update something that you did not expect. Finding out why this happened can be a rather difficult task.
I was asked to look into an issue with a StaleObjectStateException the other day. Hibernate throws a StaleObjectStateException to signal an optimistic locking conflict: while some user (or process) tries to save a data item, that same data item has already been changed in the underlying database since it was last read. Now the problem was that the process that was throwing the exception was the only process that was supposed to change that data. From a functional point of view there could not have been any other user or process that changed the data in the meantime. So what was going on?
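To make the conflict concrete, here is a minimal sketch of version-based optimistic locking in plain Java. It is a toy model, not Hibernate code: the class `OptimisticLockSketch`, the `StaleRowException`, and the in-memory "table" are all made up for illustration, and it assumes the data is guarded by a version counter, which the story above does not spell out.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of version-based optimistic locking. Not Hibernate code:
// every class and method name here is made up for illustration.
class OptimisticLockSketch {

    /** Hypothetical stand-in for Hibernate's StaleObjectStateException. */
    static class StaleRowException extends RuntimeException {
        StaleRowException(String message) {
            super(message);
        }
    }

    /** One row: a value plus the version counter that guards it. */
    static final class Row {
        final String value;
        final int version;

        Row(String value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    // In-memory stand-in for a database table, keyed by id.
    private final Map<Long, Row> table = new HashMap<>();

    void insert(long id, String value) {
        table.put(id, new Row(value, 0));
    }

    Row read(long id) {
        return table.get(id);
    }

    /**
     * Roughly the effect of "UPDATE ... SET version = version + 1
     * WHERE id = ? AND version = ?": the write only succeeds if nobody
     * bumped the version since the caller read the row.
     */
    void update(long id, String newValue, int versionSeenByCaller) {
        Row current = table.get(id);
        if (current == null || current.version != versionSeenByCaller) {
            throw new StaleRowException("Row " + id + " was changed since it was read");
        }
        table.put(id, new Row(newValue, current.version + 1));
    }

    public static void main(String[] args) {
        OptimisticLockSketch db = new OptimisticLockSketch();
        db.insert(1L, "initial");

        Row seenByWriter = db.read(1L); // the process that is meant to write
        Row seenByReader = db.read(1L); // the process that should only read

        // The "read-only" process writes anyway and bumps the version.
        db.update(1L, "unexpected write", seenByReader.version);

        try {
            // The legitimate writer still holds the old version and fails.
            db.update(1L, "intended write", seenByWriter.version);
        } catch (StaleRowException e) {
            System.out.println("Conflict: " + e.getMessage());
        }
    }
}
```

The point of the sketch is that the failing process is not necessarily the one at fault: any earlier write, including an unintended one, is enough to make its copy stale.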
After digging around in the log for some time, we found that the data was being updated by another process that was supposed to only read it. Somehow Hibernate decided that the data read by that process had become dirty and should be saved. So now we had to find out why Hibernate thought that data was dirty.
Hibernate can perform dirty checking in several places in an application:
- When a transaction is being committed or a session is being flushed, obviously, because at that time changes made in the transaction or session should be persisted to the database
- When a query is being executed. To prevent missing changes that still reside in memory, Hibernate will flush data that might be queried to the database just before executing the query. It tries to be picky about this and not flush everything all the time, but only the data that might be queried.
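The second point explains how a process that only runs queries can still cause writes. A rough sketch of the decision, in plain Java: this is a simplified toy, not Hibernate's implementation. The names `AutoFlushSketch`, `recordPendingChange` and `needsFlushBefore` are hypothetical, and the sketch only models whether a flush is triggered, not exactly what is written.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the flush-before-query decision.
// Class and method names are made up; this is not Hibernate's API.
class AutoFlushSketch {

    // Tables that have changes waiting in memory, not yet written out.
    private final Set<String> tablesWithPendingChanges = new HashSet<>();
    private int flushCount = 0;

    void recordPendingChange(String table) {
        tablesWithPendingChanges.add(table);
    }

    /** A flush is only needed if the query could otherwise miss changes. */
    boolean needsFlushBefore(Set<String> queriedTables) {
        for (String table : queriedTables) {
            if (tablesWithPendingChanges.contains(table)) {
                return true;
            }
        }
        return false;
    }

    /** Runs a pretend query, flushing first when the check above says so. */
    void query(Set<String> queriedTables) {
        if (needsFlushBefore(queriedTables)) {
            flush();
        }
        // ... the actual query would be executed here ...
    }

    /** Commit or explicit flush: pending changes are written out. */
    void flush() {
        if (!tablesWithPendingChanges.isEmpty()) {
            flushCount++;
            tablesWithPendingChanges.clear();
        }
    }

    int flushCount() {
        return flushCount;
    }
}
```

In this model a query on an unrelated table leaves pending changes alone, while a query that touches a table with pending changes triggers a write first, which is why every query in a process is a potential place for an unexpected update.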
It is quite difficult to check all these places to find out where the data is being found dirty, especially when the process executes several queries.
To find out why Hibernate deems the data to be dirty, we have to dig into the Hibernate internals and start debugging the framework code. The Hibernate architecture is quite complex. There are a number of classes that are involved in dirty checking and updating entities:
- The DefaultFlushEntityEventListener determines which fields are dirty. The internals of this class work on the list of properties of an entity and two lists of values: the values as loaded from the database and the values as currently known to the session. It delegates finding out the 'dirty-ness' of a field to the registered Interceptor and to the types of the properties.
- The EntityUpdateAction is responsible for doing the update itself. An object of this type will be added to an ActionQueue to be executed when a session is flushed.
These classes show some of the patterns used in the internals of Hibernate: eventing and action queuing. These patterns make the architecture of the framework very clear, but they also make following what is going on sometimes very hard...
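The comparison at the heart of dirty checking can be sketched in a few lines of plain Java. This is a toy version, not the listener's real code: `DirtyCheckSketch` and `findDirty` are hypothetical names, and plain `Objects.equals` stands in for the per-property comparison that Hibernate delegates to the Interceptor and the property types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Toy version of a snapshot comparison; not the real listener code.
class DirtyCheckSketch {

    /**
     * Compares the values as loaded with the current values, position by
     * position, and returns the names of the properties that differ.
     */
    static List<String> findDirty(String[] propertyNames,
                                  Object[] loadedState,
                                  Object[] currentState) {
        List<String> dirty = new ArrayList<>();
        for (int i = 0; i < propertyNames.length; i++) {
            // Objects.equals handles null values on either side.
            if (!Objects.equals(loadedState[i], currentState[i])) {
                dirty.add(propertyNames[i]);
            }
        }
        return dirty;
    }

    public static void main(String[] args) {
        String[] names = {"name", "email", "version"};
        Object[] loaded = {"Alice", "alice@example.com", 3};
        Object[] current = {"Alice", "alice@example.org", 3};
        System.out.println(findDirty(names, loaded, current)); // prints [email]
    }
}
```

An empty result means no update is needed. Note that the sketch requires both value lists to be present; what happens when the loaded values are missing is a separate question.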
As previously explained, flushing happens quite often, so setting a breakpoint in the DefaultFlushEntityEventListener is usually not a good idea: it will get hit all the time. An EntityUpdateAction, however, is only created when an update is going to be issued to the underlying database. So to find out what the problem was, I set a breakpoint in its constructor and backtracked from there. It turned out Hibernate could not determine the dirty state of the object and therefore decided to update the entity just to be safe.
As mentioned earlier, Hibernate uses the "loaded state" to determine whether an object is dirty. This is the state of the object (the values of its properties) when it was loaded from the database. Hibernate stores this information in its persistence context. When dirty checking, Hibernate compares these values to the current values. When the "loaded state" is not available, Hibernate effectively cannot do dirty checking and deems the object dirty. The only scenario, however, in which the loaded state is unavailable is when the object has been re-attached to the session and thus not loaded from the database. The process I was looking into, however, did not work with detached data.
There is one other scenario in which Hibernate will lose the "loaded state" of the data: when the session is being cleared. This operation will discard all state in the persistence context completely. It is quite a dangerous operation to use in your application code and it should only be invoked if you are very sure of what you're doing. In our situation, the session was being flushed and cleared at some point, leading to the unwanted updates and eventually the StaleObjectStateExceptions. An unwanted situation indeed. After removing the clear, the updates were gone and the bug was fixed.
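The effect of a lost "loaded state" can be illustrated with a toy persistence context in plain Java. Everything here is hypothetical (`LoadedStateSketch`, `Item`, `reattach`); it is not Hibernate's API. The sketch also assumes that the object becomes associated with the session again after the clear, without a fresh load; how exactly that happened in the real application is not something this model tries to reproduce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy persistence context; names are hypothetical, not Hibernate's API.
class LoadedStateSketch {

    /** Minimal entity with a single persistent property. */
    static class Item {
        final long id;
        String name;

        Item(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Managed entity -> snapshot of its values at load time (null if unknown).
    private final Map<Item, Object[]> loadedState = new IdentityHashMap<>();

    /** Loading from the database records a snapshot of the values. */
    Item load(long id, String nameInDatabase) {
        Item item = new Item(id, nameInDatabase);
        loadedState.put(item, new Object[] {nameInDatabase});
        return item;
    }

    /** Re-attaching: the object is managed again, but no snapshot exists. */
    void reattach(Item item) {
        loadedState.put(item, null);
    }

    /** Clearing discards every managed object and every snapshot. */
    void clear() {
        loadedState.clear();
    }

    /** Returns the ids that a flush would issue an UPDATE for. */
    List<Long> flush() {
        List<Long> updates = new ArrayList<>();
        for (Map.Entry<Item, Object[]> entry : loadedState.entrySet()) {
            Item item = entry.getKey();
            Object[] snapshot = entry.getValue();
            Object[] current = {item.name};
            // Without a snapshot there is nothing to compare against,
            // so the entity is treated as dirty.
            boolean dirty = snapshot == null || !Arrays.equals(snapshot, current);
            if (dirty) {
                updates.add(item.id);
            }
        }
        return updates;
    }

    public static void main(String[] args) {
        LoadedStateSketch session = new LoadedStateSketch();
        Item item = session.load(42L, "widget");
        System.out.println(session.flush()); // prints []

        session.clear();        // the snapshot is gone
        session.reattach(item); // managed again, nothing to compare with
        System.out.println(session.flush()); // prints [42]
    }
}
```

The second flush reports an update even though `name` never changed, which is the same kind of "update just to be safe" described above.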
Using Hibernate can save a developer a lot of time when things are running smoothly. When a problem is encountered, however, a lot of specialized Hibernate knowledge and a considerable amount of time are often needed to diagnose and solve it.