JPA implementation patterns: Lazy loading

Vincent Partington

In the previous three blogs about JPA implementation patterns, I covered the basic operations of saving entities, retrieving entities, and removing entities. In this blog I will continue along a different angle, exploring the subject of how entities are lazily loaded and how that affects your application.

Anybody who has been working with Hibernate for a while has probably seen a LazyInitializationException or two, usually followed by a message such as "failed to lazily initialize a collection of role: com.xebia.jpaip.order.Order.orderLines, no session or session was closed" or "could not initialize proxy - no Session". Even though these messages may baffle new users of Hibernate, they are a lot better than the NullPointerExceptions OpenJPA gives you in these cases (at least when using runtime bytecode enhancement).

To use JPA to its full potential it is imperative to understand how lazy loading works, as it allows you to model your complete database with all its relations without loading that whole database as soon as you access just one entity.

Because of this it is unfortunate that the JPA 1.0 specification doesn't cover this subject in more depth than a few sentences along the lines of:

The EAGER strategy is a requirement on the persistence provider runtime that data must be eagerly fetched. The LAZY strategy is a hint to the persistence provider runtime that data should be fetched lazily when it is first accessed. The implementation is permitted to eagerly fetch data for which the LAZY strategy hint has been specified.

As of this writing, the proposed final draft of the JPA 2.0 specification does not add anything in this department. The best we can do for now is read the documentation of our JPA provider and do some experiments.

When lazy loading can occur

All @Basic, @OneToMany, @ManyToOne, @OneToOne, and @ManyToMany annotations have an optional parameter called fetch. If this parameter is set to FetchType.LAZY it is interpreted as a hint to the JPA provider that the loading of that field may be delayed until it is accessed for the first time:

  • the property value in case of a @Basic annotation,
  • the reference in case of a @ManyToOne or a @OneToOne annotation, or
  • the collection in case of a @OneToMany or a @ManyToMany annotation.

The default is to load property values eagerly and to load collections lazily. Contrary to what you might expect if you have used plain Hibernate before, references are loaded eagerly by default.

Before discussing how to use lazy loading, let's have a look into how lazy loading may be implemented by your JPA provider.

Build-time bytecode instrumentation, run-time bytecode instrumentation and run-time proxies

To make lazy loading work, the JPA provider has to do some magic to make the objects that are not there yet appear as if they are there. There are a number of different ways JPA providers can accomplish this. The most popular methods are:

  • Build-time bytecode instrumentation - The entity classes are instrumented just after they have been compiled and before they are packaged to be run. The advantage of this approach is that bytecode instrumentation can provide the best performance and the least leaky abstraction. A disadvantage is that it requires you to change your build procedure and is (therefore) not always compatible with IDEs. Also, the instrumented classes may be binary incompatible with their uninstrumented versions, which could cause Java serialization problems and the like, but this is not something I have heard anybody mention as a problem yet.
  • Run-time bytecode instrumentation - Instead of instrumenting the entity classes at build-time, they can also be instrumented at run-time. This requires installing a Java agent using the -javaagent option from JDK 1.5 and upwards, using class retransformation when running under JDK 1.6 or a later version, or some proprietary method if you are using an older JDK. So while this method does not require you to modify your build procedure, it is very specific to the JDK you are using.
  • Run-time proxies - In this case the classes are not instrumented but the objects returned by the JPA provider are proxies to the actual entities. These proxies can be dynamic proxy classes, proxies that have been created by CGLIB, or proxy collection classes. While requiring the least setup, this method is the least transparent of the ones available to JPA implementors and therefore requires you to know the most about how it works.
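To make the run-time proxy idea a bit more concrete, here is a minimal sketch in plain Java. It uses a JDK dynamic proxy (java.lang.reflect.Proxy) and is not taken from any JPA provider. The Customer interface, the LoadedCustomer class and the loads counter are hypothetical names invented for this illustration, and the "database" is just a Supplier.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

// Hypothetical entity interface; JDK dynamic proxies can only proxy interfaces.
interface Customer {
    String getName();
}

// Hypothetical "real" entity, as it would look once loaded from the database.
class LoadedCustomer implements Customer {
    private final String name;

    LoadedCustomer(String name) { this.name = name; }

    @Override
    public String getName() { return name; }
}

class LazyProxyDemo {
    // Counts how often the simulated database was hit.
    static int loads = 0;

    // Returns a proxy that calls the loader on the first method invocation only.
    static Customer lazyCustomer(Supplier<Customer> loader) {
        InvocationHandler handler = new InvocationHandler() {
            private Customer target; // stays null until the first method call

            @Override
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                if (target == null) {
                    target = loader.get(); // the simulated database access
                }
                return method.invoke(target, args);
            }
        };
        return (Customer) Proxy.newProxyInstance(
                Customer.class.getClassLoader(), new Class<?>[] {Customer.class}, handler);
    }

    public static void main(String[] args) {
        Customer c = lazyCustomer(() -> { loads++; return new LoadedCustomer("Alice"); });
        System.out.println(loads);                       // 0: nothing has been loaded yet
        System.out.println(c.getName());                 // Alice: the first call triggers the load
        System.out.println(c instanceof LoadedCustomer); // false: the proxy is not the real class
    }
}
```

The last line of main hints at why proxies are the least transparent option: the object you hold is not an instance of the class you might expect it to be.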

Run-time proxy based lazy loading with Hibernate

While Hibernate supports build-time bytecode instrumentation to enable lazy loading of individual properties, most users of Hibernate will be using run-time proxies; it is the default and works well for most cases. So let's explore Hibernate's run-time proxies.

Two kinds of proxies are created by Hibernate:

  1. When lazily loading an entity through a lazy many-to-one or one-to-one association or by invoking EntityManager.getReference, Hibernate uses CGLIB to create a subclass of the entity class that acts as a proxy to the real entity. The first time any method on that proxy is invoked, the entity is loaded from the database and the method call is passed on to the loaded entity. My colleague Maarten Winkels has blogged about the pitfalls of these Hibernate proxies last year.
  2. When lazily loading a collection of entities through a one-to-many or a many-to-many association, Hibernate returns an instance of a class that implements the PersistentCollection interface such as PersistentSet or PersistentMap. The first time that collection is accessed, its members are loaded. The member entities are loaded as regular classes, so the Hibernate proxy pitfalls mentioned above don't apply here.
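The behaviour of such a lazy collection can be imitated in a few lines of plain Java. The sketch below is only an illustration of the idea and does not reflect how Hibernate's PersistentCollection classes are written; LazyList and isInitialized are hypothetical names, and the loader stands in for the query the provider would run.

```java
import java.util.AbstractList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a provider's lazy collection wrapper.
class LazyList<E> extends AbstractList<E> {
    private final Supplier<List<E>> loader;
    private List<E> loaded; // null until the collection is first accessed

    LazyList(Supplier<List<E>> loader) { this.loader = loader; }

    boolean isInitialized() { return loaded != null; }

    // Loads all members at once on first use, then reuses them.
    private List<E> delegate() {
        if (loaded == null) {
            loaded = loader.get();
        }
        return loaded;
    }

    @Override
    public E get(int index) { return delegate().get(index); }

    @Override
    public int size() { return delegate().size(); }
}

class LazyCollectionDemo {
    public static void main(String[] args) {
        LazyList<String> orderLines = new LazyList<>(() -> Arrays.asList("line 1", "line 2"));
        System.out.println(orderLines.isInitialized()); // false
        System.out.println(orderLines.size());          // 2: asking for the size loads everything
        System.out.println(orderLines.isInitialized()); // true
    }
}
```

Note that the members themselves are ordinary objects here; only the collection wrapper is special, which mirrors the point made in item 2 above.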

To get a feel for what happens here, you might want to step through some simple JPA code in the debugger and see the objects that Hibernate creates. It will increase your understanding of the mechanism a lot. :-)

Run-time bytecode instrumentation with OpenJPA

OpenJPA offers a number of enhancement methods, as the documentation calls it, of which I found run-time bytecode instrumentation the easiest to set up.

Stepping through the debugger you can see that OpenJPA does not create proxies. Instead a few extra fields have appeared in each entity class, with names like pcStateManager or pcDetachedState. More importantly you can see that a lazily loaded entity has all its fields set to 0 or null and that its state is only loaded when a method is invoked on it. More precisely, a property that is configured to be loaded lazily is only loaded when its getter is invoked.

It is very important to know that direct access to the fields of a lazily loaded entity (or the field behind a lazily loaded property) does not trigger loading of that entity (or field). Also, when the session is no longer available OpenJPA does not throw an exception as Hibernate does but just leaves the values in their uninitialized state, later causing the NullPointerExceptions I mentioned above.
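The difference between getter access and direct field access can be mimicked by hand. The class below is a hypothetical sketch written for this illustration; it is not what the OpenJPA enhancer generates, and the Supplier merely stands in for whatever the provider uses to fetch state. It only shows the two behaviours described above: a field read bypasses loading, and a getter call without an available session silently leaves the value unset.

```java
import java.util.function.Supplier;

// Hypothetical entity, written by hand to mimic the observable behaviour of an instrumented class.
class Product {
    String description;              // the lazily loaded property, null until loaded
    private boolean descriptionLoaded = false;
    private Supplier<String> loader; // stands in for the provider; null once "detached"

    Product(Supplier<String> loader) { this.loader = loader; }

    // Simulates the session no longer being available.
    void detach() { loader = null; }

    // Only the getter contains the loading logic.
    String getDescription() {
        if (!descriptionLoaded && loader != null) {
            description = loader.get();
            descriptionLoaded = true;
        }
        return description;
    }
}

class FieldAccessDemo {
    public static void main(String[] args) {
        Product p = new Product(() -> "A fine product");
        System.out.println(p.description);      // null: field access does not trigger loading
        System.out.println(p.getDescription()); // A fine product

        Product detached = new Product(() -> "never loaded");
        detached.detach();
        System.out.println(detached.getDescription()); // null: no exception, just an unloaded value
    }
}
```

Calling a method on that last null value is where a NullPointerException would eventually surface, far away from the place where the real problem occurred.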

OpenJPA vs. Hibernate

The first difference between these two approaches you might notice is the objects that get proxied/instrumented:

  • OpenJPA instruments all entities which means it can detect when you access a lazy reference or collection from the referring entity and it will then return an actual entity or a collection of actual entities. Only when you lazily load an entity using EntityManager.getReference or when you have configured a property to be lazily loaded, will you get a (partially) empty entity.
  • In the case of a lazy reference (or an entity that has been lazily loaded with EntityManager.getReference) Hibernate proxies the lazy object itself using CGLIB, which causes the proxy pitfalls mentioned before. When using a lazy collection Hibernate is just as transparent as OpenJPA. Finally, Hibernate does not support lazily loaded properties using proxies.

If you compare OpenJPA's instrumentation to the run-time proxies created by Hibernate you can see that the approach taken by OpenJPA is more transparent. Unfortunately it is let down somewhat by OpenJPA's less than robust error handling.

The pattern

So now that we know how to configure lazy loading and how it works, how can we use it properly?

Step 1 would be to examine all your associations and see which should be lazily loaded and which should be eagerly loaded. As a rule of thumb I start out leaving all *-to-one associations eager (the default). They usually don't add up to a large number of queries anyway and if they do, I can change them. Then I examine all *-to-many associations. If any of them are to entities that are always accessed and therefore always loaded, I configure them to be loaded eagerly. And sometimes I use the Hibernate specific @CollectionOfElements annotation to map such "value type" entities.

Step 2 is the most important. To prevent any LazyInitializationExceptions or NullPointerExceptions you need to make sure that all access to your domain objects occurs within one transaction. When domain objects are accessed after a transaction has finished, the persistence context can no longer be accessed to load the lazy objects, and that causes these problems. There are two ways to solve this:

  1. The most pure way is to place a Service Facade (or Remote Facade if you will) in front of your services and only communicate with clients of your service facade through Transfer Objects (a.k.a. Data Transfer Objects a.k.a. DTO's). The facade's responsibility is to copy all appropriate values from your domain objects to the data transfer objects, including making deep copies of references and collections. The transaction scope of your application should include the service facade for this pattern to work, i.e. set your facade to be @Transactional or give it a proper @TransactionAttribute.
  2. If you are writing a Model 2 web application with an MVC framework, a widely used alternative is to use the open EntityManager in view pattern. In Spring you can configure a Servlet filter or a Web MVC interceptor that will open the entity manager when a request comes in and will keep it open until the request has been handled. This means the same transaction is active in your controller and in your view (JSP or otherwise). While purists may argue that this makes your presentation layer depend on your domain objects, it is a compelling approach for simple web applications.
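The first option can be sketched without any JPA at all. In the plain Java illustration below, FakeSession is a hypothetical stand-in for the persistence context, Order is a hypothetical domain object with lazily loaded order lines, and OrderFacade plays the role of the Service Facade. In a real application the facade method would be the transaction boundary instead of opening and closing anything itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a persistence context that can be closed.
class FakeSession {
    boolean open = true;

    List<String> loadOrderLines() {
        if (!open) {
            throw new IllegalStateException("no session or session was closed");
        }
        return Arrays.asList("2 x widget", "1 x gadget");
    }
}

// Hypothetical domain object whose order lines are loaded on first access.
class Order {
    private final FakeSession session;
    private List<String> orderLines;

    Order(FakeSession session) { this.session = session; }

    List<String> getOrderLines() {
        if (orderLines == null) {
            orderLines = session.loadOrderLines();
        }
        return orderLines;
    }
}

// Transfer object: plain data with no link back to the session.
class OrderDto {
    final List<String> orderLines;

    OrderDto(List<String> orderLines) { this.orderLines = orderLines; }
}

class OrderFacade {
    // Copies everything the client needs while the session is still open.
    static OrderDto findOrder() {
        FakeSession session = new FakeSession();
        try {
            Order order = new Order(session);
            return new OrderDto(new ArrayList<>(order.getOrderLines()));
        } finally {
            session.open = false;
        }
    }

    // Hands out the domain object itself; its lazy state can no longer be loaded afterwards.
    static Order findOrderWithoutCopy() {
        FakeSession session = new FakeSession();
        Order order = new Order(session);
        session.open = false;
        return order;
    }

    public static void main(String[] args) {
        System.out.println(findOrder().orderLines); // safe: copied while the session was open
        try {
            findOrderWithoutCopy().getOrderLines();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());     // lazy access after the session was closed
        }
    }
}
```

The open EntityManager in view pattern avoids the second failure in a different way: instead of copying early, it keeps the session open until the view has finished rendering.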

Step 3 is to enable SQL logging of your JPA provider and exercise some of the use cases of your applications. It is enlightening to see what queries are performed when entities are accessed. The SQL log can also provide you with input for performance optimizations so you can revisit the decisions you made in step 1 and tune your database. In the end lazy loading is all about performance, so don't forget this step!
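How SQL logging is switched on depends on the provider. As a rough sketch, the settings below are the ones I would try first: hibernate.show_sql and hibernate.format_sql for Hibernate, and the SQL channel of openjpa.Log for OpenJPA. Check your provider's documentation for the exact options of the version you use. The class and method names are hypothetical; the maps would typically be passed to Persistence.createEntityManagerFactory(unitName, properties), or the same keys can be placed in persistence.xml.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper that collects provider-specific logging properties.
class SqlLoggingProperties {

    // Hibernate: print each SQL statement, formatted for readability.
    static Map<String, String> forHibernate() {
        Map<String, String> props = new HashMap<>();
        props.put("hibernate.show_sql", "true");
        props.put("hibernate.format_sql", "true");
        return props;
    }

    // OpenJPA: raise the SQL log channel to TRACE, keep everything else at WARN.
    static Map<String, String> forOpenJpa() {
        Map<String, String> props = new HashMap<>();
        props.put("openjpa.Log", "DefaultLevel=WARN, SQL=TRACE");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(forHibernate());
        System.out.println(forOpenJpa());
    }
}
```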

I hope this blog has given you some insight into how lazy loading works and how you can use it in your application. In the next blog I will delve deeper into the topic of the DTO and Service Facade patterns. But before I leave you I would like to thank everybody that came to my J-Spring 2009 talk on this subject. I had a lot of fun! It really seems a lot of people are wondering how to effectively use JPA because I got a lot of questions. Unfortunately the questions made me run out of time. Next time I will pay more attention to the girl with the time card. And bring my own water. ;-) Thanks again for being there!

P.S. Does anybody know what happened to hibernate.org? For more than a week the site has been showing a message saying they are down for maintenance.

For a list of all the JPA implementation pattern blogs, please refer to the JPA implementation patterns wrap-up.

Comments (19)

  1. Andries Inzé - Reply

    April 28, 2009 at 7:54 am

    Excellent post!

    How do you feel about a third way to load your objects: invoking Hibernate.initialize(object) on your proxied members?

    I've found this saves you from dealing with DTO's. A (large) downside is that it can be a pain when you need to eagerly fetch members of a collection. However, one should not use lazy loading in that instance anyway.

    Kind regards,
    Andries

  2. Dimitris Menounos - Reply

    April 28, 2009 at 10:35 am

    None of the "Step 2" patterns are appealing enough and DTOs are code duplication.

  3. p3t0r - Reply

    April 28, 2009 at 6:43 pm

    @Andries Inzé

    I've used Hibernate.initialize(object) intensively on past projects but in the end it's quite messy... you can never be sure which state the entity you get returned is in. DTO's communicate the intent much better.

    And what do you mean by 'one should not use lazy loading in that instance anyway' ... I really think you should always use lazy loading to avoid unwanted cartesian products in list queries.

  4. Vincent Partington - Reply

    April 29, 2009 at 8:57 am

    @Andries Inzé: I haven't used Hibernate.initialize, but I would imagine it to become quite messy as you and p3t0r have already found out. Also, it is Hibernate specific and therefore won't work with other JPA implementations.

  5. Egifford - Reply

    May 5, 2009 at 1:10 pm

    Great post - very detailed and with reference to the spec. SUPER!!! Keep doing that kind of work - then the world of JPA becomes easier :) Thanks

  6. Eswar - Reply

    July 11, 2009 at 6:45 am

    Excellent information.

  7. [...] Lazy loading [...]

  8. alvin - Reply

    July 16, 2009 at 3:50 pm

    Something which I see people forgetting to mention is that if you have multiple lazily loaded collections on a single entity - the first time you access one of those collections (causing it to load) - it will also load the other.

  9. Vincent Partington - Reply

    July 25, 2009 at 10:22 am

    @Alvin: Are you saying that if my Order contains not only a set of OrderLines but also a set of Customers both of these collections would be loaded whenever I access either one of them? That is not something I have experienced with Hibernate. But I guess this would be JPA provider dependent as the JPA spec only says the LAZY annotation is a "hint" to the JPA provider. In which JPA provider have you seen this behaviour?

    Something I have noticed is that accessing a collection causes Hibernate to load all the entities in that collection, even if you are just adding or removing an entity. This can become a problem when you have large collections and especially when combined with managing bidirectional associations.

  10. allan - Reply

    August 11, 2009 at 8:23 am

    For step 2, another option might be to make the DAO a stateful session bean and use an extended PersistenceContext.

  11. Simon Massey - Reply

    September 17, 2009 at 11:50 pm

    I also love this series. It has even almost turned me to the dark-side of accepting that DAO is the correct approach...

    I agree with @Dimitris that the "option 2" approaches seem unsatisfactory. Why not just have façades that are use case orientated and @transactional with methods which know how to initialize lazy objects (or call the correct eager queries) and have clients such as the web UI always go via the façades?

    I found the following statement strange:

    "While purists may argue that this makes your presentation layer depends on your domain objects, it is a compelling approach for simple web applications."

    Surely I want all of my platform at all layers dependent upon my domain objects as they model the business problem space? Anything built to be a business solution should be familiar with, and make reference to, the problem domain. I tend to draw the "domain layer" as a vertical stack to the side of my "cake diagram" of the application logical layers and tell all of the developers that they must try to stick almost exclusively to using the domain objects wherever possible.

    We use a highly stateful ajax framework (ZK) for very complex web applications that has a declarative bi-directional databindings feature. I can then bind my domain objects to my web desktop components. So I use my domain objects everywhere within my presentation layer as detached objects. We don't use open session in view to avoid the scalability issue of leaving the session open whilst building the view. The use case orientated façades that scope the transaction can take care of calling the re-attach logic before any save which you covered in your previous blog on saving detached entities.

    I got a cold shiver when I read the recommended strategy of copying the domain objects to DTOs or VOs. On large projects, programmers who don't have strong experience with JPA/JDO/Hibernate are going to think "DTO = result set" and fall back to obsolete patterns. Pretty soon you get an explosion of different but mostly similar DTOs that model the screens that the developers are building. They then make lots of trips down the stack to pull another similar-looking DTO by ID when the current one is not quite right. JDBC IO saturation ensues and the domain pattern and ORM can go right out of the window.

  12. Vincent Partington - Reply

    September 18, 2009 at 10:52 am

    @Simon Massey: By the "option 2 pattern", do you mean using either a Service Facade with DTO's or using the open EntityManager in view pattern? These are the two patterns I've seen most often in real applications. What don't you like about them?

    If I understand correctly, your suggestion is a mix between the two approaches. There is a Service Facade but instead of using DTO's you have the Service Facade make sure all domain objects are correctly initialized so that you can just return them as-is.

    That would work, but it makes the clients layer depend on the domain objects. Which brings me nicely to your other point. ;-)

    Whether that is a good thing or a bad thing is an interesting discussion. Your point that it can make inexperienced developers just write a lot of unnecessary, incorrect, inefficient boilerplate code is a good one. Then again, inexperienced developers are likely to make other mistakes. So then the question is: do we want architecture to "defend" against inexperienced developers? Or do we perform code reviews and teach how to use the architecture correctly? But that is a different issue altogether.

    Back to the point: a good reason to have that isolation layer of the Service Facade/DTO combo is so that you can modify your domain objects without it having a direct impact on your client(s). But that depends on how closely the two are tied. Erik Rozendaal's comment to my blog on Service Facade and DTO's has some good points on this.

  13. Simon Massey - Reply

    September 23, 2009 at 11:22 am

    @Vincent I posted a comment over on the other blog where Erik was posting. I will move my discussion over there to your blog on Service Facade and DTOs.

  14. Simon Massey - Reply

    September 25, 2009 at 12:28 pm

    @Vincent: "These are the two patterns I’ve seen most often in real applications. What don’t you like about them?"

    The DTO model is discussed extensively on the next blog post but I thought that I would say when I would choose not to use Open Session In View. Open Session In View (on a spring based webapp) tethers the web container thread to the database connection. If I were to assign twice as many web threads as the maximum size of my db connection pool it would have little or no effect. Web threads would simply queue for a db connection to be returned to the pool. More significantly the db transaction (when using JTA) would be active over the rendering phase and that can cause increased contention and increased resource pinning within the database. (Without JTA things like JPA transactions actually do a single phase jdbc transaction for the duration of the flush which occurs when the session is closed. This makes holding the transaction open for longer less of an issue when you don't require JTA).

    If I go with a façade instead and scope my db transaction to a façade the db connection will be returned to the pool when the façade returns. I can then move my UI rendering to after the façade method returns and also perform some non-db based UI input validation before calling the façade. I can then keep more web threads working than I have database connections and have shorter JTA transactions. So I can horizontally scale my application layer in front of my db to better effect.

    If I know that my app will never need to be horizontally scaled as it is a low volume app and I can live with tethering the db pool to my rendering phase then Open Session In View is much easier to develop with so I would use that. (We use the ZK RIA framework which even has explicit support for Hibernate Session In View). In contrast if I know that I will need to scale out then the added effort of working via façades is the cost of being able to scale out. If the app is low complexity such that the façades between UI controllers and the services look like too much layering then I might consider in-lining the façade logic into one of the other layers within a spring transaction template which scopes the db transaction (and hence the db transaction).

  15. Vincent Partington - Reply

    September 27, 2009 at 6:33 pm

    @Simon Massey: Your objections to the Open Session in View pattern are interesting. The fact that the database connection is still held during the rendering phase could degrade performance, but this will also depend on how long that phase takes compared to the actual business logic invoked. I would be interested to compare some numbers there.

    Especially when you take out the time spent on loading lazy objects during the rendering phase. Because that time would move to the DTO-creation phase when using DTO's or the make-sure-every-entity-is-initialized-phase in your approach, making the pre-rendering phase longer.

    Your last paragraph is a good summary: use the Open Session in View when you can, and only introduce Service Facades and/or DTO's when you need to. This is similar to what I just replied to your question on the blog about Service Facades and DTO's. I guess you can say "it depends". ;-)

  16. [...] JPA handles lazy loading with its own technique. I recommend reading the article JPA Implementation patterns: Lazy loading for more information on this [...]

  17. JPA Implementation Patterns | Upthrust - Reply

    April 12, 2010 at 6:12 am

    [...] Lazy Loading [...]

  18. Vitaliy Bogdan - Reply

    September 9, 2010 at 1:29 pm

    @PersistenceContext(unitName="bigbrother", type=PersistenceContextType.EXTENDED)
    public void setEntityManager(EntityManager entityManager) {
        this.entityManager = entityManager;
    }

  19. Srinivas Nandina - Reply

    October 21, 2010 at 6:27 am

    If @TransactionAttribute(TransactionAttributeType.NOT_SUPPORTED) is used on a stateless session bean method, then lazy loading of relations from an already loaded entity throws a LazyInitializationException. You even mention in this article: "set your facade to be @Transactional or give it a proper @TransactionAttribute". Is there any reason behind this? I find it very counterintuitive to have my methods run in a transaction for lazy loading to work.

    Thanks,
    Srinivas
