• Home
  • RSS Feed
  • Log in

Jeroen van Wilgenburg

Sorting and pagination with Hibernate Criteria – How it can go wrong with joins
Posted by Jeroen van Wilgenburg mid-morning: December 11th, 2008

Lately I ran into an annoying problem with Hibernate. I tried to do pagination on a query result which was doing an SQL-JOIN under the hood. The query before paging returned about 100 results. When I turned on paging (with 20 results per page) all the pages had less than 20 results!
The reason for this is that with a JOIN there can be duplicate results and those results are filtered out after pagination is done. In this blog I will explain how to solve those problems and it also a cleaner way to build your Criteria queries.

Let me explain this with some examples.
Our data model for today consists of doors and colors. A door can have multiple colors.
The data set to start with:
Door A:
-id: 1
-size: large
-colors: red and green

Door B:
-id: 2
-size: small
-color: blue and green

Now lets select all red, green or blue doors, that will be two doors right?

Criteria crit = session.createCriteria(Door.class);
crit.createAlias("colors", "c");
crit.add(Restrictions.in("c.name", new Object[]{ "red" , "green" , "blue" }));

Because of the colors alias an INNER JOIN is used and the result might not be what you expect.
The size of the list is 4, 4 Door objects are returned! That is very strange. When you have a look at the SQL you will see the following query:
select this_.id as id2_1_, this_.size as size2_1_, colors3_.DOOR_ID as DOOR1_, c1_.id as COLOR2_, c1_.id as id3_0_, c1_.name as name3_0_
from doors this_ inner join DOORS_COLORS colors3_ on this_.id=colors3_.DOOR_ID inner join colors c1_ on colors3_.COLOR_ID=c1_.id
where c1_.name in (?, ?, ?)

Translated to readable SQL:
SELECT * FROM doors d
INNER JOIN doors_colors dc on d.id=dc.door_id
INNER JOIN colors c ON c.id=dc.color_id
WHERE c.name in (?, ?, ?)

This query will indeed return 4 results, but we’re selecting Door objects, not SQL-rows. Looking at the java code I expected Door object, not query result rows wrapped in a Door object
To solve this problem you can add a ResultTransformer to the crit-object:

crit.setResultTransformer(CriteriaSpecification.DISTINCT_ROOT_ENTITY);

Now only two doors are selected, problem solved.
Well, not so fast, let’s see what happens with many doors. Our database now has 16 doors. When we select the red, green and blue doors again, 11 results appears. Suppose we want to add paging to the results, because it’s just too many doors to handle at a time. 5 doors per page is enough.
We can achieve it by adding the following two lines:

crit.setFirstResult(0);
crit.setMaxResults(5);

11 doors found, 5 results per page and display the results starting from the first result (which is 0-indexed).
How many results do you expect on the first page?
Not the 3 results returned probably. The ResultTransformer is still active by the way. What happens is that the paging is applied before the duplicate entities are filtered out.

When you start thinking in SQL you might come up with the idea to use a subquery. This is exactly what we’re are going to do in Hibernate.
A subquery in pseudocode:
select * from doors d where d.id in (select id from doors, colors where colors.name in (‘red’,'green’,'blue’))
As you can see the subquery only selects the id column and in my previous example a complete Door object was selected.
To select other columns than the columns in the Entity object you can use Projections. With multiple columns use the ProjectionList object. For now we need an id-projection, which has is own method:

crit.setProjection(Projections.id());

Now a list with int’s is selected. This result is the input for the next query. You can create Criteria objects without a Hibernate Session for later use. These objects are called DetachedCriteria.

When we rename the crit to dc and create it as a DetachedCriteria you will have the following lines of code:

DetachedCriteria dc = DetachedCriteria.forClass(Door.class);
dc.createAlias( "colors" , "c");
dc.add(Restrictions.in( "c.name" , new Object[]{ "red" , "green" , "blue" }));
dc.setResultTransformer(CriteriaSpecification.DISTINCT_ROOT_ENTITY);
dc.setProjection(Projections.id());

The next step is creating an outer query around the previous query.

Criteria outer = session.createCriteria(Door.class);
outer.add(Subqueries.propertyIn("id", dc));
outer.setFirstResult(0);
outer.setMaxResults(5);

With Subqueries.propertyIn you can add the previous query. When you add paging and sorting to the new outer query object all your problems are solved. You can even get rid of the ResultTransformer and get the same result. When you’re tuning performance it might be a nice test to see whether it makes any difference to do the DISTINCT before you pass the results to the outer query.

How you can solve DISTINCT problemns is explained reasonably in the Hibernate FAQ:
Hibernate does not return distinct results for a query with outer join fetching enabled for a collection (even if I use the distinct keyword)?

The lesson I learned with Hibernate is that you still have to know SQL and don’t trust the results blindly. If you don’t know why you have to use DISTINCT and how a JOIN presents its results you’ll get lost very soon.

With the DetachedCriteria object you can split your queries in a very readable way and when using projections, the SQL under the hood only selects the needed columns, which improves performance.

sources

http://floledermann.blogspot.com/2007/10/solving-hibernate-criterias-distinct.html
http://www.hibernate.org/117.html#A12

Share

Tags: criteria, Hibernate, pagination
Filed under Java | 11 Comments »



11 Responses to “Sorting and pagination with Hibernate Criteria – How it can go wrong with joins”



    Alex Says:
    Posted at: December 14, 2008 at 11:45 am

    I think it’s kind of sad that you need to compromise performance (the joins will be much faster on for instance mysql) because Hibernate won’t translate the SQL resultset to objects correctly when collections are involved.

    Hibernate should really have a configuration parameter which automatically does this for you.

    Reply


    Maarten Winkels Says:
    Posted at: December 21, 2008 at 5:46 pm

    @Alex: It is a bit strange to blame Hibernate for the fact that MySQL cannot optimize the subselect in the same way as it does the inner join: The DB logic is in both cases exactly the same. Furthermore, it has nothing to do with how Hibernate ‘translates the SQL resultset to objects’ or how this is incorrect. The fact is that you cannot write an SQL statement which involves an (inner) join on a table that might have multiple entries for the base table with a LIMIT (for MySQL) to return a predictable number of unique rows from the base table. You cannot request Hibernate to solve this for you.

    What you could do is look into the query.scroll() method and do pagination yourself. This will be a bit trickier, because you will have to do the UNIQUE and the pagination yourself, but you’ll be able to improve performance.

    Reply


    Muzaffer Says:
    Posted at: December 26, 2008 at 12:58 pm

    Hi Jeroen van Wilgenburg,

    Thanks for your explanation. I am using Criteria obj as follows
    1. Set the order by [column1]
    2. Get the resultset
    3. Here I want to change order by to [column2]. If I simply addOrder [column2] to the Criteria obj it will give me result in the order [column1],[column2]. Here I want result to be sorted only based on [column2]. Thanks.

    Reply


    Bhim Bomma Says:
    Posted at: January 7, 2009 at 1:56 am

    Thanks Jeroen. This helped me to solve the pagination problem in my application.

    Reply


    AGO Says:
    Posted at: January 8, 2009 at 11:09 pm

    Thank you very much for such nice article!
    It is short, with good example and without unnecessary fireworks – simply great.
    Regards, Albert

    Reply


    ctrlspace » Something to think about Says:
    Posted at: April 20, 2009 at 6:46 pm

    [...] did not realize that multiple joins are not possible using Hibernate’s criteria api.  This article is not related but solves a pagination problem I’ve had in the [...]

    Reply


    murali Says:
    Posted at: June 22, 2009 at 3:34 pm

    Good Article , Just one note : this approach would not work if the entities has EAGER Join on Collections

    Reply


    ptb Says:
    Posted at: June 26, 2009 at 2:53 pm

    I tried this solution but I couldn’t get it to work in cases where you need to add multiple properties to the projection list. Apparently Sql doesn’t allow multiple select items in a subselect.

    Reply


    Filme Noi Cinema Says:
    Posted at: November 30, 2009 at 3:10 pm

    Actually SQL allows multiple select items in subselect (or at least Oracle SQL does) like this:

    select * from person where (name, age) in (select name, age from ……. )

    But I guess that Hibernate hasn’t implemented this yet.

    Reply


    Markos Fragkakis Says:
    Posted at: November 24, 2010 at 4:26 pm

    Thanks for this, very well explained.

    Reply


    Tim Dennison Says:
    Posted at: September 16, 2011 at 3:48 pm

    Great explanation of this problem!

    I keep thinking about an alternative solution and wanted feedback. I created a custom ResultTransformer that delegates to the DistinctRootEntityResultTransformer for transformTuple() and transformList() operations. However, the transformList() operation also uses a couple of instance variables (number of objects desired, page index) to produce a “paged” List of objects to return. The effect is that paging is accomplished via “entities” instead of rows.

    Reply


Leave a Reply

Click here to cancel reply.


Xebia Sites

  • Xebia Corporate
  • Xebia France
  • Xebia India
  • Xebia Sweden

Categories

  • Java (311)
  • Agile (181)
  • General (136)
  • Scrum (67)
  • Architecture (64)
  • Testing (59)
  • Performance (46)
  • Middleware (56)
    • Deployment (38)
  • Xebia Labs (39)
  • SOA (31)
  • Podcast (31)
  • Project Management (28)
  • Tools (26)
  • Uncategorized (20)
  • lean architecture (20)
  • Quality Assurance (17)
  • Articles (13)
  • Requirements Management (13)
  • Virtualization (19)

Tag Cloud

    JPA Grails Groovy lean architectuur Maven Ajax Oracle Spring Eclipse Hibernate SOA Architecture Moving to India Concurrency Control XML Frameworks Scrum JPA implementation patterns Java TDD product owner Flex agile architectuur Lean Scala ACT Javascript lean architecture Agile Xebia

Archives

  • February 2012
  • January 2012
  • December 2011
  • November 2011
  • October 2011
  • September 2011
  • August 2011
  • July 2011
  • June 2011
  • May 2011
  • April 2011
  • March 2011
Avatars by Sterling Adventures