JPA implementation patterns: Saving (detached) entities

Vincent Partington

We kicked off our hunt for JPA implementation patterns with the Data Access Object pattern and continued with the discussion of how to manage bidirectional associations. This week we touch upon a subject that may seem trivial at first: how to save an entity.

Saving an entity in JPA is simple, right? We just pass the object we want to persist to EntityManager.persist. It all seems to work quite well until we run into the dreaded "detached entity passed to persist" message, or a similar message when we use a JPA provider other than the Hibernate EntityManager.

So what is that detached entity the message talks about? A detached entity (a.k.a. a detached object) is an object that has the same ID as an entity in the persistence store but that is no longer part of a persistence context (the scope of an EntityManager session). The two most common causes for this are:

  • The EntityManager from which the object was retrieved has been closed.
  • The object was received from outside of our application, e.g. as part of a form submission, a remoting protocol such as Hessian, or through a BlazeDS AMF Channel from a Flex client.

The contract for persist (see section 3.2.1 of the JPA 1.0 spec) explicitly states that an EntityExistsException may be thrown by the persist method when the object passed in is a detached entity, or that an EntityExistsException or another PersistenceException may be thrown when the persistence context is flushed or the transaction is committed. Note that it is not a problem to persist the same object twice within one transaction. The second invocation will just be ignored, although the persist operation might be cascaded to any associations of the entity that were added since the first invocation. Apart from that latter consideration there is no need to invoke EntityManager.persist on an already persisted entity because any changes will automatically be saved at flush or commit time.

saveOrUpdate vs. merge

Those of you that have worked with plain Hibernate will probably have grown quite accustomed to using the Session.saveOrUpdate method to save entities. The saveOrUpdate method figures out whether the object is new or has already been saved before. In the first case the entity is saved, in the latter case it is updated.

When switching from Hibernate to JPA a lot of people are dismayed to find that method missing. The closest alternative seems to be the EntityManager.merge method, but there is a big difference that has important implications. The Session.saveOrUpdate method, and its cousin Session.update, attach the passed entity to the persistence context, while the EntityManager.merge method copies the state of the passed object to the persistent entity with the same identifier and then returns a reference to that persistent entity. The object passed is not attached to the persistence context.

That means that after invoking EntityManager.merge, we have to use the entity reference returned from that method in place of the original object passed in. This is unlike the way one can simply invoke EntityManager.persist on an object (even multiple times as mentioned above!) to save it and continue to use the original object. Hibernate's Session.saveOrUpdate does share that nice behaviour with EntityManager.persist (or rather Session.save) even when updating, but it has one big drawback: if an entity with the same ID as the one we are trying to update, i.e. reattach, is already part of the persistence context, a NonUniqueObjectException is thrown. And figuring out what piece of code persisted (or merged or retrieved) that other entity is harder than figuring out why we get a "detached entity passed to persist" message.
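The identity rules described here can be tried out without a database. The sketch below is a toy, in-memory stand-in for a persistence context; ToyPersistenceContext and Customer are hypothetical names invented for this illustration and are not part of JPA or Hibernate. It mimics only which instance ends up managed, nothing else a real provider does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity used only for this illustration.
class Customer {
    Long id;
    String name;

    Customer(Long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Toy stand-in for a persistence context: one managed instance per ID.
// This is an assumption-laden simulation, not real JPA behaviour.
class ToyPersistenceContext {
    private final Map<Long, Customer> managed = new HashMap<>();
    private long nextId = 1; // toy ID generator

    // Like persist: the instance passed in becomes the managed instance.
    void persist(Customer c) {
        if (contains(c)) {
            return; // already managed: the second call is ignored
        }
        if (c.id != null) {
            throw new IllegalStateException("detached entity passed to persist");
        }
        c.id = nextId++;
        managed.put(c.id, c);
    }

    // Like merge: state is copied onto the managed instance, which is returned.
    Customer merge(Customer c) {
        Customer target = (c.id == null) ? null : managed.get(c.id);
        if (target == null) {
            // Never persisted, or not loaded yet: create the managed instance
            // (a real provider would load the row from the database here).
            long id = (c.id != null) ? c.id : nextId++;
            target = new Customer(id, c.name);
            managed.put(id, target);
        }
        target.name = c.name;
        return target;
    }

    boolean contains(Customer c) {
        return c.id != null && managed.get(c.id) == c;
    }
}

class MergeVsPersistDemo {
    public static void main(String[] args) {
        ToyPersistenceContext ctx = new ToyPersistenceContext();

        Customer fresh = new Customer(null, "Alice");
        ctx.persist(fresh);
        System.out.println(ctx.contains(fresh));    // true: the original is managed

        Customer detached = new Customer(fresh.id, "Alice B.");
        Customer result = ctx.merge(detached);
        System.out.println(result == fresh);        // true: the managed instance is returned
        System.out.println(ctx.contains(detached)); // false: the argument stays detached
    }
}
```

The point to take away is the last line: after a merge, only the returned reference is safe to keep working with.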

Putting it all together

So let's examine the three possible cases and what the different methods do:

| Scenario | EntityManager.persist | EntityManager.merge | Session.saveOrUpdate |
| --- | --- | --- | --- |
| Object passed was never persisted | 1. Object added to persistence context as new entity. 2. New entity inserted into database at flush/commit. | 1. State copied to new entity. 2. New entity added to persistence context. 3. New entity inserted into database at flush/commit. 4. New entity returned. | 1. Object added to persistence context as new entity. 2. New entity inserted into database at flush/commit. |
| Object was previously persisted, but not loaded in this persistence context | 1. EntityExistsException thrown (or a PersistenceException at flush/commit). | 1. Existing entity loaded. 2. State copied from object to loaded entity. 3. Loaded entity updated in database at flush/commit. 4. Loaded entity returned. | 1. Object added to persistence context. 2. Entity updated in database at flush/commit. |
| Object was previously persisted and already loaded in this persistence context | 1. EntityExistsException thrown (or a PersistenceException at flush/commit). | 1. State from object copied to loaded entity. 2. Loaded entity updated in database at flush/commit. 3. Loaded entity returned. | 1. NonUniqueObjectException thrown. |

Looking at that table one may begin to understand why the saveOrUpdate method never became a part of the JPA specification and why the JSR members instead chose to go with the merge method. BTW, you can find a different angle on the saveOrUpdate vs. merge problem in Stevi Deter's blog about the subject.

The problem with merge

Before we continue, we need to discuss one disadvantage of the way EntityManager.merge works: it can easily break bidirectional associations. Consider the example with the Order and OrderLine classes from the previous blog in this series. If an updated OrderLine object is received from a web front end (or from a Hessian client, or a Flex application, etc.) the order field might be set to null. If that object is then merged with an already loaded entity, the order field of that entity is set to null. But the OrderLine won't be removed from the orderLines set of the Order it used to refer to, thereby breaking the invariant that every element in an Order's orderLines set has its order field set to point back at that Order.
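The broken invariant can be reproduced with plain objects. In the sketch below, the fields of Order and OrderLine are assumptions (only the orderLines set and the order field are named in this series), and naiveMerge is a hypothetical stand-in for the field-by-field copy a merge performs; no JPA provider is involved.

```java
import java.util.HashSet;
import java.util.Set;

class Order {
    final Set<OrderLine> orderLines = new HashSet<>();

    // Keeps both sides of the association in sync when a line is added.
    void addOrderLine(OrderLine line) {
        line.order = this;
        orderLines.add(line);
    }

    // The invariant from the text: every line in the set points back at this order.
    boolean isConsistent() {
        for (OrderLine line : orderLines) {
            if (line.order != this) {
                return false;
            }
        }
        return true;
    }
}

class OrderLine {
    Order order;
    String description; // assumed field, for illustration only

    OrderLine(String description) {
        this.description = description;
    }
}

class BrokenAssociationDemo {
    // Hypothetical stand-in for merge: copies every field, including a null order.
    static void naiveMerge(OrderLine received, OrderLine loaded) {
        loaded.description = received.description;
        loaded.order = received.order;
    }

    public static void main(String[] args) {
        Order order = new Order();
        OrderLine loaded = new OrderLine("widget");
        order.addOrderLine(loaded);
        System.out.println(order.isConsistent()); // true

        // The front end sent the line back without its order reference.
        OrderLine received = new OrderLine("widget (updated)");
        naiveMerge(received, loaded);

        System.out.println(order.orderLines.contains(loaded)); // true: still in the set
        System.out.println(order.isConsistent());              // false: back-pointer is gone
    }
}
```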

In this case, or other cases where the simplistic way EntityManager.merge copies the object state into the loaded entity causes problems, we can fall back to the DIY merge pattern. Instead of invoking EntityManager.merge we invoke EntityManager.find to find the existing entity and copy over the state ourselves. If EntityManager.find returns null we can decide whether to persist the received object or throw an exception. Applied to the Order class this pattern could be implemented like this:

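	// Look up the managed entity that corresponds to the received (detached) object.
	Order existingOrder = dao.findById(receivedOrder.getId());
	if (existingOrder == null) {
		// No such entity yet: persist the received object (or throw an exception instead).
		dao.persist(receivedOrder);
	} else {
		// Copy only the fields we want to update; associations are left alone.
		existingOrder.setCustomerName(receivedOrder.getCustomerName());
		existingOrder.setDate(receivedOrder.getDate());
	}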
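To experiment with the DIY merge pattern outside a JPA environment, the fragment can be wrapped in a small service backed by an in-memory DAO. This is only a sketch under assumptions: OrderDao, InMemoryOrderDao and OrderService are hypothetical names, the Order class is reduced to the two fields used above, and a real DAO would delegate to EntityManager.find and EntityManager.persist instead of a map.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

// Reduced Order class; only the fields used by the DIY merge are modelled.
class Order {
    private final Long id;
    private String customerName;
    private LocalDate date;

    Order(Long id, String customerName, LocalDate date) {
        this.id = id;
        this.customerName = customerName;
        this.date = date;
    }

    Long getId() { return id; }
    String getCustomerName() { return customerName; }
    void setCustomerName(String customerName) { this.customerName = customerName; }
    LocalDate getDate() { return date; }
    void setDate(LocalDate date) { this.date = date; }
}

// Hypothetical DAO interface with the two methods the fragment relies on.
interface OrderDao {
    Order findById(Long id);
    void persist(Order order);
}

// In-memory stand-in so the sketch runs without a JPA provider.
class InMemoryOrderDao implements OrderDao {
    private final Map<Long, Order> store = new HashMap<>();

    public Order findById(Long id) {
        return store.get(id);
    }

    public void persist(Order order) {
        store.put(order.getId(), order);
    }
}

class OrderService {
    private final OrderDao dao;

    OrderService(OrderDao dao) {
        this.dao = dao;
    }

    // DIY merge: returns the instance the DAO tracks after the call.
    Order save(Order receivedOrder) {
        Order existingOrder = dao.findById(receivedOrder.getId());
        if (existingOrder == null) {
            dao.persist(receivedOrder);
            return receivedOrder;
        }
        // Copy only the simple state; nothing else on the existing entity changes.
        existingOrder.setCustomerName(receivedOrder.getCustomerName());
        existingOrder.setDate(receivedOrder.getDate());
        return existingOrder;
    }
}
```

Returning the tracked instance mirrors what EntityManager.merge does, so callers can treat both approaches the same way.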

The pattern

So where does all this leave us? The rule of thumb I stick to is this:

  • When and only when (and preferably where) we create a new entity, invoke EntityManager.persist to save it. This makes perfect sense when we view our data access objects as collections. I call this the persist-on-new pattern.
  • When updating an existing entity, we do not invoke any EntityManager method; the JPA provider will automatically update the database at flush or commit time.
  • When we receive an updated version of an existing simple entity (an entity with no references to other entities) from outside of our application and want to save the new state, we invoke EntityManager.merge to copy that state into the persistence context. Because of the way merging works, we can also do this if we are unsure whether the object has been already persisted.
  • When we need more control over the merging process, we use the DIY merge pattern.

I hope this blog gives you some pointers on how to save entities and how to work with detached entities. We'll get back to detached entities when we discuss Data Transfer Objects in a later blog. But next week we'll handle a number of common entity retrieval patterns first. In the meantime your feedback is welcome. What are your JPA patterns?

For a list of all the JPA implementation pattern blogs, please refer to the JPA implementation patterns wrap-up.

Comments (27)

  1. Matthew

    March 23, 2009 at 11:37 pm

    You should check out JDO's handling of this issue. JDO 2.x uses the same method, PersistenceManager.makePersistent(Object), for both persisting new objects and merging detached objects. There is no distinction between persisting and attaching. Because the method returns Object, you should always use the one returned; sometimes, it will be the same instance given (in the case of a transient object being made newly persistent), and others, it will return a copy of the instance given (if the given instance was detached).

    JDO also takes care to ensure that bidi associations are maintained. From section 15.3 of the JDO 2.2 specification:
    =====
    If two relationships (one on each side of an association) are mapped to the same column, the field
    on only one side of the association needs to be explicitly mapped.
    The field on the other side of the relationship can be mapped by using the mapped-by attribute identifying
    the field on the side that defines the mapping. Regardless of which side changes the relationship,
    flush (whether done as part of commit or explicitly by the user) will modify the datastore to
    reflect the change and will update the memory model for consistency. There is no further behavior
    implied by having both sides of the relationship map to the same database column(s). In particular,
    making a change to one side of the relationship does not imply any runtime behavior by the JDO
    implementation to change the other side of the relationship in memory prior to flush, and there is no
    requirement to load fields affected by the change if they are not already loaded.
    =====
    Nice stuff. Everyone should check it out.

    -matthew

  2. Merijn Vogel

    March 24, 2009 at 11:52 am

    I observe newer data in a database admin tool than I get returned from em.refresh(foo) as well as em.find(foo) and foo = em.merge(foo)
    Where could I be in error?

  3. Merijn Vogel

    March 24, 2009 at 12:15 pm

    Ah, leave my previous comment out, it was some other error that caused the observed behaviour.

  4. Alex Snaps

    March 24, 2009 at 1:41 pm

    Wouldn't that rule out optimistic locking? Or at least be very intrusive about its handling (i.e. manual vs. handled by the provider).
    Besides this behavior is related to the "Serialization" method you name in the article, as, apparently, these do not honor the Serialization process of Java. Should they (and I know Hessian does not!), uninitialized associations would remain proxies and be handled by the provider accordingly.
    I have the honest impression this results more in fighting the framework, than making it work for you...

  5. Vincent Partington

    March 26, 2009 at 6:07 pm

    @Alex: how does the DIY merge rule out optimistic locking? I guess it depends on how the optimistic locking is implemented, but then I still can't think of the examples.

    As for the fact that the serialization methods I use (Hessian, AMF/BlazeDS) apparently do not keep associations intact, I guess the problem here is that I am using domain objects as data transfer objects. That is a practice encouraged by the fact that you get away with it in a lot of cases thanks to stuff like the ModelAndView and the WebDataBinder in Spring Web MVC.

    When you use associations (especially ones that would make you walk the entire object graph) you usually do not want them all to be serialized, leaving you with this problem.

    In fact I actually write separate DTOs for my more complex domain models and then you always have to manually copy/merge the received data to your domain object.

  6. [...] Saving (detached) entities [...]

  7. Marcell Manfrin

    July 22, 2009 at 3:48 pm

    I use this:

    JpaDao {
      public void persist(E entity) {
        if (entity.getId() == null) {
          entityManager.persist(entity);
        } else {
          if (!entityManager.contains(entity)) {
            entityManager.merge(entity);
          }
        }
      }
    }
    
  8. Constantine

    November 7, 2009 at 8:09 pm

    Thank you for a comprehensive article!

    In our scenario the data comes from an external source in XML format (we use JAXB to unmarshal it), and needs to be merged with the one already in the DB. We use entityManager.merge in all cases.

    In fact, the major problem is the transfer protocol - the data is coming as a complex deeply nested XML (think of the complete invoice, including information about the customer), so the DIY merge is problematic. We have chosen to specify the meaningful IDs in our entity beans and now it works completely transparently. The major problem was with the compound IDs and complex relationships between entities (and both simultaneously), but those were sorted eventually... Though, at this stage there's a high risk for application to lose its independence from the JPA provider (for example, ours now works with OpenJPA only).

    To speed up the development we successfully use HyperJAXB 3 (http://confluence.highsource.org/display/HJ3/Home), which generates 90% of the code, though still need to handcraft some JPA annotations anyway.

  9. Oscar Calderon

    November 27, 2009 at 10:49 pm

    Hi, I have a big doubt about some key concepts of JPA. What's the difference between a detached entity and a new entity?

    Regards.

  10. Vincent Partington

    December 5, 2009 at 3:51 pm

    @Oscar Calderon: Sorry for not replying earlier. I see that Jeanne Boyarsky has already answered your question over at CodeRanch. ;-)

    But I'll answer your question for other readers of this blog: the difference is that a new entity is one that has just been created in Java but has not been persisted in the database yet, while a detached entity exists in Java and in the database but the Java entity is no longer attached to a session so any updates to it will not be reflected in the database.

  11. Fabrice Leray

    February 11, 2010 at 12:38 am

    I have a problem I'd like to tell you about : it's about jpa, entity, modification, undo (cancel modification)...

    A) The context :
    Hi, for my project, I used Seam and its Seam Managed Persistence Context. Although I know this context is designed for Conversation scopes, I did not use them at all (all my application is Session scoped).

    B) The problem
    I had to make forms for different elements. Let us consider the "Country form". I had to fill the form (name, currency...), view it and modify it. However, while modifying the form, I had to be able to perform a classic undo (using a hit on my cancel button) and then retrieve the state of my country as it was before I began doing updates...

    C) The unsuccessful attempts and a first solution :
    I just have to say it was a pity for me to do that using JPA. I tried all the methods the entityManager gave me (refresh and find mainly) to retrieve the first state of my country object... unsuccessfully :( Finally I made a clone object of my entity, performed the modifications on it and, when clicking the Validate button, re-copied the clone fields onto the real persisted object.
    The drawbacks of the clone technique:
    - The mechanic is quite simple but boring to implement...
    - Easy to implement when you have "hello world" sample with 3 or 4 entities living quite independently from each other. But it's a pity to maintain when you have dozens of entities all of them being part or collection of the other...

    D) Another solution
    So although it works, it was a pain to code and maintain. Having made good resolutions for this new year, I searched for other methods and found another one: try to detach the entity using the "evict" method provided by Hibernate (which is the provider I used). No more clone here: you use the object from the database directly BUT you temporarily cut the connection between the database and Java. Doing so, I can finally persist the changes... or not (undo)! Great but... I see 2 drawbacks using this method:
    - it is provider dependent (Hibernate in this case, although I suppose the same method may exist for other JPA implementation)
    - the code is quite ugly because you have to load all the collections depending on the object BEFORE you perform this detach: if not, be prepared to have some nice LazyInitializationException...

    As I said in a previous post, I'm quite new to JPA, so maybe I missed something but I greatly appreciate if someone could tell me whether or not he ever met such problems and what kind of solutions he had.

    Thanks again for the great articles.

  12. Vincent Partington

    February 15, 2010 at 9:14 pm

    Fabrice Leray: I am not familiar with Seam and the way it manages the persistence context. From what I understand your problem is that the context is scoped too large. I always use a request scoped session and that makes the undo pretty easy: just throw an exception and the transaction gets rolled back immediately.

    Quite straightforward actually or maybe I'm not understanding you correctly. Could you post an example?

  13. JPA Implementation Patterns | Upthrust

    April 12, 2010 at 6:09 am

    [...] Saving (detached) Entities [...]

  14. Rodrigo Villalba

    June 28, 2010 at 4:29 pm

    My problem is that I have an object A which contains a list of B Objects

        @Entity
        class A {

           @OneToMany(cascade={CascadeType.MERGE})
           List<B> list;

        }
    

    When I make a "merge" of an object A and then call "flush" inside a stateless EJB method

        em.merge(a); //a is of class A
        em.flush(); //doesn't flush "list"
    

    it actually doesn't work. The ids of the B objects in "list" are not set.

    But persisting and then flushing works

        em.persist(a);
        em.flush(); // it works!
    

    The ids of the B objects in "list" are set.

    I'm using EclipseLink. Why is this happening?

  15. Gnanasekaran

    July 14, 2010 at 7:07 am

    This is an excellent article and it is very useful for me to understand the difference between merge and saveOrUpdate.
    Before this I read so many articles regarding this topic but none of them were clear.

  16. Lars Bohl

    August 23, 2010 at 9:31 am

    I would like to exempt a managed object from the flush(), but keep it managed after the flush, like this (pseudo-code):

    Object managedObject;
    javax.persistence.EntityManager entityManager;
    if (entityManager.contains(managedObject))
    {
      entityManager.disattach(managedObject);
      entityManager.flush();
      entityManager.reattach(managedObject);
    }
    

    Of course, the methods entityManager.disattach(Object o) and entityManager.reattach(Object o) are fantasy, so this is pseudo-code. Is there a way to do this with JPA 1, and what will it be like with JPA 2?

  17. Andi

    October 13, 2010 at 1:14 pm

    Hi Lars,

    I can not see any reason for your pseudo code. Even if it would exist -- since u reattach the object again -- on the commit, it gets 'flushed' to the DB. To me it looks like the search for a workaround.

    cheers

  18. Uday Kari

    March 17, 2012 at 12:51 pm

    Bit tricky but merge CAN be made to behave exactly like saveOrUpdate using the generated id key. For my object I used the property "id" as generated key. If this id is zero or not already in the database, the entity manager merge call will create a new record (persist the object, in JPA lingo). However, if that id is non-zero and already exists in the database, merge will update the object.

  19. sudhi

    March 21, 2012 at 4:13 pm

    We are having a problem with JPA merge. We have a customer table and an address table with a one-to-many relationship, so we created an intermediate table 'customeraddress'. When we run the customer update test case, we have the customerid and addressid, but the id for the customeraddress object is not available, so it inserts a new row for customeraddress, thereby duplicating the customeraddress row. We try to use merge on the customer object with cascade type 'ALL'. How can we solve this problem?

  20. [...] JPA implementation patterns: Saving (detached) entities [...]

  21. Incubadora | Pearltrees

    April 18, 2012 at 3:06 pm

    [...] JPA implementation patterns: Saving (detached) entities | Xebia Blog SessionManager.saveOrUpdate Object passed was never persisted EntityManager.merge 1. Object added to persistence context as new entity [...]

  22. [...] which well executed the transaction. I enquired for that merge() method, and I learnt here and there the main point: it copies the state of the passed object. [...]

  23. Rey Gov

    September 18, 2012 at 9:37 am

    These discussions make me think twice about using Hibernate the next time... Everything is so boringly complicated!
    After you guys have fixed all your merge problems...have a look at the overload of SQL statements Hibernate produces!

  24. josvazg

    March 4, 2013 at 11:59 am

    I have to agree with Rey Gov: why does Hibernate have to make something simple into something complex and then JPA make it even WORSE!

    ORM should not be so complex nor SO automatic. They try to hide too many details from programmers. It seems like they want to make them forget the important facts when dealing with persistent storage:

    Fact #1, most storage systems are network exported or shared services, whatever the ORM tool/frameworks says, the storage state is the ONE and ONLY that rules. So it does not pay too much to manage or cache "the objects", the only moment that matters are when you READ the storage and when you WRITE it, cause many things can happen in between to the stored data you are dealing with. You cannot always assume you are the only one accessing or changing it.

    Fact #2, SQL and even NoSQL storages have the notion of INSERTING new data and UPDATING existing data. Even people that do not know much about databases can understand those simple concepts, so why did THEY have to invent the confusing and complex "persist" and "merge"?
    I really don't understand it.

    Object Oriented people have the "notion" of SAVE. You have a struct or an object state that you want to just "save" so you can refer to it later by the id or some other combination of attribute values. This maps really easily with Relational Databases (and I guess also non relational ones):

    "SAVE is UPDATE it NOW if it (the identifier) already exist OR INSERT otherwise"

    Why did THEY have to make this simple thing so complicated?

  25. Alex

    December 27, 2013 at 3:07 pm

    Another problem not considered is when the new object A is associated with a detached object B. If you try to persist A, even when A has CascadeType.MERGE to B, it will give us the detached error because B is not in the manager. But if you merge A, it will first reattach (merge) B, and then persist (insert) A.
    It just happens that in most web applications the many-to-one turns out to become a select box + converter, and it is a pain to explicitly merge each many-to-one member by hand.
