Chapter 9. Manipulating entities with EntityManager – EJB 3 in Action

Chapter 9. Manipulating entities with EntityManager

This chapter covers

  • Persistence context and its scope
  • Container- and application-managed entity managers
  • Entity lifecycle and CRUD operations
  • Detachment and merge operations
  • Entity lifecycle listeners

In chapter 7 you learned how to develop the application domain model using JPA. In chapter 8 you saw how domain objects and relationships are mapped to the database. While the ORM annotations we explored in chapter 8 indicate how an entity is persisted, the annotations don’t do the actual persisting. This is performed by applications using the EntityManager interface, the topic of this chapter.

To use an analogy, the domain model annotated with ORM configuration is like a children’s toy that needs assembly. The domain model consists of the parts of the toy. The ORM configuration is the assembly instructions. While the assembly instructions tell you how the toy parts are put together, you do the actual assembly. The EntityManager is the toy assembler of the persistence world. It figures out how to persist the domain by looking at the ORM configuration. More specifically, the EntityManager performs CRUD (Create, Read, Update, Delete) operations on domain objects. The Read part includes extremely robust search and retrieval of domain objects from the database. This chapter covers each of the CRUD operations that the EntityManager provides, with the exception of the search part of search and retrieval. In addition to simple primary key–based domain object retrieval, JPA provides SQL SELECT–like search capabilities through the EJB 3 query API. The query API is so extensive and powerful that we’ll dedicate chapter 10 to it while briefly touching on it in this one.

Before we dive down into the persistence operations, you’ll learn about the EntityManager interface, the lifecycle of entities, the concept of persistence context, and how to obtain an instance of EntityManager. We’ll discuss entity lifecycle callback listeners before concluding with best practices.

Before we get into too much code, we’re going to gently introduce the EntityManager and briefly cover some concepts useful in understanding the nuances behind this critical part of JPA.

9.1. Introducing the EntityManager

The EntityManager API is probably the most important and interesting part of the Java Persistence API. It manages the lifecycle of entities. In this section you’ll learn about the EntityManager interface and its methods. We’ll explore the entity lifecycle, and you’ll also learn about persistence contexts and their types.

9.1.1. The EntityManager interface

In a sense, the EntityManager is the bridge between the OO and relational worlds, as depicted in figure 9.1. When you request that a domain entity be created, the EntityManager translates the entity into a new database record. When you request that an entity be updated, it tracks down the relational data that corresponds to the entity and updates it. Likewise, the EntityManager removes the relational data when you request that an entity be deleted. From the other side of the translation bridge, when you request that an entity be “found” in the database, the EntityManager creates the Entity object, populates it with relational data, and “returns” the Entity back to the OO world.

Figure 9.1. The EntityManager acts as a bridge between the OO and relational worlds. It interprets the O/R mapping specified for an entity and saves the entity in the database.

Besides providing these explicit SQL-like CRUD operations, the EntityManager also quietly tries to keep entities synched with the database automatically as long as they are within the EntityManager’s reach (this behind-the-scenes synchronization is what we mean when we talk about “managed” entities in the next section). The EntityManager is easily the most important interface in JPA and is responsible for most of the ORM black magic in the API.

Despite all this under-the-hood power, the EntityManager is a small, simple, and intuitive interface, especially compared to the mapping steps we discussed in the previous chapter and the query API (which we explore in chapter 10). In fact, once we go over some basic concepts in the next few sections, the interface might seem almost trivial. You might already agree if you take a quick look at table 9.1. The table lists some of the most commonly used methods defined in the EntityManager interface.

Table 9.1. The EntityManager is used to perform CRUD operations. Here are the most commonly used methods of the EntityManager interface.

Method Signature

Description

public void persist(Object entity);

Saves (persists) an entity into the database, and also makes the entity managed.

public <T> T merge(T entity);

Merges an entity to the EntityManager’s persistence context and returns the merged entity.

public void remove(Object entity);

Removes an entity from the database.

public <T> T find(Class<T> entityClass, Object primaryKey);

Finds an entity instance by its primary key.

public void flush();

Synchronizes the state of entities in the EntityManager’s persistence context with the database.

public void setFlushMode(FlushModeType flushMode);

Changes the flush mode of the EntityManager’s persistence context. The flush mode may either be AUTO or COMMIT. The default flush mode is AUTO, meaning that the EntityManager tries to automatically synch the entities with the database.

public FlushModeType getFlushMode();

Retrieves the current flush mode.

public void refresh(Object entity);

Refreshes (resets) the entity from the database.

public Query createQuery(String jpqlString);

Creates a dynamic query using a JPQL statement.

public Query createNamedQuery(String name);

Creates a query instance based on a named query on the entity instance.

public Query createNativeQuery(String sqlString);

public Query createNativeQuery(String sqlString, Class resultClass);

public Query createNativeQuery(String sqlString, String resultSetMapping);

Creates a dynamic query using a native SQL statement.

public void close();

Closes an application-managed EntityManager.

public boolean isOpen();

Checks whether an EntityManager is open.

public EntityTransaction getTransaction();

Retrieves a transaction object that can be used to manually start or end a transaction.

public void joinTransaction();

Asks an EntityManager to join an existing JTA transaction.

Don’t worry too much if the methods are not immediately obvious. Except for the methods related to the query API (createQuery, createNamedQuery, and createNativeQuery), we’ll discuss them in detail in the coming sections. The few EntityManager interface methods that we didn’t just cover are rarely ever used, so we won’t spend time discussing them. Once you’ve read and understood the material in this chapter, though, we encourage you to explore them on your own. The EJB 3 Java Persistence API final specification is available at http://jcp.org/en/jsr/detail?id=220.


The JPA entity: a set of trade-offs

Nothing in life is free...

As we mentioned (and will explore further in this and future chapters), the reworked EJB 3 JPA entity model brings a whole host of features to the table: simplicity, OO support, and unit testability, to name a few. However, the JPA entity loses some features that were available in the EJB 2 model because of its separation from the container. Because the EntityManager and not the container manages entities, they cannot directly use container services such as dependency injection, method-level transaction and declarative security settings, remotability, and so on.

However, the truth of the matter is that most layered applications designed using the EJB 2 CMP entity bean model never utilized container services directly anyway. This is because entity beans were almost always used through a session bean façade. This means that entity beans “piggybacked” over container services configured at the session bean level. The same is true for JPA entities. In real terms, the JPA model loses very little functionality.

Incidentally, losing the magic word “bean” means that JPA entities are no longer EJB components managed by the container. So if you catch someone calling JPA entities “beans,” feel free to gently correct them!


Even though JPA is not container-centric like session beans or MDBs, entities still have a lifecycle. This is because they are “managed” by JPA in the sense that the persistence provider keeps track of them under the hood and even automatically synchronizes entity state with the database when possible. We’ll explore exactly how the entity lifecycle looks in the following section.

9.1.2. The lifecycle of an entity

An entity has a pretty simple lifecycle. Making sense of the lifecycle is easy once you grasp a straightforward concept: the EntityManager knows nothing about a POJO regardless of how it is annotated, until you tell the manager to start treating the POJO like a JPA entity. This is the exact opposite of POJOs annotated to be session beans or MDBs, which are loaded and managed by the container as soon as the application starts. Moreover, the default behavior of the EntityManager is to manage an entity for as short a time as possible. Again, this is the opposite of container-managed beans, which remain managed until the application is shut down.

An entity that the EntityManager is keeping track of is considered attached or managed. On the other hand, when an EntityManager stops managing an entity, the entity is said to be detached. An entity that was never managed at any point is called transient or new.

Figure 9.2 summarizes the entity lifecycle.

Figure 9.2. An entity becomes managed when you persist, merge, refresh, or retrieve it. A managed entity becomes detached when it is out of scope, removed, serialized, or cloned.

Let’s take a close look at the managed and detached states.

Managed entities

When we talk about managing an entity’s state, what we mean is that the EntityManager makes sure that the entity’s data is synchronized with the database. The EntityManager ensures this by doing two things. First, as soon as we ask an EntityManager to start managing an entity, it synchronizes the entity’s state with the database. Second, until the entity is no longer managed, the EntityManager ensures that changes to the entity’s data (caused by entity method invocations, for example) are reflected in the database. The EntityManager accomplishes this feat by holding an object reference to the managed entity and periodically checking for data freshness. If the EntityManager finds that any of the entity’s data has changed, it automatically synchronizes the changes with the database. The EntityManager stops managing the entity when the entity is either deleted or moves out of persistence provider’s reach.
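The reference-holding and change-checking behavior just described can be pictured with a small toy model. This is only a sketch in plain Java, not the JPA API: ToyPersistenceContext, its database map, and the single-field Bid class are all hypothetical names invented for illustration, and the model follows the simplified description above (a real persistence provider decides for itself when to issue SQL).

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Toy entity with a primary key and one piece of state.
class Bid {
    final long id;
    double amount;
    Bid(long id, double amount) { this.id = id; this.amount = amount; }
}

// Toy persistence context: an illustration of the idea, NOT the JPA API.
class ToyPersistenceContext {
    // Stand-in for the database: primary key -> stored amount.
    final Map<Long, Double> database = new HashMap<>();
    // Managed entities are tracked by object reference, each with a
    // snapshot of the amount that was last written to the "database".
    private final Map<Bid, Double> managed = new IdentityHashMap<>();

    // Start managing the entity and synchronize its state right away.
    void persist(Bid bid) {
        database.put(bid.id, bid.amount);
        managed.put(bid, bid.amount);
    }

    // Compare every managed entity with its snapshot and write back
    // whatever changed. Returns the number of rows updated.
    int flush() {
        int updates = 0;
        for (Map.Entry<Bid, Double> entry : managed.entrySet()) {
            Bid bid = entry.getKey();
            if (bid.amount != entry.getValue()) {
                database.put(bid.id, bid.amount);
                entry.setValue(bid.amount);
                updates++;
            }
        }
        return updates;
    }

    // Stop tracking everything; later changes are no longer noticed.
    void clear() { managed.clear(); }
}

class DirtyCheckDemo {
    public static void main(String[] args) {
        ToyPersistenceContext context = new ToyPersistenceContext();
        Bid bid = new Bid(1L, 100.0);
        context.persist(bid);

        bid.amount = 120.0;          // ordinary change to a managed entity
        context.flush();
        System.out.println(context.database.get(1L)); // 120.0

        context.clear();             // the bid is no longer tracked
        bid.amount = 150.0;
        context.flush();
        System.out.println(context.database.get(1L)); // still 120.0
    }
}
```

Notice that the application code never tells the context which field changed. Holding the reference and comparing against a snapshot is enough, which is why changes stop being picked up once the reference is dropped.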

An entity can become attached to the EntityManager’s context when you pass the entity to the persist, merge, or refresh method. An entity also becomes attached when you retrieve it using the find method or a query within a transaction. The state of the entity determines which method you will use.

When an entity is first instantiated as in the following snippet, it is in the new or transient state since the EntityManager doesn’t know it exists yet:

Bid bid = new Bid();

Hence the entity instance is not managed yet. It will become managed if the EntityManager’s persist method creates a new record in the database corresponding to the entity. This would be the most natural way to attach the Bid entity in the previous snippet to the EntityManager’s context:

manager.persist(bid);

A managed entity becomes detached when it is out of scope, removed, serialized, or cloned. For example, the instance of the Bid entity will become detached when the underlying transaction commits.

Unlike entities explicitly created using the new operator, an entity retrieved from the database using the EntityManager’s find method or a query is attached if retrieved within a transactional context. A retrieved instance of the entity becomes detached immediately if there is no associated transaction.

The merge and refresh methods are intended for entities that have been retrieved from the database and are in the detached state. Either of these methods attaches entities to the entity manager. EntityManager.merge updates the database with the data held in the entity, and refresh does the opposite—it resets the entity’s state with data from the database. We’ll discuss these methods in much greater detail in section 9.3.

Detached entities

A detached entity is an entity that is no longer managed by the EntityManager and there is no guarantee that the state of the entity is in synch with the database. Detachment and merge operations become handy when you want to pass an entity across application tiers. For example, you can detach an entity and pass it to the web tier, then update it and send it back to the EJB tier, where you can merge the detached entity to the persistence context.

The usual way entities become detached is a little subtler. Essentially, an attached entity becomes detached as soon as it goes out of the EntityManager context’s scope. Think of this as the expiration of the invisible link between an entity and the EntityManager at the end of a logical unit of work or a session. An EntityManager session could be limited to a single method call or span an arbitrary length of time. (Reminds you of session beans, doesn’t it? As you’ll soon see, this is not entirely an accident.) For an EntityManager whose session is limited to a method call, all entities attached to it become detached as soon as a method returns, even if the entity objects are used outside the method. If this is not absolutely crystal clear right now, it will be once we talk about the EntityManager persistence context in the next section.

Entity instances also become detached through cloning or serialization. This is because the EntityManager quite literally keeps track of entities through Java object references. Since cloned or serialized instances don’t have the same object references as the original managed entity, the EntityManager has no way of knowing they exist. This scenario occurs most often in situations where entities are sent across the network for session bean remote method calls.
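The effect of serialization on object identity is easy to see with plain Java, without any JPA involved. In this sketch, ReferenceTracker is a hypothetical stand-in for the set of references a persistence context holds, and Bid is a bare serializable class rather than a mapped entity. A copy produced by a serialization round trip carries the same data but is a different object, so the tracker does not recognize it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Plain serializable class standing in for an entity.
class Bid implements Serializable {
    private static final long serialVersionUID = 1L;
    double amount;
    Bid(double amount) { this.amount = amount; }
}

// Hypothetical tracker that knows objects only by reference (identity),
// never by their contents.
class ReferenceTracker {
    private final Set<Object> managed =
            Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
    void manage(Object entity) { managed.add(entity); }
    boolean isManaged(Object entity) { return managed.contains(entity); }
}

class SerializationSketch {
    // Round-trips a Bid through Java serialization, much as a remote
    // method call would when passing it across the network.
    static Bid copyViaSerialization(Bid original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Bid) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ReferenceTracker tracker = new ReferenceTracker();
        Bid bid = new Bid(100.0);
        tracker.manage(bid);

        Bid copy = copyViaSerialization(bid);
        System.out.println(tracker.isManaged(bid));  // true
        System.out.println(tracker.isManaged(copy)); // false
        System.out.println(copy.amount);             // 100.0
    }
}
```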

In addition, if you call the clear method of EntityManager, it forces all entities in the persistence context to be detached. Calling the EntityManager’s remove method will also detach an entity. This makes perfect sense since this method removes the data associated with the entity from the database. As far as the EntityManager is concerned, the entity no longer exists, so there is no need to continue managing it. For our Bid entity, this would be an “apt demise”:

manager.remove(bid);

We’ll return to this discussion on detachment and merge operations in section 9.3.3.

A good way to remember the entity lifecycle is through a convenient analogy. Think of an entity as an aircraft and the EntityManager as the air traffic controller. While the aircraft is outside the range of the airport (detached or new), it is not guided by the air traffic controller. However, when it does come into range (managed), the traffic controller manages the aircraft’s movement (state synchronized with database). Eventually, a grounded aircraft is guided into takeoff and goes out of airport range again (detached), at which point the pilot is free to follow her own flight plan (modifying a detached entity without state being managed).

The persistence context scope is the equivalent of airport radar range. It is critical to understand how the persistence context works to use managed entities effectively. We’ll examine the relationship between the persistence context, its scope, and the EntityManager in the next section.

9.1.3. Persistence contexts, scope, and the EntityManager

The persistence context plays a vital role in the internal functionality of the EntityManager. Although we perform persistence operations by invoking methods on the EntityManager, the EntityManager itself does not directly keep track of the lifecycle of an individual entity. In reality, the EntityManager delegates the task of managing entity state to the currently available persistence context.

In a very simple sense, a persistence context is a self-contained collection of entities managed by an EntityManager during a given persistence scope. The persistence scope is the duration of time a given set of entities remains managed.

The best way to understand this is to start by examining what the various persistence scopes are and what they do, and then to work back to the meaning of the term. Along the way, we’ll explain how the persistence context and persistence scope relate to the EntityManager.

There are two different types of persistence scopes: transaction and extended.

Transaction-scoped EntityManager

An EntityManager associated with a transaction-scoped persistence context is known as a transaction-scoped EntityManager. If a persistence context is under transaction scope, entities attached during a transaction are automatically detached when the transaction ends. (All persistence operations that may result in data changes must be performed inside a transaction, no matter what the persistence scope is.) In other words, the persistence context keeps managing entities while the transaction it is enclosed by is active. Once the persistence context detects that a transaction has either been rolled back or committed, it will detach all managed entities after making sure that all data changes until that point are synchronized with the database. Figure 9.3 depicts this relationship between entities, the transaction persistence scope, and persistence contexts.

Figure 9.3. Transaction-scoped persistence contexts only keep entities attached within the boundaries of the enclosing transaction.

Extended EntityManager

The life span of the extended EntityManager lasts across multiple transactions. An extended EntityManager can only be used with stateful session beans and lasts as long as the bean instance is alive. Therefore, in persistence contexts with extended scope, how long entities remain managed has nothing to do with transaction boundaries. In fact, once attached, entities pretty much stay managed as long as the EntityManager instance is around. As an example, for a stateful session bean, an EntityManager with extended scope will keep managing all attached entities until the EntityManager is closed as the bean itself is destroyed. As figure 9.4 shows, this means that unless explicitly detached through a remove method to end the life of the stateful bean instance, entities attached to an extended persistence context will remain managed across multiple transactions.

Figure 9.4. For an extended persistence context, once an entity is attached in any given transaction, it is managed for all transactions in the lifetime of the persistence context.

The term scope is used for persistence contexts in the same manner that it is used for Java variable scoping. It describes how long a particular persistence context remains active. Transaction-scoped persistence contexts can be compared to method local variables, in the sense that they are only in effect within the boundaries of a transaction. On the other hand, persistence contexts with extended scope are more like instance variables that are active for the lifetime of an object—they hang around as long as the EntityManager is around.
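To make the difference between the two scopes concrete, here is a toy model in plain Java. It is a sketch under loud assumptions: ToyContext, its Scope enum, and the transactionEnded and close methods are hypothetical names, and the class only mimics when entities stop being managed, not how a persistence provider actually implements it.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Toy model only: it mimics WHEN entities are detached, nothing more.
class ToyContext {
    enum Scope { TRANSACTION, EXTENDED }

    private final Scope scope;
    private final Set<Object> managed =
            Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());

    ToyContext(Scope scope) { this.scope = scope; }

    void attach(Object entity) { managed.add(entity); }

    boolean isManaged(Object entity) { return managed.contains(entity); }

    // Called when the enclosing transaction commits or rolls back.
    void transactionEnded() {
        if (scope == Scope.TRANSACTION) {
            managed.clear(); // like a local variable going out of scope
        }
        // EXTENDED: entities stay managed, like instance variables
    }

    // Called when the owner (for example, a stateful bean) is destroyed.
    void close() { managed.clear(); }
}

class ScopeDemo {
    public static void main(String[] args) {
        Object item = new Object();

        ToyContext transactional = new ToyContext(ToyContext.Scope.TRANSACTION);
        transactional.attach(item);
        transactional.transactionEnded();
        System.out.println(transactional.isManaged(item)); // false

        ToyContext extended = new ToyContext(ToyContext.Scope.EXTENDED);
        extended.attach(item);
        extended.transactionEnded();
        System.out.println(extended.isManaged(item));      // true
        extended.close();
        System.out.println(extended.isManaged(item));      // false
    }
}
```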

At this point, we’ve covered the basic concepts needed to understand the functionality of the EntityManager. We are now ready to see the EntityManager itself in action.

9.1.4. Using the EntityManager in ActionBazaar

We’ll explore the EJB 3 EntityManager interface by implementing an ActionBazaar component. We’ll implement the ItemManagerBean stateless session bean used to provide the operations to manipulate items. As listing 9.1 demonstrates, the session bean provides methods for adding, updating, and removing Item entities using the JPA EntityManager.

Listing 9.1. ItemManagerBean using the container-managed, transaction-scoped entity manager

ItemManagerBean is a pretty good representation of the most common ways the EntityManager API is used. First, an instance of the EntityManager is injected using the @PersistenceContext annotation and is used in all the session bean methods that manipulate entities. As you might imagine, the addItem method is used by the presentation layer to add an item posted by the seller to the database. The persist method is used by addItem to add a new Item entity to the database. The addItem method also uses the EntityManager’s find method to retrieve the Seller of the Item using the entity’s primary key. The retrieved Seller entity is set as an association field of the newly instantiated Item entity along with all other item data.

The updateItem method updates the Item entity data in the database using the merge method. This method could be invoked from an administrative interface that allows a seller to update a listing after an item is posted. The EntityManager’s refresh method is used in the undoItemChanges method to discard any changes made to an Item entity and to reload it with the data stored in the database. The undoItemChanges method could be used by the same administrative interface that uses the updateItem method to allow the user to start over with modifying a listing (think of an HTTP form’s “reset” button).

Lastly, an Item entity is removed from the database using the remove method. This method could be used by an ActionBazaar administrator to remove an offending listing.

Now that we’ve “surface-scanned” the code in listing 9.1, we’re ready to start our in-depth analysis of the EntityManager API. We’ll start from the most logical point: making an EntityManager instance available to the application.

9.2. Creating EntityManager instances

EntityManager is like the conductor of an orchestra who manages the show. Say you’d like to bring the orchestra to your town for a performance. You first get in touch with and attempt to hire the conductor. Similarly, the first obvious step to performing any persistence operation is obtaining an instance of an EntityManager.

In listing 9.1, we do this by injecting an instance using the @PersistenceContext annotation. If you are using a container, this is more or less all you will need to know about getting an instance of an EntityManager.

All EntityManager instances injected using the @PersistenceContext annotation are container managed. This means that the container takes care of the mundane task of looking up, opening, and closing the EntityManager behind the scenes. In addition, unless otherwise specified, injected EntityManagers have transaction scope. Just as you aren’t limited to using the transaction scope, you are not limited to using a container-managed EntityManager either.

JPA fully supports creating application-managed EntityManagers that you explicitly create, use, and release, including controlling how the EntityManager handles transactions. This capability is particularly important for using JPA outside the container.

In this section we’ll explore how to create and use both container- and application-managed EntityManagers.

9.2.1. Container-managed EntityManagers

As you saw earlier, container-managed EntityManagers are injected using the @PersistenceContext annotation. Let’s take a look at the definition of the annotation to start exploring its features:

@Target({TYPE, METHOD, FIELD})
@Retention(RUNTIME)
public @interface PersistenceContext {
    String name() default "";
    String unitName() default "";
    PersistenceContextType type() default TRANSACTION;
    PersistenceProperty[] properties() default {};
}

The first element of the annotation, name, specifies the JNDI name of the persistence context. This element is used in the unlikely case you have to explicitly mention the JNDI name for a given container implementation to be able to look up an EntityManager. In most situations, leaving this element empty is fine, except when you use @PersistenceContext at the class level to establish a reference to the persistence context.

The unitName element specifies the name of the persistence unit. A persistence unit is essentially a grouping of entities used in an application. This idea is useful when you have a large Java EE application and would like to separate it into several logical areas (think Java packages). For example, ActionBazaar entities could be grouped into general and admin units.

Persistence units cannot be set up using code, and you must configure them through the persistence.xml deployment descriptor. We’ll return to the topic of configuring persistence units in chapter 11; for now, all you need to understand is that you can get an EntityManager for the admin unit using the unitName element as follows:

@PersistenceContext(unitName="admin")
EntityManager entityManager;

In the typical case that a Java EE module has a single persistence unit, specifying the unitName might seem redundant. In fact, most persistence providers will resolve the unit correctly if you don’t specify a unitName. However, we recommend specifying a persistence unit name even if you only have one unit. This ensures that you are not dependent on container-specific functionality since the specification doesn’t state what the persistence provider must do if the unitName is not specified.

EntityManager scoping

The type element specifies the EntityManager scope. As we noted, for a container-managed EntityManager, scope can either be TRANSACTION or EXTENDED. If the type element is left empty, the scope is assumed to be TRANSACTION. Not surprisingly, the typical use of the type element is to specify EXTENDED for an EntityManager. The code would look like this:

@PersistenceContext(type=PersistenceContextType.EXTENDED)
EntityManager entityManager;

You are not allowed to use extended persistence scope for stateless session beans or MDBs. If you stop and think for a second, the reason should be pretty obvious. The real reason for using extended scope in a bean would be to manage entity state across multiple method calls, even if each method call is a separate transaction. Since neither stateless session beans nor message-driven beans are supposed to implement such functionality, it makes no sense to support extended scope for these bean types. On the other hand, extended persistence scope is ideal for stateful session beans. An underlying EntityManager with extended scope could be used to cache and maintain the application domain across an arbitrary number of method calls from a client. More importantly, you could do this and still not have to give up method-level transaction granularity (most likely using CMT). We’ll return to our discussion of how you can use an extended persistence context for stateful session beans as an effective caching mechanism in chapter 13.


Note

The real power of container-managed EntityManagers lies in the high degree of abstraction they offer. Behind the scenes, the container instantiates EntityManagers, binds them to JNDI, injects them into beans on demand, and closes them when they are no longer needed (typically when a bean is destroyed).


It is difficult to appreciate the amount of menial code the container takes care of until you see the alternative. Keep this in mind when looking at application-managed EntityManagers in the coming section. Note that container-managed EntityManagers are available to all components in a Java EE container, including JSF-backing beans and servlets. However, be aware of the fact that EntityManagers are not thread-safe; thus, injecting them frivolously into presentation-layer components can readily land you in some trouble. We’ll discuss this nuance in the sidebar “EntityManagers and thread safety.”


EntityManagers and thread safety

It is very easy to forget the fact that web-tier components are meant to be used by multiple concurrent threads. Servlet-based components like JSPs are deliberately designed this way because they are intended to achieve high throughput through statelessness.

However, this fact means that you cannot use resources that are not thread-safe as servlet instance variables. The EntityManager falls under this category, so injecting it into a web component is a big no-no. One way around this problem is to use the SingleThreadModel interface to make a servlet thread-safe. In practice, this technique severely degrades application performance and is almost always a bad idea.

Some vendors might try to solve this problem. If you are extremely comfortable with your container vendor, you could count on this. Remember, though, one very important benefit of using EJB 3 is portability across container implementations. You shouldn’t give up this advantage frivolously.

If you must use container-managed EntityManagers from your servlet, the best option is to look up an instance of the EntityManager inside your method as follows:

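@PersistenceContext(name="pu/actionBazaar",
                    unitName="ActionBazaar")
public class ItemServlet extends HttpServlet {
    @Resource private UserTransaction ut;

    public void service(HttpServletRequest req,
                        HttpServletResponse resp)
            throws ServletException, IOException {
        try {
            // Look up the EntityManager per request instead of
            // keeping it in a shared instance variable
            Context ctx = new InitialContext();
            EntityManager em = (EntityManager)
                ctx.lookup("java:comp/env/pu/actionBazaar");
            ...
            ut.begin();
            em.persist(item);
            ut.commit();
            ...
        } catch (Exception e) {
            // The JNDI lookup and the JTA calls throw checked exceptions
            throw new ServletException(e);
        }
    }
    ...
}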

The other alternative is to use an application-managed EntityManager with a JTA transaction. It is worth noting that EntityManagerFactory is thread-safe.
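The pattern behind that alternative (share the thread-safe factory, but give each request its own manager) can be sketched in plain Java. Everything below is a hypothetical stand-in rather than JPA code: ToyManagerFactory plays the role of a thread-safe EntityManagerFactory, ToyManager plays the role of a non-thread-safe EntityManager, and ToyRequestHandler plays the role of a servlet. Transactions are left out entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a resource that is NOT thread-safe.
class ToyManager {
    private final List<Object> pending = new ArrayList<>(); // unsynchronized on purpose
    private boolean open = true;

    void persist(Object entity) {
        if (!open) throw new IllegalStateException("manager is closed");
        pending.add(entity);
    }
    int pendingCount() { return pending.size(); }
    void close() { open = false; }
}

// Hypothetical stand-in for a thread-safe factory.
class ToyManagerFactory {
    private final AtomicInteger created = new AtomicInteger();
    ToyManager createManager() {
        created.incrementAndGet();
        return new ToyManager();
    }
    int createdCount() { return created.get(); }
}

// Plays the role of a servlet: one instance, many concurrent callers.
class ToyRequestHandler {
    private final ToyManagerFactory factory; // shared field is fine: it is thread-safe

    ToyRequestHandler(ToyManagerFactory factory) { this.factory = factory; }

    // Each call keeps its manager in a local variable, so no two
    // threads ever touch the same manager.
    int handle(Object entity) {
        ToyManager manager = factory.createManager();
        try {
            manager.persist(entity);
            return manager.pendingCount();
        } finally {
            manager.close(); // always release what we created
        }
    }
}

class ThreadConfinementDemo {
    public static void main(String[] args) throws InterruptedException {
        ToyManagerFactory factory = new ToyManagerFactory();
        ToyRequestHandler handler = new ToyRequestHandler(factory);
        AtomicInteger wrongCounts = new AtomicInteger();

        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            Thread thread = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    if (handler.handle("item") != 1) wrongCounts.incrementAndGet();
                }
            });
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) thread.join();

        System.out.println(factory.createdCount()); // 8000
        System.out.println(wrongCounts.get());      // 0
    }
}
```

The design choice worth noticing is where each object lives: the factory is an instance field because sharing it is safe, while the manager never escapes the method that created it.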


Now that you are familiar with working with EntityManager inside the container, let’s move on to the application-managed EntityManager.

9.2.2. Application-managed EntityManager

An application-managed EntityManager wants nothing to do with a Java EE container. This means that you must write code to control every aspect of the EntityManager’s lifecycle. By and large, application-managed EntityManagers are most appropriate for environments where a container is not available, such as Java SE or a lightweight web container like Tomcat.

However, a justification to use application-managed EntityManagers inside a Java EE container is to maintain fine-grained control over the EntityManager lifecycle or transaction management. For this reason, as well as to maintain flexibility, the EJB 3 API provides a few conveniences for using application-managed EntityManagers inside a container. This happens to suit us well too, because it provides an effective approach to exploring application-managed EntityManagers (see listing 9.2) by reusing the code in listing 9.1.

Listing 9.2. ItemManagerBean using an application-managed EntityManager

It should be fairly obvious that we are more or less explicitly doing what the container did for us behind the scenes in listing 9.1. First we inject an instance of EntityManagerFactory. We create an EntityManager using the injected entity manager factory after the bean is constructed and close it before the bean is destroyed, mirroring what the container does automatically. The same is true when we explicitly join a container-managed JTA transaction before performing an EntityManager operation.

EntityManagerFactory

As you can see in listing 9.2, we get an instance of an application-managed EntityManager using the EntityManagerFactory interface. If you have used JDBC, this is essentially the same idea as creating a Connection from a DriverManager factory. In a Java EE environment, you have the luxury of using the @PersistenceUnit annotation to inject an instance of an EntityManagerFactory, just as we do in listing 9.2. This useful annotation is defined as follows:

@Target({TYPE, METHOD, FIELD})
@Retention(RUNTIME)
public @interface PersistenceUnit {
    String name() default "";
    String unitName() default "";
}

The name and unitName elements serve exactly the same purpose as they do for the @PersistenceContext annotation. While the name element can be used to point to the JNDI name of the EntityManagerFactory, the unitName element is used to specify the name of the underlying persistence unit.

Figure 9.5 shows the relationships between important interfaces made available by JPA outside the container.

Figure 9.5. These relationships exist between various important classes in the javax.persistence package for using JPA outside the Java EE container.

The EntityManagerFactory’s createEntityManager method creates an application-managed EntityManager. This is probably the most commonly used method in the interface, in addition to the close method. We don’t explicitly close the factory in listing 9.2 since the container takes care of cleaning up all resources it injects (unlike the EntityManager, which is created programmatically and is explicitly closed). Table 9.2 lists all the methods in the EntityManagerFactory interface. As you can see, most of them are fairly self-explanatory.

Table 9.2. The EntityManagerFactory is used to create an instance of an application-managed EntityManager.

| Method | Purpose |
| --- | --- |
| EntityManager createEntityManager() | Creates an application-managed EntityManager. |
| EntityManager createEntityManager(Map map) | Creates an application-managed EntityManager with a specified Map. The Map contains vendor-specific properties to create the manager. |
| void close() | Closes the EntityManagerFactory. |
| boolean isOpen() | Checks whether the EntityManagerFactory is open. |

We have seen the usage of most of the EntityManagerFactory methods in listing 9.2. Perhaps the most interesting aspect of listing 9.2 is the entityManager.joinTransaction() call in the updateItem method. Let’s discuss this method in a little more detail.

As we hinted in the beginning of this section, unlike container-managed EntityManagers, application-managed EntityManagers do not automatically participate in an enclosing transaction. Instead, they must be asked to join an enclosing JTA transaction by calling the joinTransaction method. This method is specifically geared to using application-managed EntityManagers inside a container, where JTA transactions are usually available.

Application-managed EntityManagers outside the Java EE container

In Java SE environments, however, JTA is not a possibility. Resource-local transactions must be used in place of JTA for such environments. The EntityTransaction interface is designed with exactly this scenario in mind. We’ll explore this interface by reimplementing the code to update an item from listing 9.2 for an SE application. Listing 9.3 also serves the dual purpose of being a good template for using application-managed EntityManagers without any help from the container.

Listing 9.3. Using an application-managed EntityManager outside a container

The first thing that should strike you about listing 9.3 is the amount of boilerplate code involved to accomplish exactly the same thing as the updateItem method in listing 9.1 (cutting and pasting both code snippets into a visual editor and comparing the code side-by-side might be helpful in seeing the full picture at a glance).

The javax.persistence.Persistence class’s createEntityManagerFactory method used in listing 9.3 is essentially a programmatic substitute for the @PersistenceUnit annotation. The single parameter of the createEntityManagerFactory method is the name of the persistence unit defined in the persistence.xml deployment descriptor. Since the container is no longer helping us out, we must be sure to close the EntityManagerFactory returned by createEntityManagerFactory when we are done, in addition to closing the EntityManager. Just as in listing 9.2, the EntityManagerFactory’s createEntityManager method is used to create the application-managed EntityManager.

However, before merging the Item entity to the database, we now obtain an EntityTransaction by calling the getTransaction method of the EntityManager. The EntityTransaction is essentially a high-level abstraction over a resource-local JDBC transaction, as opposed to the distributed JTA transaction we joined in listing 9.2. At first it is natural to think that a joinTransaction call is still necessary to make the EntityManager aware of the enclosing transaction. However, because the EntityManager itself creates the transaction instead of the container, it implicitly keeps track of its EntityTransactions, so the join is not necessary. The rest of the transaction code (the begin and the commit) does exactly what you’d expect. As might be obvious, the EntityTransaction interface also has a rollback method to abort the transaction. Note that application-managed EntityManagers are never transaction scoped. That is, they keep managing attached entities until they are closed. Also, the transaction-type must be set to RESOURCE_LOCAL in the persistence.xml file to use the EntityTransaction interface (we’ll discuss this further in chapter 11 when we talk about EJB 3 packaging and deployment).
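The begin, commit, rollback, and close ordering described above can be sketched as a small reusable template. This is only an illustration under stated assumptions: Tx and Em are hypothetical stand-ins that mirror the handful of EntityTransaction and EntityManager methods the template calls, so the sketch compiles without a persistence provider on the classpath, and ResourceLocalTemplate and RecordingEm are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-ins mirroring the few methods of
// javax.persistence.EntityTransaction and EntityManager used below.
interface Tx {
    void begin();
    void commit();
    void rollback();
    boolean isActive();
}

interface Em {
    Tx getTransaction();
    void close();
}

final class ResourceLocalTemplate {
    private ResourceLocalTemplate() {}

    // Runs 'work' inside one resource-local transaction and always closes the manager.
    static <E extends Em> void inTransaction(E em, Consumer<E> work) {
        Tx tx = em.getTransaction();
        try {
            tx.begin();
            work.accept(em);       // with a real EntityManager: em.merge(item), for example
            tx.commit();
        } catch (RuntimeException e) {
            if (tx.isActive()) {
                tx.rollback();     // undo partial work before rethrowing
            }
            throw e;
        } finally {
            em.close();            // application-managed, so closing is our job
        }
    }
}

// In-memory fake that records the call order; for demonstration only.
final class RecordingEm implements Em, Tx {
    final List<String> calls = new ArrayList<>();
    private boolean active;

    public Tx getTransaction() { return this; }
    public void begin()        { active = true;  calls.add("begin"); }
    public void commit()       { active = false; calls.add("commit"); }
    public void rollback()     { active = false; calls.add("rollback"); }
    public boolean isActive()  { return active; }
    public void close()        { calls.add("close"); }
}
```

With a real provider, the same try/catch/finally shape would wrap calls on the EntityManager obtained from the factory. The isActive check before rollback avoids rolling back a transaction that was never started or has already completed.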

Although there is little doubt that the code in listing 9.3 is pretty verbose and error prone, being able to use application-managed EntityManagers outside the confines of the container accomplishes the vital goal of making standardized ORM accessible to all kinds of applications beyond server-side enterprise solutions, including Java Swing-based desktop applications, as well as enabling integration with web containers such as Tomcat or Jetty.

Using JPA in a web container and the ThreadLocal pattern

If you are using an application-managed entity manager in a web container such as Tomcat or Jetty, some persistence providers such as Hibernate recommend that you use the ThreadLocal pattern. This is widely known as the ThreadLocal session pattern in the Hibernate community. It associates a single instance of the EntityManager with a particular request. You have to bind the EntityManager to a thread-local variable and set the EntityManager instance to the associated thread, as shown here:
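A minimal sketch of the pattern follows, written generically so that it runs without a persistence provider. The ThreadLocalHolder class is hypothetical. In a JPA application we would assume T is EntityManager, the factory is something like entityManagerFactory::createEntityManager, and the closer is EntityManager::close.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Generic per-thread holder (hypothetical name). Each thread that calls get()
// receives its own instance, created lazily by the supplied factory.
final class ThreadLocalHolder<T> {
    private final ThreadLocal<T> current = new ThreadLocal<>();
    private final Supplier<T> factory;
    private final Consumer<T> closer;

    ThreadLocalHolder(Supplier<T> factory, Consumer<T> closer) {
        this.factory = factory;
        this.closer = closer;
    }

    // Returns the instance bound to the calling thread, creating it on first use.
    T get() {
        T instance = current.get();
        if (instance == null) {
            instance = factory.get();
            current.set(instance);
        }
        return instance;
    }

    // Unbinds and closes the calling thread's instance, if any. Call this at the
    // end of each request so pooled threads do not leak stale instances.
    void release() {
        T instance = current.get();
        if (instance != null) {
            current.remove();
            closer.accept(instance);
        }
    }
}
```

In a web application, a natural place to call release is the finally block of a servlet filter that wraps each request, since web containers reuse their worker threads.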

Check your persistence provider’s documentation if it requires you to use the ThreadLocal pattern.

Next, we’ll tackle the most important part of this chapter: EntityManager operations.

9.3. Managing persistence operations

The heart of the JPA API lies in the EntityManager operations, which we’ll discuss in upcoming sections. As you might have noted in listing 9.1, although the EntityManager interface is small and simple, it is pretty complete in its ability to provide an effective persistence infrastructure. In addition to the CRUD (Create, Read, Update, and Delete) functionality we introduced in listing 9.1, we’ll cover a few less-commonly used operations like flushing and refreshing.

Let’s start our coverage in the most logical place: persisting new entities into the database.

9.3.1. Persisting entities

Recall that in listing 9.1, the addItem method persists an Item entity into the database. Since listing 9.1 was quite a few pages back, we’ll repeat the addItem method body as reviewed in listing 9.4. Although it is not obvious, the code is especially helpful in understanding how entity relationships are persisted, which we’ll look at in greater detail in a minute. For now, let’s concentrate on the persist method itself.

Listing 9.4. Persisting entities

A new Item entity corresponding to the record being added is first instantiated in the addItem method. All of the relevant Item entity data to be saved into the database, such as the item title and description, is then populated with the data passed in by the user. As you’ll recall from chapter 7, the Item entity has a many-to-one relationship with the Seller entity. The related seller is retrieved using the EntityManager’s find method and set as a field of the Item entity. The persist method is then invoked to save the entity into the database, as shown in figure 9.6. Note that the persist method is intended to create new entity records in the database and not update existing ones. This means that you should make sure the identity or primary key of the entity to be persisted doesn’t already exist in the database.

Figure 9.6. Invoking the persist method on the EntityManager interface makes an entity instance managed. When the transaction commits, the entity state is synchronized with the database.

If you try to persist an entity that violates the database’s integrity constraint, the persistence provider will throw javax.persistence.PersistenceException, which wraps the database exception.

As we noted earlier, the persist method also causes the entity to become managed as soon as the method returns. The INSERT statement (or statements) that creates the record corresponding to the entity is not necessarily issued immediately. For transaction-scoped EntityManagers, the statement is typically issued when the enclosing transaction is about to commit. In our example, this means the SQL statements are issued when the addItem method returns. For extended-scoped (or application-managed) EntityManagers, the INSERT statement is probably issued right before the EntityManager is closed. The INSERT statement can also be issued at any point when the EntityManager is flushed.

We’ll discuss automatic and manual flushing in more detail shortly. For now, you just need to know that under certain circumstances, either you or the EntityManager can choose to perform pending database operations (such as an INSERT to create a record), without waiting for the transaction to end or the EntityManager to close. The INSERT statement corresponding to listing 9.4 to save the Item entity could look something like this:

INSERT INTO ITEMS
    (TITLE, DESCRIPTION, SELLER_ID, ...)
VALUES
    ('Toast with the face of Bill Gates on it',
     'This is an auction for...', 1, ...)

As you may have noticed, the ITEM_ID primary key that is the identity for the Item entity is not included in the generated INSERT statement. This is because the key-generation scheme for the itemId identity field of the entity is set to IDENTITY. If the scheme were set to SEQUENCE or TABLE instead, the EntityManager would have generated a SELECT statement to retrieve the key value first and then included the retrieved key in the INSERT statement.

As we mentioned earlier, all persistence operations that require database updates must be invoked within the scope of a transaction. If an enclosing transaction is not present when the persist method is invoked, a TransactionRequiredException is thrown for a transaction-scoped entity manager. If you are using an application-managed or extended entity manager, invoking the persist method will attach the entity to the persistence context. If a new transaction starts, the entity manager joins the transaction and the changes will be saved to the database when the transaction commits. The same is true for the EntityManager’s flush, merge, refresh, and remove methods.

Persisting entity relationships

One of the most interesting aspects of persistence operations is the handling of entity relationships. JPA gives us a number of options to handle this nuance in a way that suits a particular application-specific situation.

Let’s explore these options by revisiting listing 9.4. The addItem method is one of the simplest cases of persisting entity relationships. The Seller entity is retrieved using the find method, so it is already managed, and any changes to it are guaranteed to be transparently synchronized. Recall from chapter 7 that there is a bidirectional one-to-many relationship between the Item and Seller entities. This relationship is realized in listing 9.4 by setting the Seller using the item.setSeller method. Let’s assume that the Seller entity is mapped to the SELLERS table. Such a relationship between the Item and Seller entities is likely implemented through a foreign key to the SELLERS.SELLER_ID column in the ITEMS table. Since the Seller entity is already persisted, all the EntityManager has to do is set the SELLER_ID foreign key in the generated INSERT statement. Examining the INSERT statement presented earlier, this is how the SELLER_ID value is set to 1. Note that if the seller property of Item were not set at all, the SELLER_ID column in the INSERT statement would be set to NULL.

Things become a lot more interesting when we consider the case in which the entity related to the one we are persisting does not yet exist in the database. This does not happen very often for one-to-many and many-to-many relationships. In such cases, the related entity is more than likely already saved in the database. However, it does occur a lot more often for one-to-one relationships. For purposes of illustration, we’ll stray from our ItemManager example and take a look at saving User entities with associated BillingInfo entities. Recall that we introduced this unidirectional, one-to-one relationship in chapter 7. The method outlined in listing 9.5 receives user information such as username and e-mail, as well as billing information such as credit card type and number, and persists both the User and related BillingInfo entities into the database.

Listing 9.5. Persisting relationships

As you can see, neither the User entity nor the related BillingInfo entity is managed when the persist method is invoked, since both are newly instantiated. Let’s assume for the purpose of this example that the User and BillingInfo entities are saved into the USERS and BILLING_INFO tables, with the one-to-one relationship modeled with a foreign key on the USERS table referencing the BILLING_ID key in the BILLING_INFO table. As you might guess from looking at listing 9.5, two INSERT statements, one for the User and the other for the BillingInfo entity, are issued by JPA. The INSERT statement on the USERS table will contain the appropriate foreign key to the BILLING_INFO table.

Cascading persist operations

Perhaps surprisingly, it is not the default behavior for JPA to persist related entities. By default, the BillingInfo entity would not be persisted and you would not see an INSERT statement generated to persist the BillingInfo entity into the BILLING_INFO table. The key to understanding why this is not what happens in listing 9.5 lies in the @OneToOne annotation on the billing property of the User entity:

public class User {

@OneToOne(cascade=CascadeType.PERSIST)
public void setBillingInfo(BillingInfo billing) {

Notice the value of the cascade element of the @OneToOne annotation. We deferred the discussion of this element in chapter 7 so that we could discuss it in a more relevant context here.

Cascading in ORM-based persistence is similar to the idea of cascading in databases. The cascade element essentially tells the EntityManager how, or if, to propagate a given persistence operation on a particular entity into entities related to it.

By default, the cascade element is empty, which means that no persistence operations are propagated to related entities. Alternatively, the cascade element can be set to ALL, MERGE, PERSIST, REFRESH, or REMOVE. Table 9.3 lists the effect of each of these values.

Table 9.3. Effects of various cascade type values.

| CascadeType Value | Effect |
| --- | --- |
| CascadeType.MERGE | Only EntityManager.merge operations are propagated to related entities. |
| CascadeType.PERSIST | Only EntityManager.persist operations are propagated to related entities. |
| CascadeType.REFRESH | Only EntityManager.refresh operations are propagated to related entities. |
| CascadeType.REMOVE | Only EntityManager.remove operations are propagated to related entities. |
| CascadeType.ALL | All EntityManager operations are propagated to related entities. |
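The rule that table 9.3 expresses can be written down in a few lines. The sketch below is only an illustration: Cascade is a hypothetical mirror of javax.persistence.CascadeType limited to the values in the table, and CascadeRule is an invented helper, not part of JPA.

```java
import java.util.Set;

// Hypothetical mirror of javax.persistence.CascadeType, restricted to the
// values listed in table 9.3 so the sketch runs without JPA on the classpath.
enum Cascade { ALL, MERGE, PERSIST, REFRESH, REMOVE }

final class CascadeRule {
    private CascadeRule() {}

    // True if 'operation' invoked on an entity would be propagated to a related
    // entity whose association declares the 'configured' cascade values.
    // An empty set models the default: nothing is propagated.
    static boolean propagates(Set<Cascade> configured, Cascade operation) {
        return configured.contains(Cascade.ALL) || configured.contains(operation);
    }
}
```

For instance, propagates(EnumSet.of(Cascade.PERSIST), Cascade.PERSIST) is true, while the same call with an empty set or with only Cascade.MERGE configured is false.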

Since in our case we have set the cascade element to PERSIST, when we persist the User entity, the EntityManager figures out that a BillingInfo entity is associated with the User entity and it must be persisted as well. As table 9.3 indicates, the persist operation would still be propagated to BillingInfo if the cascade element were set to ALL instead. However, if the element were set to any other value or not specified, the operation would not be propagated and we would have to perform the persist operation on the BillingInfo entity separately from the User entity. For example, let’s assume that the cascade element on the @OneToOne annotation is not specified. To make sure both related entities are persisted, we’d have to change the addUser method as shown in listing 9.6.

Listing 9.6. Manually persisting relationships

As you can see, the BillingInfo entity is persisted first. The persisted BillingInfo entity is then set as a field of the User entity. When the User entity is persisted, the generated key from the BillingInfo entity is used in the foreign key for the generated INSERT statement.

Having explored the persist operation, let’s now move on to the next operation in the EntityManager CRUD sequence—retrieving entities.

9.3.2. Retrieving entities by primary key

JPA supports several ways to retrieve entity instances from the database. By far the simplest way is retrieving an entity by its primary key using the find method we introduced in listing 9.1. The other ways all involve using the query API and JPQL, which we’ll discuss in chapter 10. Recall that the find method was used in the addItem method in listing 9.1 to retrieve the Seller instance corresponding to the Item to add:

Seller seller = entityManager.find(Seller.class, sellerId);

The first parameter of the find method specifies the Java type of the entity to be retrieved. The second parameter specifies the identity value for the entity instance to retrieve. Recall from chapter 7 that an entity identity can either be a simple Java type identified by the @Id annotation or a composite primary key class specified through the @EmbeddedId or @IdClass annotation. In the example in listing 9.1, the find method is passed a simple java.lang.Long value matching the Seller entity’s @Id annotated identity, sellerId.

Although this is not the case in listing 9.1, the find method is fully capable of supporting composite primary keys. To see how this code might look, assume for the sake of illustration that the identity of the Seller entity consists of the seller’s first and last name instead of a simple numeric identifier. This identity is encapsulated in a composite primary key class annotated with the @IdClass annotation. Listing 9.7 shows how this identity class can be populated and passed to the find method.

Listing 9.7. Find by primary key using composite keys
SellerPK sellerKey = new SellerPK();

sellerKey.setFirstName(firstName);
sellerKey.setLastName(lastName);

Seller seller = entityManager.find(Seller.class, sellerKey);

The find method does what it does by inspecting the details of the entity class passed in as the first parameter and generating a SELECT statement to retrieve the entity data. This generated SELECT statement is populated with the primary key values specified in the second parameter of the find method. For example, the find method in listing 9.1 could generate a SELECT statement that looks something like this:

SELECT * FROM SELLERS WHERE seller_id = 1

Note that if an entity instance matching the specified key does not exist in the database, the find method will not throw any exceptions. Instead, the EntityManager will return null, and your application must handle this situation. It is not strictly necessary to call the find method in a transactional context. However, the retrieved entity is detached unless a transaction context is available, so it is generally advisable to call the find method inside a transaction.

One of the most important features of the find method is that it utilizes EntityManager caching. If your persistence provider supports caching and the entity already exists in the cache, then the EntityManager returns a cached instance of the entity instead of retrieving it from the database. Most persistence providers, such as Hibernate and Oracle TopLink, support caching, so you can more or less count on this extremely valuable optimization.

There is one more important JPA feature geared toward application optimization—lazy and eager loading. The generated SELECT statement in our example attempts to retrieve all of the entity field data when the find method is invoked. In general, this is exactly what will happen for entity retrieval since it is the default behavior for JPA. However, in some cases, this is not desirable behavior. Fetch modes allow us to change this behavior to optimize application performance when needed.

Entity fetch modes

We briefly mentioned fetch modes in previous chapters but haven’t discussed them in great detail. Discussing entity retrieval is an ideal place to fully explore fetch modes.

As we suggested, the EntityManager normally loads all entity instance data when an entity is retrieved from the database. In ORM-speak, this is called eager fetching, or eager loading. If you have ever dealt with application performance problems due to premature or inappropriate caching, you probably already know that eager fetching is not always a good thing. The classic example we used in previous chapters is loading large binary objects (BLOBs), such as pictures. Unless you are developing a heavily graphics-oriented program such as an online photo album, it is unlikely that loading a picture as part of an entity used in a lot of places in the application is a good idea. Because loading BLOB data typically involves long-running, I/O-heavy operations, they should be loaded cautiously and only as needed. In general, this optimization strategy is called lazy fetching.

JPA has more than one mechanism to support lazy fetching. Specifying column fetch-mode using the @Basic annotation is the easiest one to understand. For example, we can set the fetch mode for the picture property on the Item entity to be lazy as follows:

@Column(name="PICTURE")
@Lob
@Basic(fetch=FetchType.LAZY)
public byte[] getPicture() {
    return picture;
}

A SELECT statement generated by the find method to retrieve Item entities would not load data from the ITEMS.PICTURE column into the picture field. Instead, the picture data will be automatically loaded from the database when the property is first accessed through the getPicture method.

Be advised, however, that lazy fetching is a double-edged sword. Specifying that a column be lazily fetched means that the EntityManager will issue an additional SELECT statement just to retrieve the picture data when the lazily loaded field is first accessed. In the extreme case, imagine what would happen if all entity data in an application is lazily loaded. This would mean that the database would be flooded with a large number of frivolous SELECT statements as entity data is accessed. Also, lazy fetching is an optional EJB 3 feature, which means not every persistence provider is guaranteed to implement it. You should check your provider’s documentation before spending too much time figuring out which entity columns should be lazily fetched.
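To make the trade-off concrete, here is a rough plain-Java analogy of what a lazily fetched attribute does. It is not how any particular persistence provider implements lazy loading; Lazy is a hypothetical helper, and the loader passed to it merely stands in for the extra SELECT.

```java
import java.util.function.Supplier;

// Hypothetical helper: defers an expensive load until the first access and
// then remembers the result, so the load happens at most once.
final class Lazy<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    Lazy(Supplier<T> loader) {
        this.loader = loader;
    }

    synchronized T get() {
        if (!loaded) {
            value = loader.get();   // stands in for the additional SELECT
            loaded = true;
        }
        return value;
    }

    synchronized boolean isLoaded() {
        return loaded;
    }
}
```

Nothing is loaded when the Lazy object is created, which corresponds to the initial SELECT skipping the column. The cost is paid on the first get call, which is why making every attribute lazy simply trades one large query for many small ones.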

Loading related entities

One of the most intricate uses of fetch modes is to control the retrieval of related entities. Not too surprisingly, the EntityManager’s find method must retrieve all entities related to the one returned by the method. Let’s take the ActionBazaar Item entity, an exceptionally good case because it has a many-to-one, a one-to-many, and a many-to-many relationship. The only relationship type not represented is one-to-one. The Item entity has a many-to-one relationship with the Seller entity (a seller can sell more than one item, but an item can be sold by only one seller), a one-to-many relationship with the Bid entity (more than one bid can be put on an item), and a many-to-many relationship with the Category entity (an item can belong to more than one category and a category contains multiple items). These relationships are depicted in figure 9.7.

Figure 9.7. The Item entity is related to three other entities: Seller, Bid, and Category. The relationships to Item are many-to-one, one-to-many, and many-to-many, respectively.

When the find method returns an instance of an Item, it also automatically retrieves the Seller, Bid, and Category entities associated with the instance and populates them into their respective Item entity properties. As we see in listing 9.8, the single Seller entity associated with the Item is populated into the seller property, the Bid entities associated with an Item are populated into the bids List, and the Category entities the Item is listed under are populated into the categories property. It might surprise you to know some of these relationships are retrieved lazily.

All the relationship annotations we saw in chapter 8, including the @ManyToOne, @OneToMany, and @ManyToMany annotations, have a fetch element to control fetch modes just like the @Basic annotation discussed in the previous section. None of the relationship annotations in listing 9.8 specify the fetch element, so the default for each annotation takes effect.

Listing 9.8. Relationships in the Item entity

By default, some of the relationship types are retrieved lazily while some are loaded eagerly. We’ll discuss why each default makes sense as we go through each relationship retrieval case for the Item entity. The Seller associated with an Item is retrieved eagerly, because the fetch mode for the @ManyToOne annotation is defaulted to EAGER. To understand why this makes sense, it is helpful to understand how the EntityManager implements eager fetching. In effect, each eagerly fetched relationship turns into an additional JOIN tacked onto the basic SELECT statement to retrieve the entity. To see what we mean, let’s see how the SELECT statement for an eagerly fetched Seller record related to an Item looks (listing 9.9).

Listing 9.9. SELECT statement for eagerly fetched Seller related to an Item

As listing 9.9 shows, an eager fetch means that the most natural way of retrieving the Item entity would be through a single SELECT statement using a JOIN between the ITEMS and SELLERS tables. It is important to note that the JOIN will result in a single row, containing columns from both the SELLERS and ITEMS tables. In terms of database performance, this is more efficient than issuing one SELECT to retrieve the Item and issuing a separate SELECT to retrieve the related Seller. That is exactly what would happen with lazy fetching: the second SELECT for retrieving the Seller would be issued when the Item’s seller property is first accessed. Pretty much the same thing applies to the @OneToOne annotation, so the default for it is also eager loading. More specifically, the JOIN to implement the relationship would result in a fairly efficient single combined row in all cases.

Lazy vs. eager loading of related entities

In contrast, the @OneToMany and @ManyToMany annotations are defaulted to lazy loading. The critical difference is that for both of these relationship types, more than one entity is matched to the retrieved entity. Think about Category entities related to a retrieved Item, for example. JOINs implementing eagerly loaded one-to-many and many-to-many relationships usually return more than one row. In particular, a row is returned for every related entity matched.

The problem becomes particularly obvious when you consider what happens when multiple Item entities are retrieved at one time (for example, as the result of a JPQL query, discussed in the next chapter). (N1 + N2 + ... + Nx) rows would be returned, where Ni is the number of related Category entities for the ith Item record and x is the number of Item records retrieved. For nontrivial values of Ni and x, the retrieved result set could be quite large, potentially causing significant database performance issues. This is why JPA makes the conservative assumption of defaulting to lazy loading for @OneToMany and @ManyToMany annotations.

Table 9.4 lists the default fetch behavior for each type of relationship annotation.

Table 9.4. The default loading behavior of associated entities differs for each kind of association. We can change the loading behavior by specifying the fetch element on the relationship annotation.

| Relationship Type | Default Fetch Behavior | Number of Entities Retrieved |
| --- | --- | --- |
| One-to-one | EAGER | Single entity retrieved |
| One-to-many | LAZY | Collection of entities retrieved |
| Many-to-one | EAGER | Single entity retrieved |
| Many-to-many | LAZY | Collection of entities retrieved |

The relationship defaults are not right for all circumstances, however. While the eager fetching strategy makes sense for one-to-one and many-to-one relationships under most circumstances, it is a bad idea in some cases. For example, if an entity contains a large number of one-to-one and many-to-one relationships, eagerly loading all of them would result in a large number of JOINs chained together. Executing a relatively large number of JOINs can be just as bad as loading an (N1 + N2 + ... + Nx) result set. If this proves to be a performance problem, some of the relationships should be loaded lazily. Here’s an example of explicitly specifying the fetch mode for a relationship (it happens to be the familiar Seller property of the Item entity):

@ManyToOne(fetch=FetchType.LAZY)
public Seller getSeller() {
    return seller;
}

You should not take the default lazy loading strategy of the @OneToMany and @ManyToMany annotations for granted. For particularly large data sets, this can result in a huge number of SELECTs being generated against the database. This is known as the N+1 problem, where 1 stands for the SELECT statement for the originating entities and N stands for the SELECT statements that retrieve the related entities, one per originating entity. In some cases, you might discover that you are better off using eager loading even for @OneToMany and @ManyToMany annotations. In chapters 10 and 13, we’ll discuss how you can eagerly load related entities on a per-query basis using JPQL, without having to change the fetch mode on the association.
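The two costs being weighed here are easy to compute for a concrete case. The sketch below is a back-of-the-envelope illustration, not a measurement. FetchCost is a hypothetical helper; it assumes every retrieved parent has at least one related entity and that, in the lazy case, every parent’s collection is eventually accessed.

```java
// Hypothetical helper comparing the two fetch strategies for a collection
// association. The array holds the number of related entities per parent.
final class FetchCost {
    private FetchCost() {}

    // Rows returned by a single JOIN that eagerly loads the collection:
    // N1 + N2 + ... + Nx, one row per (parent, related entity) pair.
    static long eagerJoinRows(int[] relatedPerParent) {
        long rows = 0;
        for (int n : relatedPerParent) {
            rows += n;
        }
        return rows;
    }

    // SELECT statements issued with lazy loading when every collection is
    // accessed: 1 for the parents plus 1 per parent (the N+1 problem).
    static long lazySelects(int[] relatedPerParent) {
        return 1L + relatedPerParent.length;
    }
}
```

For three items listed under 3, 5, and 2 categories, eagerJoinRows returns 10 and lazySelects returns 4. Which number hurts more depends on row width, network round-trip cost, and how many of the collections the application actually touches.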

Unfortunately, the choice of fetch modes is not cut and dried, and depends on a whole host of factors, including the database vendor’s optimization strategy, database design, data volume, and application usage patterns. In the real world, ultimately these choices are often made through trial and error. Luckily, with JPA, performance tuning just means a few configuration changes here and there as opposed to time-consuming code modifications.

Having discussed entity retrieval, we can now move into the third operation of the CRUD sequence: updating entities.

9.3.3. Updating entities

Recall that the EntityManager makes sure that changes made to attached entities are always saved into the database behind the scenes. This means that, for the most part, our application does not need to worry about manually calling any methods to update the entity. This is perhaps the most elegant feature of ORM-based persistence since this hides data synchronization behind the scenes and truly allows entities to behave like POJOs. Take the code in listing 9.10, which calculates an ActionBazaar Power Seller’s creditworthiness.

Listing 9.10. Transparent management of attached entities

Other than looking up the PowerSeller entity, little else is done using the EntityManager in the calculateCreditWorthiness method. As you know, this is because the PowerSeller is managed by the EntityManager as soon as it is returned by the find method. Throughout the relatively long calculation for determining creditworthiness, the EntityManager will make sure that the changes to the entity wind up in the database.

Detachment and merge operations

Although managed entities are extremely useful, the problem is that it is difficult to keep entities attached at all times. Often the problem is that the entities will need to be detached and serialized at the web tier, where the entity is changed, outside the scope of the EntityManager. In addition, recall that stateless session beans cannot guarantee that calls from the same client will be serviced by the same bean instance. This means that there is no guarantee an entity will be handled by the same EntityManager instance across method calls, thus making automated persistence ineffective.

This is exactly the model that the ItemManager session bean introduced in listing 9.1 assumes. The EntityManager used for the bean has TRANSACTION scope. Since the bean uses CMT, entities become detached when the transaction ends along with the method. This means that entities returned by the session bean to its clients are always detached, just like the newly created Item entity returned by the ItemManager’s addItem method:

public Item addItem(String title, String description,
        byte[] picture, double initialPrice, long sellerId) {
    Item item = new Item();
    item.setTitle(title);
    ...
    entityManager.persist(item);

    return item;
}

At some point we’ll want to reattach the entity to a persistence context to synchronize it with the database. The EntityManager’s merge method is designed to do just that (see figure 9.8).

Figure 9.8. An entity instance can be detached and serialized to a separate tier where the client makes changes to the entity and sends it back to the server. The server can use a merge operation to attach the entity to the persistence context.

You should remember that like all attached entities, the entity passed to the merge method is not necessarily synchronized with the database immediately, but it is guaranteed to be synchronized with the database sooner or later. We use the merge method in the ItemManager bean in the most obvious way possible, to update the database with an existing Item:

public Item updateItem(Item item) {
    entityManager.merge(item);
    return item;
}

As soon as the updateItem method returns, the database is updated with the data from the Item entity. The merge method is intended for entities that already exist in the database. Note that, per the JPA specification, merging an entity instance that has never been persisted does not fail: the provider creates a new managed copy, which is inserted when the persistence context is flushed. Merging an entity that has already been deleted through the remove method, on the other hand, results in an IllegalArgumentException, even if the DELETE statement has not been issued yet.
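One detail of merge is worth spelling out: per the JPA specification, merge copies the state of the instance passed in onto a managed instance and returns that managed instance, while the instance passed in stays detached. The following is a minimal sketch of that behavior using a toy class, not the JPA API; ToyMergeContext and its contains method are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only, NOT the JPA API.
class Item {
    final long id;
    String title;

    Item(long id, String title) {
        this.id = id;
        this.title = title;
    }
}

class ToyMergeContext {
    private final Map<Long, Item> managed = new HashMap<>();  // attached entities by identity

    // Like merge: copies the detached state onto the managed instance for that
    // identity (creating one if needed) and returns the MANAGED instance.
    Item merge(Item detached) {
        Item attached = managed.computeIfAbsent(detached.id, id -> new Item(id, null));
        attached.title = detached.title;
        return attached;
    }

    // True only for the instance this context is actually tracking.
    boolean contains(Item item) {
        return managed.get(item.id) == item;
    }
}
```

In this model, later changes to the detached instance are not tracked; only changes made to the instance returned by merge would be picked up. This is why code that keeps working with an entity after merging it usually continues with the returned instance.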

Merging relationships

By default, entities associated with the entity being merged are not merged as well. For example, the Seller, Bid and Category entities related to the Item are not merged when the Item is merged in the previous code snippet. However, as mentioned in section 9.3.1, this behavior can be controlled using the cascade element of the @OneToOne, @OneToMany, @ManyToOne, and @ManyToMany annotations. If the element is set to either ALL or MERGE, the related entities are merged. For example, the following code will cause the Seller entity related to the Item to be merged since the cascade element is set to MERGE:

public class Item {
    @ManyToOne(cascade=CascadeType.MERGE)
    public Seller getSeller() {

Note that as in most of the EntityManager’s methods, the merge method must be called from a transactional context or it will throw a TransactionRequiredException.

We’ll now move on to the final element of the CRUD sequence: deleting an entity.


Detached entities and the DTO anti-pattern

If you have spent even a moderate amount of time using EJB 2, you are probably thoroughly familiar with the Data Transfer Object (DTO) anti-pattern. In a sense, the DTO anti-pattern was necessary because of entity beans. The fact that EJB 3 detached entities are nothing but POJOs makes the DTO anti-pattern less of a necessity of life. Instead of having to create separate DTOs from domain data just to pass back and forth between the business and presentation layers, you may simply pass detached entities. This is exactly the model we follow in this chapter.

However, if your entities contain behavior, you might be better off using the DTO pattern anyway, to safeguard business logic from inappropriate usage outside a transactional context. In any case, if you decide to use detached entities as a substitute for DTOs, you should make sure they implement java.io.Serializable.


9.3.4. Deleting entities

The deleteItem method in the ItemManagerBean in listing 9.1 deletes an Item from the database. An important detail to notice about the deleteItem method (repeated next) is that the item to be deleted was first attached to the EntityManager using the merge method:

public void deleteItem(Item item) {
    entityManager.remove(entityManager.merge(item));
}

This is because the remove method can only delete currently attached entities and the Item entity being passed to the deleteItem method is not managed. If a detached entity is passed to the remove method, it throws an IllegalArgumentException. Before the deleteItem method returns, the Item record will be deleted from the database using a DELETE statement like this:

DELETE FROM ITEMS WHERE item_id = 1

Just as with the persist and merge methods, the DELETE statement is not necessarily issued immediately but is guaranteed to be issued at some point. Meanwhile, the EntityManager marks the entity as removed so that no further changes to it are synchronized (as we noted in the previous section).

Cascading remove operations

Just as with merging and persisting entities, you must set the cascade element of a relationship annotation to either ALL or REMOVE for related entities to be removed with the one passed to the remove method. For example, we can specify that the BillingInfo entity related to a Bidder be removed with the owning Bidder entity as follows:

@Entity
public class Bidder {
    @OneToOne(cascade=CascadeType.REMOVE)
    public BillingInfo getBillingInfo() {

From a common usage perspective, this setup makes perfect sense. There is no reason for a BillingInfo entity to hang around if the enclosing Bidder entity it is related to is removed. When it comes down to it, the business domain determines if deletes should be cascaded. In general, you might find that the only relationship types where cascading removal makes sense are one-to-one and one-to-many. You should be careful when using cascade delete because the related entity you are cascading the delete to may be related to other entities you don’t know about. For example, let’s assume that there is a one-to-many relationship between the Seller and Item entities and you are using cascade delete to remove a Seller and its related Items. Remember that other entities, such as the Category entity, also hold references to the Items you are deleting, and those relationships would become meaningless!

Handling relationships

If your intent was really to cascade delete the Items associated with the Seller, you should iterate over all instances of Category that reference the deleted Items and remove the relationships first. The following code does this:

List<Category> categories = getAllCategories();
List<Item> items = seller.getItems();
for (Item item: items) {
    for (Category category: categories) {
        category.getItems().remove(item);
    }
}
entityManager.remove(seller);

The code gets all instances of Category in the system and makes sure that all Items related to the Seller being deleted are removed from referencing Lists first. It then proceeds with removing the Seller, cascading the remove to the related Items.

Not surprisingly, the remove method must be called from a transactional context or it will throw a TransactionRequiredException. Also, per the JPA specification, calling remove on an entity that has already been removed is simply ignored rather than raising an exception.

Having finished the basic EntityManager CRUD operations, let’s now move on to the two remaining major persistence operations: flushing data to the database and refreshing from the database.

9.3.5. Controlling updates with flush

We’ve been talking about EntityManager flushing on and off throughout the chapter. It is time we discussed this concept fully. For the most part, you’ll probably be able to get away without knowing too much about this EntityManager feature. However, there are some situations where not understanding EntityManager flushing could be a great disadvantage. Recall that EntityManager operations like persist, merge, and remove do not cause immediate database changes. Instead, these operations are postponed until the EntityManager is flushed. The true motivation for doing things this way is performance optimization. Batching SQL as much as possible instead of flooding the database with a bunch of frivolous requests saves a lot of communication overhead and avoids unnecessarily tying down the database.

By default, the flush mode is set to AUTO. This means that the EntityManager performs a flush operation automatically as needed. In general, this occurs at the end of a transaction for transaction-scoped EntityManagers and when the persistence context is closed for application-managed or extended-scope EntityManagers. In addition, if entities with pending changes are used in a query, the persistence provider will flush changes to the database before executing the query.

You can set the flush mode to COMMIT if you don’t like the idea of autoflushing and want greater control over database synchronization. You can do so using the EntityManager’s setFlushMode method as follows:

entityManager.setFlushMode(FlushModeType.COMMIT);

If the flush mode is set to COMMIT, the persistence provider will only synchronize with the database when the transaction commits. However, you should be careful with this, as it will be your responsibility to synchronize entity state with the database before executing a query. If you don’t do this and an EntityManager query returns stale entities from the database, the application can wind up in an inconsistent state.

In reality, resetting flush mode is often overkill. This is because you can explicitly flush the EntityManager when you need to, using the flush method as follows:

entityManager.flush();

The EntityManager synchronizes the state of every managed entity with the database as soon as this method is invoked. Like everything else, manual flushing should be used in moderation and only when needed. In general, batching database synchronization requests is a good optimization strategy to try to preserve.
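The difference between the two flush modes can be sketched with a small toy model. This is not the JPA API: ToyEntityManager, ToyFlushMode, and the countItems query are hypothetical names used only to show when queued statements reach the database and why a query under COMMIT mode can see stale data.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only, NOT the JPA API.
enum ToyFlushMode { AUTO, COMMIT }

class ToyEntityManager {
    private final List<String> pending = new ArrayList<>();  // SQL queued by persist
    final List<String> executed = new ArrayList<>();          // SQL that reached the "database"
    private ToyFlushMode flushMode = ToyFlushMode.AUTO;

    void setFlushMode(ToyFlushMode mode) {
        this.flushMode = mode;
    }

    // Like persist: the INSERT is queued, not executed.
    void persist(String title) {
        pending.add("INSERT INTO ITEMS (title) VALUES ('" + title + "')");
    }

    // Like flush: sends everything queued so far in one batch.
    void flush() {
        executed.addAll(pending);
        pending.clear();
    }

    // A query: under AUTO, pending changes are flushed first so the result is
    // current; under COMMIT, the query only sees what was already executed.
    int countItems() {
        if (flushMode == ToyFlushMode.AUTO) {
            flush();
        }
        int count = 0;
        for (String sql : executed) {
            if (sql.startsWith("INSERT")) {
                count++;
            }
        }
        return count;
    }

    // Like transaction commit: always flushes.
    void commit() {
        flush();
    }
}
```

Under AUTO, persisting two items and then calling countItems returns 2 because the query triggers a flush. Under COMMIT, the same query returns 0 until flush or commit is called explicitly.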

We’ll now move on to the last persistence operation to be discussed in this chapter: refreshing entities.

9.3.6. Refreshing entities

The refresh operation repopulates entity data from the database. In other words, given an entity instance, the persistence provider matches the entity with a record in the database and resets the entity with retrieved data from the database as shown in figure 9.9.

Figure 9.9. The refresh operation repopulates the entity from the database, overriding any changes in the entity.

While this EntityManager method is not used frequently, there are some circumstances where it is extremely useful. In listing 9.1, we use the method to undo changes made by the ItemManager client and return fresh entity data from the database:

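public Item undoItemChanges(Item item) {
    // merge returns the managed instance; refresh and return that one
    Item managedItem = entityManager.merge(item);
    entityManager.refresh(managedItem);
    return managedItem;
}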

The merge operation is performed first in the undoItemChanges method because the refresh method only works on managed entities. It is important to note that just like the find method, the refresh method uses the entity identity to match database records. As a result, you must make sure the entity being refreshed exists in the database.

The refresh method really shines when you consider a subtle but very common scenario. To illustrate this scenario, let’s go back to the addItem method in listing 9.1:

public Item addItem(String title, String description,
        byte[] picture, double initialPrice, long sellerId) {
    Item item = new Item();
    item.setTitle(title);
    ...
    entityManager.persist(item);

    return item;
}

Note a subtle point about the method: it assumes that the Item entity is not altered by the database in any way when the record is inserted into the database. It is easy to forget that this is seldom the case with relational databases. For most INSERT statements issued by the usual application, the database will fill in column values not included in the INSERT statement using table defaults. For example, assume that the Item entity has a postingDate property that is not populated by the application. Instead, this value is set to the current database system time when the ITEMS table record is inserted. This could be implemented in the database by utilizing default column values or even database triggers.

Since the persist method only issues the INSERT statement and does not load the data that was changed by the database as a result of the INSERT, the entity returned by the method would not include the generated postingDate field. This problem could be fixed by using the refresh method as follows:

public Item addItem(String title, String description,
        byte[] picture, double initialPrice, long sellerId) {
    Item item = new Item();
    item.setTitle(title);
    ...
    entityManager.persist(item);
    entityManager.flush();

    entityManager.refresh(item);

    return item;
}

After the persist method is invoked, the EntityManager is flushed immediately so that the INSERT statement is executed and the generated values are set by the database. The entity is then refreshed so that we get the most up-to-date data from the database and populate it into the inserted Item instance (including the postingDate field). In most cases you should try to avoid using default or generated values with JPA due to the slightly awkward nature of the code just introduced. Luckily, this awkward code is not necessary while using fields that use the JPA @GeneratedValue annotation since the persist method correctly handles such fields.
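The ordering of persist, flush, and refresh matters here, and a toy model makes the reason visible. The sketch below is not the JPA API: ToyItemStore is a hypothetical stand-in for the EntityManager and the database together, the fixed date string stands in for a database column default, and the IllegalStateException is simply this toy's way of reporting a missing row.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only, NOT the JPA API.
class Item {
    final long id;
    String title;
    String postingDate;  // filled in by the "database", never by the application

    Item(long id, String title) {
        this.id = id;
        this.title = title;
    }
}

class ToyItemStore {
    private final Map<Long, String[]> rows = new HashMap<>();  // id -> {title, postingDate}
    private Item pendingInsert;

    // Like persist: the INSERT is only queued.
    void persist(Item item) {
        pendingInsert = item;
    }

    // Like flush: the INSERT runs and the database applies its column default
    // (a fixed placeholder date here).
    void flush() {
        if (pendingInsert != null) {
            rows.put(pendingInsert.id, new String[] {pendingInsert.title, "2024-01-01"});
            pendingInsert = null;
        }
    }

    // Like refresh: overwrites the entity state with the current row.
    void refresh(Item item) {
        String[] row = rows.get(item.id);
        if (row == null) {
            throw new IllegalStateException("No row for item " + item.id + "; flush first");
        }
        item.title = row[0];
        item.postingDate = row[1];
    }
}
```

In this model, refreshing before the flush fails because no row exists yet, and flushing alone leaves postingDate unset on the entity. Only the full sequence of persist, flush, and refresh brings the database-generated value back into the object.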

Before we wrap up this chapter, we’ll introduce entity lifecycle-based listeners.

9.4. Entity lifecycle listeners

You saw in earlier chapters that both session and message-driven beans allow you to listen for lifecycle callbacks like PostConstruct and PreDestroy. Similarly, entities allow you to receive callbacks for lifecycle events like persist, load, update, and remove. Just as in session and message-driven beans, you can do almost anything you need to in the lifecycle callback methods, including invoking an EJB, or using APIs like JDBC or JMS. In the persistence realm, however, lifecycle callbacks are typically used to accomplish such tasks as logging, validating data, auditing, sending notifications after a database change, or generating data after an entity has been loaded. In a sense, callbacks are the database triggers of JPA. Table 9.5 lists the callbacks supported by the API.

Table 9.5. Callbacks supported by JPA and when they are called

Lifecycle Method    When It Is Performed
PrePersist          Before the EntityManager persists an entity instance
PostPersist         After an entity has been persisted
PostLoad            After an entity has been loaded by a query, find, or refresh operation
PreUpdate           Before a database update occurs to synchronize an entity instance
PostUpdate          After a database update occurs to synchronize an entity instance
PreRemove           Before EntityManager removes an entity
PostRemove          After an entity has been removed

Entity lifecycle methods need not be defined in the entity itself. Instead, you can choose to define a separate entity listener class to receive the lifecycle callbacks. We highly recommend this approach, because defining callback methods in the entities themselves will clutter up the domain model you might have carefully constructed. Moreover, entity callbacks typically contain crosscutting concerns rather than business logic directly pertinent to the entity. For our purposes, we’ll explore the use of entity callbacks using separate listener classes, default listeners, and the execution order of entity listeners if you have multiple listeners.

9.4.1. Using an entity listener

Let’s see how entity lifecycle callbacks look by coding an entity listener on the Item entity that notifies an ActionBazaar admin if an item’s initial bid amount is set higher than a certain threshold. It is ActionBazaar policy to scrutinize items with extremely high initial prices to check against possible fraud, especially for items such as antiques and artwork. Listing 9.11 shows the code.

Listing 9.11. ItemMonitor entity listener

As you can see in listing 9.11, our listener, ItemMonitor, has a single method, monitorItem, which receives callbacks for both the PrePersist and PreUpdate events. The @EntityListeners annotation on the Item entity specifies ItemMonitor to be the lifecycle callback listener for the Item entity. It’s worth noting that the listener callbacks can be defined on an entity class or mapped superclass. All we have to do to receive a callback is to annotate our method with a callback annotation such as @PrePersist, @PostPersist, @PreUpdate, and so on. The monitorItem method checks to see if the initial bid amount set for the item to be inserted or updated is above the threshold specified by the ItemMonitor.MONITORING_THRESHOLD variable and sends the ActionBazaar admin an e-mail alert if it is. As you might have guessed by examining listing 9.11, entity listener callback methods follow the form void <METHOD>(Object). The single method parameter of type Object specifies the entity instance for which the lifecycle event was generated. In our case, this is the Item entity.
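A listener along the lines just described might look like the following sketch. It is an illustration under stated assumptions rather than the actual listing: the three annotations are local stand-ins declared only so that the example compiles without a JPA provider (in a real application they come from javax.persistence), the threshold value is invented, and AdminMailer is a hypothetical helper that records alerts instead of sending e-mail.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-ins so the sketch compiles without a JPA provider on the
// classpath; in a real application these come from javax.persistence.
@interface PrePersist {}
@interface PreUpdate {}
@interface EntityListeners { Class<?>[] value(); }

// Hypothetical helper: records alerts instead of sending real e-mail.
class AdminMailer {
    static final List<String> sent = new ArrayList<>();

    static void send(String message) {
        sent.add(message);
    }
}

// The listener holds no per-entity state; the threshold value is an assumption.
class ItemMonitor {
    static final double MONITORING_THRESHOLD = 10000.00;

    // Callback form described in the text: void <METHOD>(Object)
    @PrePersist
    @PreUpdate
    public void monitorItem(Object entity) {
        Item item = (Item) entity;
        if (item.getInitialPrice() > MONITORING_THRESHOLD) {
            AdminMailer.send("High initial price on item: " + item.getTitle());
        }
    }
}

// The entity names its listener class; a persistence provider would invoke
// monitorItem before inserting or updating an Item.
@EntityListeners(ItemMonitor.class)
class Item {
    private final String title;
    private final double initialPrice;

    Item(String title, double initialPrice) {
        this.title = title;
        this.initialPrice = initialPrice;
    }

    String getTitle() {
        return title;
    }

    double getInitialPrice() {
        return initialPrice;
    }
}
```

Because no provider is involved in this sketch, the callback only runs if it is called directly, for example new ItemMonitor().monitorItem(new Item("Ming vase", 25000.00)). In a deployed application the provider makes that call on the listener's behalf.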

If the lifecycle callback method throws a runtime exception, the intercepted persistence operation is aborted. This is an extremely important feature to validate persistence operations. For example, if you have a listener class to validate that all entity data is present before persisting an entity, you could abort the persistence operation if needed by throwing a runtime exception.

Listing 9.12 shows an example listener class that can be used to validate an entity state before an entity is persisted. You can build validation logic in a PrePersist callback, and if the callback fails the entity will not be persisted. For example, ActionBazaar sets a minimum price for the initialPrice for items being auctioned, and no items are allowed to be listed below that price.

Listing 9.12. ItemVerifier, which validates the price set for an item

public class ItemVerifier {

    ...

    public ItemVerifier() {
    }

    @PrePersist
    public void newItemVerifier(Item item) {
        if (item.getInitialPrice() < MIN_INITIAL_PRICE)
            throw new ItemException(
                "Item Price is lower than " +
                "Minimum Price Allowed");
    }
}

All entity listeners are stateless in nature and you cannot assume that there is a one-to-one relationship between an entity instance and a listener instance.


Note

One great drawback of entity listener classes is that they do not support dependency injection. This is due to the fact that entities may be used outside containers, where DI is not available.


For crosscutting concerns like logging or auditing, it is inconvenient to have to specify listeners for individual entities. Given this problem, JPA enables you to specify default entity listeners that receive callbacks from all entities in a persistence unit. Let’s take a look at this mechanism next.

9.4.2. Default listener classes

ActionBazaar audits all changes made to entities. You can think of this as an ActionBazaar version of a transaction log. This feature can be implemented using a default listener like the following:

public class ActionBazaarAuditor {
    ...
    @PrePersist
    @PostPersist
    ...
    @PostRemove
    public void logOperation(Object object) {
        // Object has no getName method; log the entity's class name instead
        Logger.log("Performing Persistence Operation on: "
            + object.getClass().getName());
    }
}

The ActionBazaarAuditor listens for all persistence operations for all entities in the ActionBazaar persistence unit, and logs the name of the entity that the callback was generated on. Unfortunately, there is no way to specify default entity listeners using annotations, and we must resort to an XML deployment descriptor. In the JPA specification, default listeners are declared in the object-relational mapping file (META-INF/orm.xml by default) rather than in persistence.xml itself. Since we have not yet fully described the persistence deployment descriptors, we’ll simply note the relevant snippet here, leaving a detailed analysis to chapter 11:

<entity-mappings ...>
    <persistence-unit-metadata>
        <persistence-unit-defaults>
            <entity-listeners>
                <entity-listener
                    class="actionbazaar.persistence.ActionBazaarAuditor"/>
            </entity-listeners>
        </persistence-unit-defaults>
    </persistence-unit-metadata>
    ...
</entity-mappings>

In this snippet, the entity-listeners element nested inside persistence-unit-defaults lists the default entity listeners for the persistence unit. Again, do not worry too much about the specific syntax at the moment, as we’ll cover it in greater detail later.

This brings us to the interesting question of what happens if there is both a default listener and an entity-specific listener for a given entity, as in the case of our Item entity. The Item entity now has two lifecycle listeners: the default ActionBazaarAuditor listener and the ItemMonitor listener. How do you think they interact? Moreover, since entities are POJOs that can inherit from other entities, both the superclass and subclass may have attached listeners. For example, what if the User entity has an attached listener named UserMonitor and the Seller subclass also has an attached listener, SellerMonitor? How these listeners relate to each other is determined by the order of execution as well as exclusion rules.

9.4.3. Listener class execution order and exclusion

If an entity has default listeners, entity class–specific listeners, and inherited superclass listeners, the default listeners are executed first. Following OO constructor inheritance rules, the superclass listeners are invoked after the default listeners. Subclass listeners are invoked last. Figure 9.10 depicts this execution order.

Figure 9.10. Entity listener execution order. Default entity listeners are executed first, then superclass and subclass listeners.

If more than one listener is listed on any level, the execution order is determined by the order in which they are listed in the annotation or deployment descriptor. For example, in the following entity listener annotation, the ItemMonitor listener is called before ItemMonitor2:

@EntityListeners({actionbazaar.persistence.ItemMonitor.class,
        actionbazaar.persistence.ItemMonitor2.class})

You cannot programmatically control this order of execution. However, if needed, you can exclude default and superclass listeners from being executed at all. Let’s assume that we need to disable both default and superclass listeners for the Seller entity. You can do this with the following code:

@Entity
@ExcludeDefaultListeners
@ExcludeSuperclassListeners
@EntityListeners(actionbazaar.persistence.SellerMonitor.class)
public class Seller extends User {

As you can see from the code, the @ExcludeDefaultListeners annotation disables any default listeners while the @ExcludeSuperclassListeners annotation stops superclass listeners from being executed. As a result, only the SellerMonitor listener specified by the @EntityListeners annotation will receive lifecycle callbacks for the Seller entity. Unfortunately, neither the @ExcludeDefaultListeners nor the @ExcludeSuperclassListeners annotation currently enables us to block specific listener classes. We hope that this is a feature that will be added in a future version of JPA.
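The ordering and exclusion rules from this section can be summarized in a few lines of plain Java. This is a toy model rather than anything a JPA provider exposes: ToyListenerResolver and its parameters are hypothetical, and listeners are represented simply by their class names.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only, NOT the JPA API: computes the order in which listeners
// would be invoked for one entity class.
class ToyListenerResolver {

    // Defaults run first, then superclass listeners, then the entity's own
    // listeners. Within each list, the declared order is preserved.
    static List<String> resolve(List<String> defaultListeners,
                                List<String> superclassListeners,
                                List<String> ownListeners,
                                boolean excludeDefaultListeners,
                                boolean excludeSuperclassListeners) {
        List<String> order = new ArrayList<>();
        if (!excludeDefaultListeners) {
            order.addAll(defaultListeners);
        }
        if (!excludeSuperclassListeners) {
            order.addAll(superclassListeners);
        }
        order.addAll(ownListeners);
        return order;
    }
}
```

For the Seller example, resolving with ActionBazaarAuditor as the default listener, UserMonitor on the superclass, and SellerMonitor on the entity yields all three in that order when nothing is excluded, and only SellerMonitor when both exclusion flags are set.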

9.5. Entity operations best practices

Throughout this chapter, we have provided you with some hints on the best practices of using the EntityManager interface. Before we conclude, let’s solidify the discussion of best practices by exploring a few of the most important ones in detail.

Use container-managed entity managers. If you’re building an enterprise application that will be deployed to a Java EE container, we strongly recommend that you use container-managed entity managers. Furthermore, if you are manipulating entities from the session bean and MDB tier, you should use declarative transactions in conjunction with container-managed EntityManagers. Overall, this will let you focus on application logic instead of worrying about the mundane details of managing transactions, managing EntityManager lifecycles, and so on.

Avoid injecting entity managers into the web tier. If you’re using the EntityManager API directly from a component in the web tier, we recommend that you avoid injecting entity managers because the EntityManager interface is not thread-safe. Instead, you should use a JNDI lookup to grab an instance of a container-managed EntityManager. Better yet, use the session Façade pattern discussed in chapter 12 instead of using the EntityManager API directly from the web tier and take advantage of the benefits offered through session beans such as declarative transaction management.

Use the Entity Access Object pattern. Instead of cluttering your business logic with EntityManager API calls, use the data access object pattern (we call it the entity access object pattern) discussed in chapter 12. This practice allows you to abstract the EntityManager API from the business tier.

Separate callbacks into external listeners. Don’t pollute your domain model with crosscutting concerns such as auditing and logging code. Use external entity listener classes instead. This way, you could swap listeners in and out as needed.

9.6. Summary

In this chapter, we covered the most vital aspect of JPA: persistence operations using entity managers. We also explored persistence contexts, persistence scopes, various types of entity managers, and their usage patterns. We even briefly covered entity lifecycles and listeners. We highly recommend Java EE container-managed persistence contexts with CMT-driven transactions. In general, this strategy minimizes the amount of careful consideration of what is going on behind the scenes and consequent meticulous coding you might have to engage in otherwise. However, there are some valid reasons for using application-managed EntityManagers. In particular, the ability to use the EntityManager outside the confines of a Java EE container is an extremely valuable feature, especially to those of us using lightweight technologies like Apache Tomcat and the Spring Framework or even Java SE Swing-based client/server applications.

We avoided covering a few relatively obscure EntityManager features like lazily obtaining entity references via the getReference method, or using the clear method to force detachment of all entities in a persistence context. We encourage you to research these remaining features on your own. However, a critical feature that we did not discuss yet is robust entity querying using the powerful query API and JPQL. We’ll examine this in detail in the next chapter.