Chapter 8. Object-relational mapping – EJB 3 in Action

This chapter covers

  • Object-relational mapping concepts
  • Mapping entities to tables
  • Relationship mapping
  • Inheritance mapping strategies

In the previous chapter, we used EJB 3 JPA features to create a POJO domain model that supported a full range of OO features, including inheritance. We discussed entities, embedded objects, and the relationships between them using EJB 3 annotations. In this chapter, you’ll learn how to persist our domain model into a relational database using object-relational mapping (ORM), which is the basis for JPA. In effect, ORM specifies how sets of Java objects, including references between them, are mapped to rows and columns in database tables. The first part of this chapter briefly discusses the difference between the object-oriented and relational world, also known as “impedance mismatch.” Later sections of the chapter explore the ORM features of the EJB 3 JPA.

If you are a seasoned enterprise developer, you are probably comfortable with relational databases. If this is not the case, then refer to appendix B for a primer on some relatively obscure relational database concepts such as normalization and sequence columns that you must grasp to have a clear understanding of the intricacies of O/R mapping.

We start our discussion by taking a look at the basic motivation behind O/R mapping, the so-called impedance mismatch. Then we’ll begin our analysis by mapping domain objects, and move on to mapping relations. Finally, we’ll examine the concept of mapping inheritance, and you’ll learn about the inheritance strategies supported by JPA.

8.1. The impedance mismatch

The term impedance mismatch refers to the differences in the OO and relational paradigms and difficulties in application development that arise from these differences. The persistence layer where the domain model resides is where the impedance mismatch is usually the most apparent. The root of the problem lies in the differing fundamental objectives of both technologies.

Recall that when a Java object holds a reference to another, the actual referred object is not copied over into the referring object. In other words, Java accesses objects by reference and not by value. For example, two different Item objects containing the same category instance variable value really point to the same Category object in the JVM. This fact frees us from space efficiency concerns in implementing domain models with a high degree of conceptual abstraction. If this were not the case, we’d probably store the identity of the referred Category object (perhaps in an int variable) inside the Item and materialize the link when necessary. This is in fact almost exactly what is done in the relational world.

The JVM also offers the luxury of inheritance and polymorphism (by means that are very similar to the object reference feature), which do not exist in the relational world. Lastly, as we mentioned in the previous chapter, a rich domain model object includes behavior (methods) in addition to attributes (data in instance variables). Database tables, on the other hand, inherently encapsulate only rows, columns, and constraints, and not business logic. These differences mean that the relational and OO models of the same conceptual problem look very different, especially for an appropriately normalized database created by an experienced DBA. Table 8.1 summarizes some of the overt mismatches between the object and relational worlds.

Table 8.1. The impedance mismatch: obvious differences between the object and relational worlds

OO Model (Java)                          Relational Model
---------------------------------------  -----------------------------------------------------------
Object, classes                          Table, rows
Attributes, properties                   Columns
Identity                                 Primary key
Relationship/reference to other entity   Foreign key
Inheritance/polymorphism                 Not supported
Methods                                  Indirect parallel to SQL logic, stored procedures, triggers
Code is portable                         Not necessarily portable, depending on vendor

In the following sections, we’ll crystallize the object-relational mismatch a little more by looking at a few corner cases while saving a persistence layer domain model into the database. (As you’ll recall from chapter 2, a corner case is a problem or situation that occurs only outside normal operating parameters.) We’ll also discuss problems in mapping objects to database tables and provide a brief overview of ORM.

8.1.1. Mapping objects to databases

The most basic persistence layer for a Java application could consist of saving and retrieving domain objects using the JDBC API directly. To flush out the particularly rough spots in the object-relational mismatch, we’ll assume automated ORM does not exist and that we are following the direct JDBC route to persistence. Later we’ll see that the EJB 3 Persistence API irons out these rough spots through simple configuration. Scott Ambler has written an interesting article that discusses the problem of mapping objects to a relational database (www.agiledata.org/essays/mappingObjects.html).

One-to-one mapping

As we discussed in the previous chapter, one-to-one relationships between entities, though rare in applications, make a great deal of sense in the domain-modeling world. For example, the User and BillingInfo objects represent two logically separate concepts in the real world (we assume) that are bound by a one-to-one relationship. Moreover, we also know that it does not make very much sense for a BillingInfo object to exist without an associated User. The relationship could be unidirectional from User to BillingInfo. Figure 8.1 shows this relationship, and listing 8.1 implements it.

Figure 8.1. A unidirectional one-to-one (optional) relationship between User and BillingInfo

Listing 8.1. One-to-one relationship between User and BillingInfo
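A minimal plain-Java sketch of such a relationship might look like the following. It is an illustration only: the field names are assumptions borrowed from the column names used later in this section, and the JPA annotations are left out so that the snippet compiles without the persistence API.

```java
import java.util.Date;

// Hypothetical sketch of the two domain classes; field names are assumptions.
class BillingInfo {
    long billingId;
    String creditCardType;
    String creditCardNumber;
    String nameOnCreditCard;
    Date creditCardExpiration;
    String bankAccountNumber;
    String bankName;
    String routingNumber;
}

class User {
    long userId;
    String email;
    // Unidirectional, optional link: a User knows its BillingInfo,
    // but BillingInfo holds no reference back to the User.
    BillingInfo billing;   // may be null
}

class OneToOneDemo {
    public static void main(String[] args) {
        User user = new User();
        user.userId = 1L;
        user.email = "user@example.com";

        BillingInfo info = new BillingInfo();
        info.bankName = "Example Bank";
        user.billing = info;   // the object reference is the whole relationship

        System.out.println(user.email + " -> " + user.billing.bankName);
    }
}
```

Note that the relationship is nothing more than an object reference; there is no key column anywhere in the Java code.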

From an OO perspective, it would make sense for the database tables storing this data to mirror the Java implementation in listing 8.1. In this scheme, two different tables, USERS and BILLING_INFO, would have to be created, with the billing object reference in the User object translated into a foreign key to the BILLING_INFO table’s key in the USERS table (perhaps called BILLING_ID). The problem is that this scheme does not make complete sense in the relational world. As a matter of fact, since the objects are merely expressing a one-to-one relationship, normalization would dictate that the USERS and BILLING_INFO tables be merged into one. This would eliminate the almost pointless BILLING_INFO table and the redundant foreign key in the USERS table. The extended USERS table could look like this:

USER_ID                 NOT NULL, PRIMARY KEY  NUMBER
EMAIL                   NOT NULL               VARCHAR2(255)
CREDIT_CARD_TYPE                               VARCHAR2(255)
CREDIT_CARD_NUMBER                             VARCHAR2(255)
NAME_ON_CREDIT_CARD                            VARCHAR2(255)
CREDIT_CARD_EXPIRATION                         DATE
BANK_ACCOUNT_NUMBER                            VARCHAR2(255)
BANK_NAME                                      VARCHAR2(255)
ROUTING_NUMBER                                 VARCHAR2(255)

In effect, our persistence layer mapping code would have to resolve this difference by pulling field data out of both the User and related BillingInfo objects and storing it into the columns of the combined USERS table. A bad approach, but an all-too-common one, would be to compromise your domain model to make it fit the relational data model (get rid of the separate BillingInfo object). While this would certainly make the mapping code simpler, you would lose out on a sensible domain model. In addition, you would write awkward code for the parts of your application that deal only with the BillingInfo object and not the User object. If you remember our discussion in chapter 7, then you probably realize that BillingInfo may make sense as an embedded object, since you do not want it to have a separate identity and want to store its data in the USERS table.
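To make the hand-written mapping step concrete, here is a small sketch of flattening the two objects into a single USERS row. The trimmed-down User and BillingInfo classes, the UserRowMapper helper, and the use of a Map to stand in for a JDBC statement are all assumptions made for illustration; only a few of the USERS columns are shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Trimmed-down, hypothetical versions of the two domain classes.
class BillingInfo {
    String creditCardType;
    String creditCardNumber;
    String bankName;
}

class User {
    long userId;
    String email;
    BillingInfo billing;   // optional
}

class UserRowMapper {
    /** Flattens a User and its optional BillingInfo into one USERS row. */
    static Map<String, Object> toUsersRow(User user) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("USER_ID", user.userId);
        row.put("EMAIL", user.email);

        // Billing columns live in the same row; they stay NULL without billing data.
        BillingInfo b = user.billing;
        row.put("CREDIT_CARD_TYPE", b == null ? null : b.creditCardType);
        row.put("CREDIT_CARD_NUMBER", b == null ? null : b.creditCardNumber);
        row.put("BANK_NAME", b == null ? null : b.bankName);
        return row;
    }
}
```

With real JDBC, each map entry would become a parameter of an INSERT or UPDATE statement against USERS. The point is that the object graph has two nodes while the table has one row.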

One-to-many relationships

The relational primary-key/foreign-key mechanism is ideally suited for a parent-child one-to-many relationship between tables. Let’s take the probable relationship between the ITEMS and BIDS tables, for example. The tables will probably look like those shown in listing 8.2.

Listing 8.2. One-to-many relationship between ITEMS and BIDS tables

The ITEM_ID foreign key into the ITEMS table from the BIDS table means that multiple BIDS table rows can refer to the same record in the ITEMS table. This implements a many-to-one relationship going from the BIDS table to the ITEMS table, and it is simple to retrieve an item given a bid record. On the other hand, retrieval from ITEMS to BIDS will require a little more effort in looking for BIDS rows that match a given ITEM_ID key. As we mentioned in the previous chapter, however, the relationship between the Item and Bid domain objects is one-to-many bidirectional. This means that the Item object has a reference to a set of Bid objects while the Bid object holds a reference to an Item object. As a Java developer, you might have expected the ITEMS table to contain some kind of reference to the BIDS table in addition to the ITEM_ID foreign key in the BIDS table. The problem is that such a table structure simply does not make sense in the relational world. Instead, our ORM layer must translate the parent-child unidirectional database relationship into a bidirectional one-to-many relationship in the OO world by using a lookup scheme instead of simple, directional references.
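The lookup scheme can be sketched in a few lines of plain Java. This is only an illustration: the BidRow class stands in for a row of the BIDS table, the in-memory list stands in for a query such as SELECT ... FROM BIDS WHERE ITEM_ID = ?, and the bidId and amount fields are assumed names.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class Item {
    final long itemId;
    final Set<Bid> bids = new LinkedHashSet<>();   // the "many" side
    Item(long itemId) { this.itemId = itemId; }
}

class Bid {
    final long bidId;
    final double amount;
    Item item;                                     // back-reference to the "one" side
    Bid(long bidId, double amount) { this.bidId = bidId; this.amount = amount; }
}

/** One row of the BIDS table; itemId plays the role of the ITEM_ID foreign key. */
class BidRow {
    final long bidId;
    final long itemId;
    final double amount;
    BidRow(long bidId, long itemId, double amount) {
        this.bidId = bidId;
        this.itemId = itemId;
        this.amount = amount;
    }
}

class BidLoader {
    /** Finds the rows whose foreign key matches the item and wires both directions. */
    static void attachBids(Item item, List<BidRow> bidsTable) {
        for (BidRow row : bidsTable) {
            if (row.itemId == item.itemId) {
                Bid bid = new Bid(row.bidId, row.amount);
                bid.item = item;        // many-to-one direction
                item.bids.add(bid);     // one-to-many direction
            }
        }
    }
}
```

The database stores the link once, in the child row; the Java side ends up with two references that must be kept in sync.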

Many-to-many

Many-to-many relationships are common in enterprise development. In our ActionBazaar domain model presented in chapter 7, the relationship between the Item and Category domain objects is many-to-many. That is, an item can belong in multiple categories while a category can contain more than one item. This is fairly easy to implement in the OO world with a set of references on either side of the relationship. In the database world, on the other hand, the only way to implement a relationship is through a foreign key, which is inherently one-to-many. As a result, the only real way to implement many-to-many relationships is by breaking them down into two one-to-many relationships. Let’s see how this works by taking a look at the database table representation of the item-category relationship in listing 8.3.

Listing 8.3. Many-to-many relationship between ITEMS and CATEGORIES tables

The CATEGORIES_ITEMS table is called an association or intersection table and accomplishes a pretty neat trick. The only two columns it contains are foreign key references to the ITEMS and CATEGORIES tables (ironically, the two foreign keys combined are the primary key for the table). In effect, it makes it possible to match up arbitrary rows of the two related tables, which is what allows us to implement many-to-many relationships. Since neither related table contains a foreign key, relationship direction is completely irrelevant. To get to the records on the other side of the relationship from either side, we must perform a join in the ORM layer involving the association table. For example, to get all the items under a category, we must retrieve the CATEGORY_ID, join the CATEGORIES_ITEMS table with the ITEMS table, and retrieve all item data for rows that match the CATEGORY_ID foreign key. Saving the relationship into the database would involve saving a row into the CATEGORIES_ITEMS table that links rows stored in the CATEGORIES and ITEMS tables. Clearly, many-to-many relationships are modeled differently in the relational world than they are in the OO world.
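The join through the association table can be imitated in memory as follows. This is a sketch under stated assumptions: CategoryItemRow represents one row of CATEGORIES_ITEMS, the map of item IDs to titles stands in for the ITEMS table, and the title attribute is an invented example column.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** One row of CATEGORIES_ITEMS: nothing but the two foreign keys. */
class CategoryItemRow {
    final long categoryId;
    final long itemId;
    CategoryItemRow(long categoryId, long itemId) {
        this.categoryId = categoryId;
        this.itemId = itemId;
    }
}

class AssociationLookup {
    /** Imitates joining CATEGORIES_ITEMS with ITEMS for one CATEGORY_ID. */
    static List<String> itemTitlesIn(long categoryId,
                                     List<CategoryItemRow> categoriesItems,
                                     Map<Long, String> itemTitlesById) {
        List<String> titles = new ArrayList<>();
        for (CategoryItemRow link : categoriesItems) {
            if (link.categoryId == categoryId) {
                titles.add(itemTitlesById.get(link.itemId));
            }
        }
        return titles;
    }

    /** The same rows answer the opposite question: which categories hold an item? */
    static List<Long> categoryIdsOf(long itemId, List<CategoryItemRow> categoriesItems) {
        List<Long> ids = new ArrayList<>();
        for (CategoryItemRow link : categoriesItems) {
            if (link.itemId == itemId) {
                ids.add(link.categoryId);
            }
        }
        return ids;
    }
}
```

Both lookups read the same rows, which is why direction does not matter on the relational side.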

Inheritance

Compared with the three previous cases (one-to-one, one-to-many, and many-to-many), inheritance is probably the most severe case of the object-relational mismatch. The OO concept of inheritance has no direct equivalent in the relational world, so it calls for solutions that are not elegant fits to relational theory at all. However, there are a few creative ways that O/R solutions bridge this gap, including:

  • Storing each object in the inheritance hierarchy in completely separated tables
  • Mapping all classes into a single table
  • Storing superclass/subclasses in related tables

Because none of these strategies is simple, we’ll save a detailed discussion for later in the chapter (section 8.4).

8.1.2. Introducing O/R mapping

In the most general sense, the term object-relational mapping means any process that can save an object (in our case a Java Object) into a relational database. As we mentioned, for all intents and purposes you could write home-brewed JDBC code to do that. In the realm of automated persistence, ORM means using primarily configuration metadata to tell an extremely high-level API which tables a set of Java Objects are going to be saved into. It involves the hopefully simple act of figuring out what table row an Object instance should be saved into and what field/property data belongs in what column. In EJB 3, the configuration metadata obviously consists either of annotations or deployment descriptor elements, or both. As our impedance mismatch discussion points out, there are a few wrinkles in the idealistic view of automated persistence. Because of the inherent complexity of the problem, EJB 3 cannot make the solution absolutely effortless, but it goes a long way in making it less painful. In the next section, we start our discussion of EJB 3 ORM by covering the simple case of saving an entity without regard to domain relations or inheritance.


ORM portability in EJB 2

One of the greatest weaknesses of EJB 2 container-managed persistence (CMP) entity beans was that EJB 2 never standardized the process of ORM. Instead, mapping strategies were left up to the individual vendors, whose approaches varied widely. As a result, porting entity beans from one application server to another more or less meant redoing the O/R mapping all over again. In practice, the portability that EJB 2 promised amounted to little more than empty words.

EJB 3 firmly standardizes O/R mapping and gets us much closer to the goal of portability. As a matter of fact, as long as you are careful to steer clear of application server-specific features, you can likely achieve portability.


Other than smoothing out the impedance problems by applying generalized strategies behind the scenes, there are a few other benefits to using O/R mapping. Even disregarding the edge cases discussed earlier, if you have spent any time writing application persistence layers using JDBC, you know that substantial work is required. This is largely because of the repetitive “plumbing” code of JDBC and the large volume of complicated handwritten SQL involved. As you’ll soon see, using ORM frees you from this burden and the task of persistence largely becomes an exercise in simple configuration. The fact that the EJB 3 persistence provider generates JDBC and SQL code on your behalf has another very nice effect: because the persistence provider is capable of automatically generating code optimized to your database platform from your database-neutral configuration data, switching databases becomes a snap. Accomplishing the same using handwritten SQL is tedious work at best and impossible at worst. Database portability is one of the most appealing features of the EJB 3 Java Persistence API that fits nicely with the Java philosophy, but has been elusive for some time.

Now that we’ve looked at the reasons for O/R mapping, let’s see how EJB 3 implements it.

8.2. Mapping entities

This section explores some of the fundamental features of EJB 3 O/R mapping by taking a look at the implementation of the ActionBazaar User entity. You’ll see how to use several ORM annotations such as @Table, @Column, @Enumerated, @Lob, @Temporal, and @Embeddable.


Note

If you remember from our discussion in chapter 7, User is the superclass to both the Seller and Bidder domain objects. To keep this example simple, we’ll ignore the inheritance and use a persistent field to identify the user type in the table USERS. We will use the same example to demonstrate inheritance mapping in section 8.4.


The User entity contains fields that are common to all user types in ActionBazaar, such as the user ID, username (used for login and authentication), first name, last name, user type (bidder, seller, admin, etc.), user-uploaded picture, and user account creation date. All fields are mapped and persisted into the database for the User entity in listing 8.4. We have used field-based persistence for the entity to keep the code sample short.

Listing 8.4. Mapping an entity
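As a rough sketch of the kind of mapping this listing describes, assuming the JPA 1.0 API (javax.persistence) is on the classpath, the entity could look something like the code below. The USERS and USER_PICTURES table names and the USER_ID and PICTURE column names come from this section; every other column name, and all of the Address fields, are assumptions made for illustration.

```java
import java.io.Serializable;
import java.util.Date;
import javax.persistence.*;

// Sketch only: column names other than USER_ID and PICTURE are assumed.
@Entity
@Table(name="USERS")
@SecondaryTable(name="USER_PICTURES",
    pkJoinColumns=@PrimaryKeyJoinColumn(name="USER_ID"))
public class User implements Serializable {
    @Id
    @Column(name="USER_ID", nullable=false)
    protected Long userId;

    @Column(name="USER_NAME", nullable=false)
    protected String username;

    @Column(name="FIRST_NAME", nullable=false)
    protected String firstName;

    @Column(name="LAST_NAME", nullable=false)
    protected String lastName;

    @Enumerated(EnumType.ORDINAL)            // stored as 0, 1, 2, ...
    @Column(name="USER_TYPE", nullable=false)
    protected UserType userType;

    @Lob                                     // byte[] maps to a BLOB
    @Basic(fetch=FetchType.LAZY)             // a hint; providers may ignore it
    @Column(name="PICTURE", table="USER_PICTURES")
    protected byte[] picture;

    @Temporal(TemporalType.DATE)
    @Column(name="CREATION_DATE", nullable=false)
    protected Date creationDate;

    @Embedded                                // columns come from Address
    protected Address address;
}

enum UserType {SELLER, BIDDER, CSR, ADMIN}

@Embeddable
class Address implements Serializable {
    @Column(name="STREET", nullable=false)
    protected String street;

    @Column(name="CITY", nullable=false)
    protected String city;

    @Column(name="STATE", nullable=false)
    protected String state;

    @Column(name="ZIP_CODE", nullable=false)
    protected String zipCode;
}
```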

Briefly scanning listing 8.4, you see that the User entity is mapped to the USERS table joined with the USER_PICTURES table using the USER_ID primary key. Each of the fields is mapped to a database column using the @Column annotation. We deliberately made the listing feature-rich, and quite a few interesting things are going on with the columns. The userType field is restricted to be an ordinal enumeration. The picture field is marked as a binary large object (BLOB) that is lazily loaded. The creationDate field is marked as a temporal type. The address field is an embedded object. The column mapping for the Address object is defined in the object using the @Column annotation. All in all, the @Table, @SecondaryTable, @Column, @Enumerated, @Lob, @Basic, @Temporal, @Embedded, and @Embeddable ORM annotations are used. Let’s start our analysis of O/R with the @Table annotation.


Annotations vs. XML in O/R mapping

The difficulty of choosing between annotations and XML deployment descriptors manifests itself most strikingly in the arena of EJB 3 O/R mapping. XML descriptors are verbose and hard to manage, and most developers find them to be a pain-point for Java EE. While O/R mapping with annotations makes life simpler, you should keep in mind that you are hard-coding your database schema in your code in a way similar to using JDBC. This means that the slightest schema change will result in a recompilation and redeployment cycle as opposed to simple configuration. If you have a stable database design that rarely changes or you are comfortable using JDBC data access objects (DAOs), then there is no issue here. But if you have an environment where the database schema is less stable (subject to change more often), you’re probably better off using descriptors. Luckily, you can use XML descriptors to override ORM annotations after deploying to a production environment. As a result, changing your mind in response to the reality on the ground may not be a big deal.


8.2.1. Specifying the table

@Table specifies the table containing the columns to which the entity is mapped. In listing 8.4, the @Table annotation makes the USERS table’s columns available for ORM. In fact, by default all the persistent data for the entity is mapped to the table specified by the annotation’s name parameter. As you can see from the annotation’s definition here, it contains a few other parameters:

@Target(TYPE)
@Retention(RUNTIME)
public @interface Table {
    String name() default "";
    String catalog() default "";
    String schema() default "";
    UniqueConstraint[] uniqueConstraints() default {};
}

The @Table annotation itself is optional. If it’s omitted, the entity is assumed to be mapped to a table in the default schema with the same name as the entity class. If the name parameter is omitted, the table name is assumed to be the same as the name of the entity. This will be just fine if we are mapping to the USER table.


Note

Most persistence providers include a great developer-friendly feature known as automatic schema generation. The persistence provider will automatically create database objects for your entities when they do not exist in the database. This behavior is not mandated by specification and is configured using vendor-specific properties. Most of our code examples rely on automatic schema generation. In chapter 11, we’ll provide an example configuration to enable automatic schema generation.


We won’t discuss the catalog and schema parameters in depth since they are hardly ever used. In effect, they allow you to fully qualify the mapped table. For example, we could have explicitly specified that the USERS table belongs in the ACTIONBAZAAR schema like so:

@Table(name="USERS", schema="ACTIONBAZAAR")
public class User

Note

We’ve already discussed what a schema is. For all intents and purposes, you can think of a catalog as a “meta-schema” or a higher-level abstraction for organizing schemas. Often, a database will only have one common system catalog.


By default, it is assumed that the table belongs in the schema of the data source used. You’ll learn how to specify a data source for a persistence module in chapter 11 when we discuss entity packaging. The uniqueConstraints parameter is not used that often either. It specifies unique constraints on table columns and is only used when table autocreation is enabled. Here’s an example:

@Table(name="CATEGORIES",
    uniqueConstraints=
        {@UniqueConstraint(columnNames={"CATEGORY_ID"})})

If the CATEGORIES table does not exist and autogeneration is enabled, this code puts a unique constraint on the CATEGORY_ID column when the table is created at deployment time. The uniqueConstraints parameter supports specifying constraints on more than one column. It is important to keep in mind, however, that EJB 3 implementations are not mandated to support generation of tables, and it is a bad idea to use automatic table generation beyond simple development databases. Most entities will typically be mapped to a single table. The User object happens to be mapped to two tables, as you might have guessed from the @SecondaryTable annotation used in listing 8.4. We’ll come back to this later after we take a look at mapping entity data using the @Column annotation.

8.2.2. Mapping the columns

The @Column annotation maps a persisted field or property to a table column. All of the fields used in listing 8.4 are annotated with @Column. For example, the userId field is mapped to the USER_ID column:

@Column(name="USER_ID")
protected Long userId;

It is assumed that the USER_ID column belongs to the USERS table specified by the @Table annotation. Most often, this is as simple as your @Column annotation will look. At most, you might need to explicitly specify which table the persisted column belongs to (when you map your entity to multiple tables using the @SecondaryTable annotation) as we do for the picture field in listing 8.4:

@Column(name="PICTURE", table="USER_PICTURES")
...
protected byte[] picture;

As you can see from the definition in listing 8.5, a number of other parameters exist for the annotation.

Listing 8.5. The @Column annotation
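For reference, the sketch below declares a local stand-in that mirrors the elements and defaults of javax.persistence.Column as defined by JPA 1.0. It is declared locally only so that the snippet compiles without the JPA API on the classpath; the ColumnDefaults helper is a hypothetical addition used to read the defaults back.

```java
import static java.lang.annotation.ElementType.FIELD;
import static java.lang.annotation.ElementType.METHOD;
import static java.lang.annotation.RetentionPolicy.RUNTIME;

import java.lang.annotation.Retention;
import java.lang.annotation.Target;

// Local stand-in mirroring javax.persistence.Column (JPA 1.0).
@Target({METHOD, FIELD})
@Retention(RUNTIME)
@interface Column {
    String name() default "";
    boolean unique() default false;
    boolean nullable() default true;
    boolean insertable() default true;
    boolean updatable() default true;
    String columnDefinition() default "";
    String table() default "";
    int length() default 255;
    int precision() default 0;
    int scale() default 0;
}

// Hypothetical helper: looks up the declared default of one element.
class ColumnDefaults {
    static Object defaultOf(String element) {
        try {
            return Column.class.getMethod(element).getDefaultValue();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such element: " + element, e);
        }
    }
}
```

For example, ColumnDefaults.defaultOf("length") yields 255, which is why an unannotated String field typically ends up as a 255-character column when tables are generated.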

The insertable and updatable parameters are used to control persistence behavior. If the insertable parameter is set to false, the field or property will not be included in the INSERT statement generated by the persistence provider to create a new record corresponding to the entity. Likewise, setting the updatable parameter to false excludes the field or property from being updated when the entity is saved. These two parameters are usually helpful in dealing with read-only data, like primary keys generated by the database. They could be applied to the userId field as follows:

@Column(name="USER_ID", insertable=false, updatable=false)
protected Long userId;

When a User entity is first created in the database, the persistence provider does not include the USER_ID as part of the generated INSERT statement. Instead, we could be populating the USER_ID column through an INSERT-induced trigger on the database server side. Similarly, since it does not make much sense to update a generated key, it is not included in the UPDATE statement for the entity either. The rest of the parameters of the @Column annotation are only used for automatic table generation and specify column creation data. The nullable parameter specifies whether the column supports null values, the unique parameter indicates if the column has a unique constraint, the length parameter specifies the size of the database column, the precision parameter specifies the precision of a decimal field, and the scale parameter specifies the scale of a decimal column. Finally, the columnDefinition parameter allows you to specify the exact SQL to create the column.
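The effect of insertable and updatable on the generated statements can be pictured with a toy SQL builder. This is not how any particular persistence provider is implemented; ColumnMeta and SqlSketch are hypothetical names, and the FIRST_NAME column in the usage note is an assumed example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-column metadata, as a provider might derive it from @Column. */
class ColumnMeta {
    final String name;
    final boolean insertable;
    final boolean updatable;
    ColumnMeta(String name, boolean insertable, boolean updatable) {
        this.name = name;
        this.insertable = insertable;
        this.updatable = updatable;
    }
}

class SqlSketch {
    /** Builds an INSERT that skips columns marked insertable=false. */
    static String insertSql(String table, List<ColumnMeta> columns) {
        List<String> names = new ArrayList<>();
        List<String> params = new ArrayList<>();
        for (ColumnMeta c : columns) {
            if (c.insertable) {
                names.add(c.name);
                params.add("?");
            }
        }
        return "INSERT INTO " + table + " (" + String.join(", ", names)
                + ") VALUES (" + String.join(", ", params) + ")";
    }

    /** Builds an UPDATE that skips columns marked updatable=false. */
    static String updateSql(String table, String keyColumn, List<ColumnMeta> columns) {
        List<String> assignments = new ArrayList<>();
        for (ColumnMeta c : columns) {
            if (c.updatable) {
                assignments.add(c.name + " = ?");
            }
        }
        return "UPDATE " + table + " SET " + String.join(", ", assignments)
                + " WHERE " + keyColumn + " = ?";
    }
}
```

Given a USER_ID column with both flags set to false and a FIRST_NAME column with both set to true, insertSql("USERS", columns) produces INSERT INTO USERS (FIRST_NAME) VALUES (?), leaving USER_ID for the database to fill in.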

We won’t cover these parameters much further than this basic information since we do not encourage automatic table creation. Note that the @Column annotation is optional. If omitted, a persistent field or property is saved to the table column matching the field or property name. For example, a property specified by the getName and setName methods will be saved into the NAME column of the table for the entity.

Next, let’s take a look at a few more annotations applied to entity data, starting with @Enumerated.

8.2.3. Using @Enumerated

Languages like C and Pascal have had enumerated data types for decades. Enumerations were finally introduced in Java 5. In case you are unfamiliar with them, we’ll start with the basics. In listing 8.4, the user type field has a type of UserType. UserType is a Java enumeration that is defined as follows:

public enum UserType {SELLER, BIDDER, CSR, ADMIN};

This effectively means that any data type defined as UserType (like our persistent field in the User object) can only have the four values listed in the enumeration. Like an array, each element of the enumeration is associated with an index called the ordinal. For example, the UserType.SELLER value has an ordinal of 0, the UserType.BIDDER value has an ordinal of 1, and so on. The problem is determining how to store the value of enumerated data into the column. The Java Persistence API supports two options through the @Enumerated annotation. In our case, we specify that the ordinal value should be saved into the database:

@Enumerated(EnumType.ORDINAL)
...
protected UserType userType;

This means that if the value of the field is set to UserType.SELLER, the value 0 will be stored into the database. Alternatively, you can specify that the enumeration value name should be stored as a String:

@Enumerated(EnumType.STRING)
...
protected UserType userType;

In this case a UserType.ADMIN value would be saved into the database as "ADMIN". By default an enumerated field or property is saved as an ordinal. This would be the case if the @Enumerated annotation is omitted altogether, or no parameter to the annotation is specified.
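The difference between the two options can be seen with plain Java, without any persistence code at all. The EnumStorageDemo class below is a hypothetical helper; it only shows which value each strategy would hand to the database and how that value is turned back into an enum constant.

```java
enum UserType {SELLER, BIDDER, CSR, ADMIN}

class EnumStorageDemo {
    /** The value an ORDINAL mapping would store. */
    static int asOrdinal(UserType type) {
        return type.ordinal();
    }

    /** The value a STRING mapping would store. */
    static String asString(UserType type) {
        return type.name();
    }

    /** Reading an ordinal back depends on the declaration order of the constants. */
    static UserType fromOrdinal(int ordinal) {
        return UserType.values()[ordinal];
    }

    /** Reading a name back depends on the spelling of the constants. */
    static UserType fromString(String name) {
        return UserType.valueOf(name);
    }

    public static void main(String[] args) {
        System.out.println(asOrdinal(UserType.SELLER));  // 0
        System.out.println(asString(UserType.ADMIN));    // ADMIN
        System.out.println(fromOrdinal(1));              // BIDDER
    }
}
```

One design consequence worth keeping in mind: ordinals are compact, but inserting or reordering constants in the enum silently changes the meaning of rows already stored, whereas stored names survive reordering but not renaming.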

8.2.4. Mapping CLOBs and BLOBs

An extremely powerful feature of relational databases is the ability to store very large data as binary large object (BLOB) and character large object (CLOB) types. These correspond to the JDBC java.sql.Blob and java.sql.Clob objects. The @Lob annotation designates a property or field as a CLOB or BLOB. For example, we designate the picture field as a BLOB in listing 8.4:

@Lob
@Basic(fetch=FetchType.LAZY)
protected byte[] picture;

Whether a field or property designated @Lob is a CLOB or a BLOB is determined by its type. If the data is of type char[] or String, the persistence provider maps the data to a CLOB column. Otherwise, the column is mapped as a BLOB. An extremely useful annotation to use in conjunction with @Lob is @Basic. @Basic can be marked on any attribute with direct-to-field mapping. Just as we have done for the picture field, the @Basic(fetch=FetchType.LAZY) specification causes the BLOB or CLOB data to be loaded from the database only when it is first accessed. Postponing the loading of entity data from the database is known as lazy loading. (You will learn more about lazy loading in chapter 9.) This is a great feature since LOB data is usually very memory intensive and should only be loaded if needed. Unfortunately, lazy loading of LOB types is left as optional for vendors by the EJB 3 specification and there is no guarantee that the column will actually be lazily loaded.
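The idea behind lazy loading can be illustrated with a small holder class. This is a conceptual sketch only; persistence providers typically rely on proxies or bytecode enhancement rather than anything like the hypothetical LazyBlob class shown here, and the Supplier stands in for the database read.

```java
import java.util.function.Supplier;

/** Conceptual sketch: defers an expensive load until the data is first requested. */
class LazyBlob {
    private final Supplier<byte[]> loader;   // stands in for the database read
    private byte[] data;
    private boolean loaded;

    LazyBlob(Supplier<byte[]> loader) {
        this.loader = loader;
    }

    /** Triggers the load on first access and reuses the result afterwards. */
    byte[] get() {
        if (!loaded) {
            data = loader.get();
            loaded = true;
        }
        return data;
    }

    boolean isLoaded() {
        return loaded;
    }
}
```

Code that never calls get() never pays for the load, which is the behavior we hope for when a User is fetched just to display a name.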

8.2.5. Mapping temporal types

Most databases support a few different temporal data types with different granularity levels corresponding to DATE (storing day, month, and year), TIME (storing just time and not day, month, or year) and TIMESTAMP (storing time, day, month, and year). The @Temporal annotation specifies which of these data types we want to map a java.util.Date or java.util.Calendar persistent data type to. In listing 8.4, we save the creationDate field into the database as a DATE:

@Temporal(TemporalType.DATE)
protected Date creationDate;

Note that this explicit mapping is redundant when using the java.sql.Date, java.sql.Time, or java.sql.Timestamp Java types. If we do not specify a parameter for the @Temporal annotation or omit it altogether, the persistence provider will assume the data type mapping to be TIMESTAMP (the finest possible data granularity).
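The three granularity levels line up with the three java.sql types, so a quick way to see what each one keeps is to convert the same instant into each of them. The TemporalGranularity class and the sample date are assumptions made for illustration; the conversions use the default time zone of the JVM.

```java
import java.sql.Date;
import java.sql.Time;
import java.sql.Timestamp;

class TemporalGranularity {
    /** What a DATE column keeps: year, month, and day only. */
    static String asDate(java.util.Date value) {
        return new Date(value.getTime()).toString();
    }

    /** What a TIME column keeps: hours, minutes, and seconds only. */
    static String asTime(java.util.Date value) {
        return new Time(value.getTime()).toString();
    }

    /** What a TIMESTAMP column keeps: the date and the time together. */
    static String asTimestamp(java.util.Date value) {
        return new Timestamp(value.getTime()).toString();
    }

    public static void main(String[] args) {
        java.util.Date created = Timestamp.valueOf("2007-03-15 10:30:45");
        System.out.println(asDate(created));       // 2007-03-15
        System.out.println(asTime(created));       // 10:30:45
        System.out.println(asTimestamp(created));  // 2007-03-15 10:30:45.0
    }
}
```

Mapping creationDate as a DATE therefore discards the time of day, which is fine for an account creation date but would not be for, say, the moment a bid was placed.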

8.2.6. Mapping an entity to multiple tables

This is not often the case for nonlegacy databases, but sometimes an entity’s data must come from two different tables. In fact, in some rare situations, this is a very sensible strategy. For example, the User entity in listing 8.4 is stored across the USERS and USER_PICTURES tables, as shown in figure 8.2.

Figure 8.2. An entity can be mapped to more than one table; for example, the User entity spans more than one table: USERS and USER_PICTURES. The primary table is mapped using @Table and the secondary table is mapped using @SecondaryTable. The primary and secondary tables must have the same primary key.

This makes excellent sense because the USER_PICTURES table stores large binary images that could significantly slow down queries against the USERS table if they were stored there. Isolating the binary images into a separate table, in conjunction with the lazy loading technique discussed in section 8.2.4 for the picture field mapped to the USER_PICTURES table, can result in a significant boost in application performance. Even so, this approach is rarely used. The @SecondaryTable annotation enables us to derive entity data from more than one table and is defined as follows:

@Target({TYPE}) @Retention(RUNTIME)
public @interface SecondaryTable {
    String name();
    String catalog() default "";
    String schema() default "";
    PrimaryKeyJoinColumn[] pkJoinColumns() default {};
    UniqueConstraint[] uniqueConstraints() default {};
}

Notice that other than the pkJoinColumns element, the definition of the annotation is identical to the definition of the @Table annotation. This element is the key to how the annotation works. To see what we mean, examine the following code implementing the User entity mapped to two tables:

@Entity
@Table(name="USERS")
@SecondaryTable(name="USER_PICTURES",
    pkJoinColumns=@PrimaryKeyJoinColumn(name="USER_ID"))
public class User implements Serializable {
    ...
}

Obviously, the two tables in @Table and @SecondaryTable are related somehow and are joined to create the entity. This kind of relationship is implemented by creating a foreign key in the secondary table referencing the primary key in the first table. In this case, the foreign key also happens to be the primary key of the secondary table. To be precise, USER_ID is the primary key of the USER_PICTURES table and it references the primary key of the USERS table. The pkJoinColumns=@PrimaryKeyJoinColumn(name="USER_ID") specification assumes this relationship. The name element points to the USER_ID foreign key in the USER_PICTURES secondary table. The persistence provider is left to figure out what the primary key of the USERS table is, which also happens to be named USER_ID. The provider performs a join using the detected primary key in order to fetch the data for the User entity. In the extremely unlikely case that an entity consists of columns from more than two tables, we may group several @SecondaryTable annotations inside a single @SecondaryTables annotation on the same entity. We won’t cover this case here, but encourage you to explore it if needed.

Before we conclude the section on mapping entities, let’s discuss a vital feature of JPA: primary key generation.

8.2.7. Generating primary keys

When we identify a column or set of columns as the primary key, we essentially ask the database to enforce uniqueness. Primary keys that consist of business data are called natural keys. The classic example of this is SSN as the primary key for an EMPLOYEE table. CATEGORY_ID or EMPLOYEE_ID, on the other hand, are examples of surrogate keys. Essentially, surrogate keys are columns created explicitly to function as primary keys. Surrogate keys are popular and we highly recommend them, especially over compound keys.

There are three popular ways of generating primary key values: identities, sequences, and tables. Fortunately, all three strategies are supported via the @GeneratedValue annotation and switching is as easy as changing the configuration. Let’s start our analysis with the simplest case: using identities.

Identity columns as generators

Many databases such as Microsoft SQL Server support the identity column. You can use an identity constraint to manage the primary key for the User entity as follows:

@Id
@GeneratedValue(strategy=GenerationType.IDENTITY)
@Column(name="USER_ID")
protected Long userId;

This code assumes that an identity constraint exists on the USERS.USER_ID column. Note that when using IDENTITY as the generator type, the value for the identity field may not be available before the entity data is saved in the database because it is typically generated by the database when the record is inserted.

The two other strategies, SEQUENCE and TABLE, both require the use of an externally defined generator: a SequenceGenerator or TableGenerator must be created and set for the GeneratedValue. You’ll see how this works by first taking a look at the sequence generation strategy.

Database sequences as generators

To use sequence generators, first define a sequence in the database. The following is a sample sequence for the USER_ID column in an Oracle database:

CREATE SEQUENCE USER_SEQUENCE START WITH 1 INCREMENT BY 10;

We are now ready to create a sequence generator in EJB 3:

@SequenceGenerator(name="USER_SEQUENCE_GENERATOR",
sequenceName="USER_SEQUENCE",
initialValue=1, allocationSize=10)

The @SequenceGenerator annotation creates a sequence generator named USER_SEQUENCE_GENERATOR referencing the Oracle sequence we created and matching its setup. Naming the generator is critical since the @GeneratedValue annotation refers to it by name. The initialValue element is pretty self-explanatory; allocationSize specifies by how much the sequence is incremented when the provider allocates values from it, which is why it matches the INCREMENT BY setting of the database sequence. The default values for initialValue and allocationSize are 1 and 50, respectively. It’s handy that the sequence generator need not be created in the same entity in which it is used. As a matter of fact, any generator is shared among all entities in the persistence module and therefore each generator must be uniquely named in a persistence module. Finally, we can reimplement the generated key for the USER_ID column as follows:

@Id
@GeneratedValue(strategy=GenerationType.SEQUENCE,
    generator="USER_SEQUENCE_GENERATOR")
@Column(name="USER_ID")
protected Long userId;

Sequence tables as generators

Using table generators is just as simple as with a sequence generator. The first step is creating a table to use for generating values. The table must follow a general format like the following one created for Oracle:

CREATE TABLE SEQUENCE_GENERATOR_TABLE
(SEQUENCE_NAME VARCHAR2(80) NOT NULL,
SEQUENCE_VALUE NUMBER(15) NOT NULL,
PRIMARY KEY (SEQUENCE_NAME));

The SEQUENCE_NAME column is meant to store the name of a sequence, and the SEQUENCE_VALUE column is meant to store the current value of the sequence. The next step is to prepare the table for use by inserting the initial value manually as follows:

INSERT INTO
SEQUENCE_GENERATOR_TABLE (SEQUENCE_NAME, SEQUENCE_VALUE)
VALUES ('USER_SEQUENCE', 1);

In a sense, these two steps combined are the equivalent of creating the Oracle sequence in the second strategy. Despite the obvious complexity of this approach, one advantage is that the same sequence table can be used for multiple sequences in the application. We are now prepared to create the TableGenerator utilizing the table:

@TableGenerator(name="USER_TABLE_GENERATOR",
    table="SEQUENCE_GENERATOR_TABLE",
    pkColumnName="SEQUENCE_NAME",
    valueColumnName="SEQUENCE_VALUE",
    pkColumnValue="USER_SEQUENCE")

If you need to, you can specify the values for initialValue and allocationSize as well. Finally, we can use the table generator for USER_ID key generation:

@Id
@GeneratedValue(strategy=GenerationType.TABLE,
generator="USER_TABLE_GENERATOR")
@Column(name="USER_ID")
protected Long userId;
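To make the moving parts of the table strategy more concrete, the following is a small in-memory model of what a table-based generator conceptually does. It is plain Java, not JPA: the class TableGeneratorSketch and its methods are hypothetical names invented for this illustration, the map stands in for SEQUENCE_GENERATOR_TABLE, and real providers differ in how they interpret the stored value and how they lock the row.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified, in-memory model of a table-based key generator (not a JPA API). */
class TableGeneratorSketch {
    // Stands in for SEQUENCE_GENERATOR_TABLE: SEQUENCE_NAME -> SEQUENCE_VALUE.
    private final Map<String, Long> sequenceTable = new HashMap<>();
    private final String pkColumnValue;
    private final int allocationSize;
    private long next;      // next key to hand out
    private long blockEnd;  // first value beyond the current block (exclusive)

    TableGeneratorSketch(String pkColumnValue, long initialValue, int allocationSize) {
        this.pkColumnValue = pkColumnValue;
        this.allocationSize = allocationSize;
        // Equivalent of the manual INSERT that seeds the row.
        sequenceTable.put(pkColumnValue, initialValue);
    }

    /** Returns the next key, reserving a new block from the "table" when needed. */
    synchronized long nextId() {
        if (next == blockEnd) {
            // Roughly: read the row, then advance it by allocationSize.
            long start = sequenceTable.get(pkColumnValue);
            sequenceTable.put(pkColumnValue, start + allocationSize);
            next = start;
            blockEnd = start + allocationSize;
        }
        return next++;
    }

    /** Current SEQUENCE_VALUE stored for this generator's row. */
    long storedValue() {
        return sequenceTable.get(pkColumnValue);
    }
}
```

In this model, with an allocationSize of 10 the table row is touched only once for every ten keys handed out, which is the main reason block allocation is attractive.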

Default primary key generation strategy

The last option for key generation is to let the provider decide the best strategy for the underlying database by using the AUTO specification as follows:

@Id
@GeneratedValue(strategy=GenerationType.AUTO)
@Column(name="USER_ID")
protected Long userId;

This is a perfect match for automatic table creation because the underlying database objects required will be created by the JPA provider.


Note

You’d assume that if Oracle were the underlying database, the persistence provider probably would choose SEQUENCE as the strategy; if SQL Server were the underlying database, IDENTITY would likely be chosen on your behalf. However, this may not be true, and we recommend that you check the documentation of your persistence provider. For example, TopLink Essentials uses a table generator as the default autogenerator for all databases.


Keep in mind that generated values are intended for surrogate keys: the JPA specification only requires providers to support @GeneratedValue on simple primary keys, so applying it to other persistent fields or properties is provider-specific and not portable. Before we move on to discussing entity relations, we’ll tackle the most complicated case of mapping basic entity data next: mapping embeddable objects.


Standardization of key generation

Just as EJB 3 standardizes O/R mapping, it standardizes key generation as well. This is a substantial leap from EJB 2, where you had to resort to various sequence generator patterns to accomplish the same for CMP entity beans. Using sequences, identity constraints, or tables was a significant effort, a far cry from simple configuration, not to mention a nonportable approach.


8.2.8. Mapping embeddable classes

When discussing embeddable objects in chapter 7, we explained that an embeddable object acts primarily as a convenient data holder for entities and has no identity of its own. Rather, it shares the identity of the entity class it belongs to. Let’s now discuss how embeddable objects are mapped to the database (figure 8.3).

Figure 8.3. Embeddable objects act as convenient data holders for entities and have no identity of their own. Address is an embeddable object that is stored as a part of the User entity that is mapped to the USERS table.

In listing 8.4, we included the Address embedded object introduced in chapter 7 in the User entity as a data field. The relevant parts of listing 8.4 are repeated in listing 8.6 for easy reference.

Listing 8.6. Using embeddable objects

The User entity defines the address field as an embedded object. Notice that unlike the User entity, the embeddable Address object is missing an @Table annotation. This is because EJB 3 does not allow embedded objects to be mapped to a table different from the enclosing entity, at least not directly through the @Table annotation. Instead, the Address object’s data is stored in the USERS table that stores the enclosing entity. Therefore, the @Column mappings applied to the fields of the Address object really refer to the columns in the USERS table. For example, the street field is mapped to the USERS.STREET column, the zipCode field is mapped to the USERS.ZIP_CODE column, and so on. This also means that the Address data is stored in the same database row as the User data, and both objects share the USER_ID identity column. Other than this minor nuance, all data mapping annotations used in entities are available for you in embedded objects and behave in exactly the same manner. In general, this is the norm and embedded objects are often stored in the same table as the enclosing entity. However, if this does not suit you, it is possible to store the embeddable object’s data in a separate table and use the @SecondaryTable annotation to retrieve its data into the main entity using a join. We won’t detail this solution, as it is fairly easy to figure out and we’ll leave it for you to experiment with instead.
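The mapping just described can be sketched roughly as follows. This is a minimal illustration based only on the prose above, not necessarily the exact listing: it assumes the javax.persistence API is on the classpath, the city field and the CITY column name are assumptions, and each public class would live in its own source file with the same imports.

```java
import java.io.Serializable;
import javax.persistence.Column;
import javax.persistence.Embeddable;
import javax.persistence.Embedded;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

// File: User.java
@Entity
@Table(name = "USERS")
public class User implements Serializable {
    @Id
    @Column(name = "USER_ID")
    protected Long userId;

    // The Address columns are stored in the same USERS row as the User data.
    @Embedded
    protected Address address;
}

// File: Address.java
// No @Table and no @Id: the embeddable shares the table and identity of its owner.
@Embeddable
public class Address implements Serializable {
    @Column(name = "STREET")
    protected String street;

    @Column(name = "CITY")      // assumed column name
    protected String city;

    @Column(name = "STATE")
    protected String state;

    @Column(name = "ZIP_CODE")
    protected String zipCode;
}
```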


Sharing embeddable classes between entities

One of the most useful features of embeddable classes is that they can be shared between entities. For example, our Address object could be embedded inside a BillingInfo object to store billing addresses while still being used by the User entity. The important thing to keep in mind is that the embeddable class definition is shared in the OO world and not the actual data in the underlying relational tables. As we noted, the embedded data is created/populated using the table mapping of the entity class. However, this means that all the embeddable data must be mapped to both the USERS and BILLING_INFO tables. In other words, both tables must contain some mappable street, city, or zip columns.

An interesting wrinkle to consider is the fact that the same embedded data could be mapped to columns with different names in two separate tables. For example, the “state” data column in BILLING_INFO could be called STATE_CODE instead of STATE. Since the @Column annotation in Address maps to a column named STATE, how will this column be resolved? The answer is to override the column mapping in the enclosing entity using the @AttributeOverride annotation as follows:

@Embedded
@AttributeOverrides({
    @AttributeOverride(name="state",
        column=@Column(name="STATE_CODE"))})
protected Address address;

In effect, the @AttributeOverride annotation is telling the provider to resolve the embedded “state” field to the STATE_CODE column for the enclosing entity.


We have now finished looking at all the annotations required for mapping entities except for mapping OO inheritance. Let’s now move on to looking at EJB 3 features for mapping entity relations.

8.3. Mapping entity relationships

In the previous chapter we explored implementing domain relationships between entities. In section 8.1.1 we briefly discussed the problems translating relationships from the OO world to the database world. In this section, you’ll see how these solutions are actually implemented in EJB 3 using annotations, starting with one-to-one relationships. We’ll explore mapping of all types of relationships: one-to-one, one-to-many, many-to-one, and many-to-many.

8.3.1. Mapping one-to-one relationships

As you know, one-to-one relationships are mapped using primary/foreign key associations. It should be pretty obvious that a parent-child relationship usually exists between the entities of a one-to-one relationship. For example, in the User-BillingInfo relationship mentioned earlier, the User entity could be characterized as the parent. Depending on where the foreign key resides, the relationship could be implemented in two different ways: using the @JoinColumn or the @PrimaryKeyJoinColumn annotation.

Using @JoinColumn

If the table of the referencing entity contains the foreign key to the table of the referenced “child” entity, you can map the relationship using the @JoinColumn annotation (figure 8.4).

Figure 8.4. User has a one-to-one unidirectional relationship with BillingInfo. The User and BillingInfo entities are mapped to the USERS and BILLING_INFO tables, respectively, and the USERS table has a foreign key reference to the BILLING_INFO table. Such associations are mapped using @JoinColumn.

In our User-BillingInfo example shown in figure 8.4, the USERS table contains a foreign key named USER_BILLING_ID that refers to the BILLING_INFO table’s BILLING_ID primary key. This relationship would be mapped as shown in listing 8.7.

Listing 8.7. Mapping a one-to-one unidirectional relationship using @JoinColumn

The @JoinColumn annotation’s name element refers to the name of the foreign key in the USERS table. If this parameter is omitted, it is assumed to follow this form:

<relationship field/property name>_<name of referenced primary key column>

In our example, the foreign key name would be assumed to be BILLINGINFO_BILLING_ID in the USERS table. The referencedColumnName element specifies the name of the primary key or a unique key the foreign key refers to. If you don’t specify the referencedColumnName value, it is assumed to be the column containing the identity of the referenced entity. Incidentally, this would have been fine in our case as BILLING_ID is the primary key for the BILLING_INFO table.
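The defaulting rule is simple enough to express as a one-line function. The helper below is purely illustrative: JoinColumnDefaults and defaultJoinColumnName are hypothetical names, not part of JPA, and the uppercasing is an assumption made to match the column naming used in this chapter (actual case handling depends on the provider and the database).

```java
import java.util.Locale;

/** Illustrates the default @JoinColumn name rule; not a JPA API. */
class JoinColumnDefaults {
    /** Builds <relationship field/property name>_<name of referenced primary key column>. */
    static String defaultJoinColumnName(String relationshipField, String referencedPkColumn) {
        // Uppercased only to mirror this chapter's naming style (an assumption).
        return (relationshipField + "_" + referencedPkColumn).toUpperCase(Locale.ROOT);
    }
}
```

For the billingInfo field referring to the BILLING_ID primary key, this yields the BILLINGINFO_BILLING_ID name mentioned above.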

Like the @Column annotation, the @JoinColumn annotation contains the updatable, insertable, table, and unique elements. The elements serve the same purpose as the elements of the @Column annotation. In our case, updatable is set to false, which means that the persistence provider would not update the foreign key even if the billingInfo reference were changed. If you have more than one column in the foreign key, you can use the @JoinColumns annotation instead. We won’t cover this annotation since this situation is very unlikely, if not bad design.
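Putting the elements discussed so far together, a unidirectional one-to-one mapping with @JoinColumn might look like the sketch below. It is reconstructed from the description in the text rather than copied from the listing, it assumes the javax.persistence API, the billingId field name is an assumption, and each public class would go in its own source file.

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.JoinColumn;
import javax.persistence.OneToOne;
import javax.persistence.Table;

// File: User.java
@Entity
@Table(name = "USERS")
public class User {
    @Id
    @Column(name = "USER_ID")
    protected Long userId;

    // USERS.USER_BILLING_ID is a foreign key to BILLING_INFO.BILLING_ID.
    @OneToOne
    @JoinColumn(name = "USER_BILLING_ID",
                referencedColumnName = "BILLING_ID",
                updatable = false)
    protected BillingInfo billingInfo;
}

// File: BillingInfo.java
@Entity
@Table(name = "BILLING_INFO")
public class BillingInfo {
    @Id
    @Column(name = "BILLING_ID")
    protected Long billingId;   // assumed field name
}
```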

If you have a bidirectional one-to-one relationship, then the entity in the inverse side of the relationship will contain the mappedBy element, as we discussed in chapter 7. If the User and BillingInfo entities have a bidirectional relationship, we must modify the BillingInfo entity to have the one-to-one relationship definition pointing to the User entity as follows:

@Entity
public class BillingInfo {
    @OneToOne(mappedBy="billingInfo")
    protected User user;
    ...
}

As you can see, the mappedBy element identifies the name of the association field in the owning side of the relationship. In a bidirectional relationship, the owning side is the entity that stores the relationship in its underlying table. In our example, the USERS table stores the relationship in the USER_BILLING_ID field and thus is the relationship owner. The one-to-one relationship in BillingInfo has the mappedBy element specified as billingInfo, which is the relationship field defined in the User entity that contains the definition for @JoinColumn.

Note that you do not have to define @JoinColumn in the entities of both sides of one-to-one relationships.

Next we’ll discuss how you define the one-to-one relationship when the foreign key is in the table to which the child entity is mapped.

Using @PrimaryKeyJoinColumn

In the more likely case that the foreign key reference exists in the table to which the referenced entity is mapped, the @PrimaryKeyJoinColumn would be used instead (figure 8.5).

Figure 8.5. User has a one-to-one unidirectional relationship with BillingInfo. The User and BillingInfo entities are mapped to the USERS and BILLING_INFO tables, respectively, and the BILLING_INFO and USERS tables share the same primary key; the primary key of the BILLING_INFO table is also a foreign key referencing the primary key of the USERS table. Such associations are mapped using @PrimaryKeyJoinColumn.

Typically, @PrimaryKeyJoinColumn is used in one-to-one relationships when both the referenced and referencing tables share the primary key of the referencing table. In our example, as shown in figure 8.5, the BILLING_INFO table would contain a foreign key reference named BILLING_USER_ID pointing to the USER_ID primary key of the USERS table. In addition, BILLING_USER_ID would be the primary key of the BILLING_INFO table. The relationship would be implemented as shown in listing 8.8.

Listing 8.8. Mapping a one-to-one relationship using @PrimaryKeyJoinColumn

The @PrimaryKeyJoinColumn annotation’s name element refers to the primary key column of the table storing the current entity. On the other hand, the referencedColumnName element refers to the foreign key in the table holding the referenced entity. In our case, the foreign key is the BILLING_INFO table’s BILLING_USER_ID column, and it points to the USERS.USER_ID primary key. If the names of both the primary key and foreign key columns are the same, you may omit the referencedColumnName element since this is what the JPA provider will assume by default. In our example, if we rename the foreign key in the BILLING_INFO table from BILLING_USER_ID to USER_ID to match the name of the primary key in the USERS table, we may omit the referencedColumnName value so that the provider can default it correctly.
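A shared-primary-key mapping along the lines described here could be sketched as follows. As before, this is an illustration derived from the prose, assuming the javax.persistence API; the userId field name in BillingInfo is an assumption, and each public class belongs in its own file.

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.OneToOne;
import javax.persistence.PrimaryKeyJoinColumn;
import javax.persistence.Table;

// File: User.java
@Entity
@Table(name = "USERS")
public class User {
    @Id
    @Column(name = "USER_ID")
    protected Long userId;

    // name: primary key column of USERS;
    // referencedColumnName: the key column in BILLING_INFO that points back to it.
    @OneToOne
    @PrimaryKeyJoinColumn(name = "USER_ID",
                          referencedColumnName = "BILLING_USER_ID")
    protected BillingInfo billingInfo;
}

// File: BillingInfo.java
@Entity
@Table(name = "BILLING_INFO")
public class BillingInfo {
    // Both the primary key of BILLING_INFO and a foreign key to USERS.USER_ID.
    @Id
    @Column(name = "BILLING_USER_ID")
    protected Long userId;   // assumed field name
}
```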

If you have a composite primary key in the parent table (which is rare if you are using surrogate keys), you should use the @PrimaryKeyJoinColumns annotation instead. We encourage you to explore this annotation on your own.

You’ll learn how to map one-to-many and many-to-one relationships next.

8.3.2. One-to-many and many-to-one

As we mentioned in the previous chapter, one-to-many and many-to-one relationships are the most common in enterprise systems and are implemented using the @OneToMany and @ManyToOne annotations. For example, the Item-Bid relationship in the ActionBazaar system is one-to-many, since an Item holds references to a collection of Bids placed on it and a Bid holds a reference to the Item it was placed on. The beauty of EJB 3 persistence mapping is that the same two annotations we used for mapping one-to-one relationships are also used for one-to-many relationships. This is because both relation types are implemented as a primary-key/foreign-key association in the underlying database. Let’s see how to do this by implementing the Item-Bid relationship shown in listing 8.9.

Listing 8.9. One-to-many bidirectional relationship mapping

Since multiple instances of BIDS records would refer to the same record in the ITEMS table, the BIDS table will hold a foreign key reference to the primary key of the ITEMS table. In our example, this foreign key is BID_ITEM_ID, and it refers to the ITEM_ID primary key of the ITEMS table. This database relationship between the tables is shown in figure 8.6. In listing 8.9 the many-to-one relationship is expressed in ORM using the @JoinColumn annotation. In effect, a @JoinColumn annotation’s job is to specify a primary/foreign key relationship in the underlying data model. Note that the exact @JoinColumn specification could have been repeated for both the Bid.item and Item.bids persistent fields on either side of the relationship. On the @ManyToOne side, the name element of @JoinColumn specifies the foreign key, BID_ITEM_ID, and the referencedColumnName element specifies the primary key, ITEM_ID. From the Item entity’s perspective, this means the persistence provider would figure out what Bid instances to put in the bids set by retrieving the matching BID_ITEM_ID in the BIDS table.

Figure 8.6. The one-to-many relationship between the Item and Bid entities is formed through a foreign key in the BIDS table referring to the primary key of the ITEMS table.

After performing the join, the JPA provider will see what BIDS records are retrieved by the join and populate them into the bids set. Similarly, when it is time to form the item reference in the Bid entity, the persistence provider would populate the @JoinColumn-defined join with the available BID_ITEM_ID foreign key, retrieve the matched ITEMS record, and put it into the item field.

Instead of repeating the same @JoinColumn annotation, we have used the mappedBy element we mentioned but did not describe in chapter 7.


Note

The persistence provider will generate deployment-time errors if you specify @JoinColumn on both sides of a bidirectional one-to-many relationship.


You can specify this element in the @OneToMany annotation on the Item.bids field as follows:

public class Item {
    ...
    @OneToMany(mappedBy="item")
    protected Set<Bid> bids;
    ...
}

The mappedBy element is essentially pointing to the previous relationship field Bid.item with the @JoinColumn definition. In a bidirectional one-to-many relationship, the owner of the relationship is the entity side that stores the foreign key—that is, the many side of the relationship.
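For reference, both sides of the bidirectional Item-Bid mapping can be sketched together as shown below. This is an illustration assembled from the description in this section, assuming the javax.persistence API; the bidId field and the BID_ID column are assumptions, and each public class would be placed in its own file.

```java
import java.util.Set;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.JoinColumn;
import javax.persistence.ManyToOne;
import javax.persistence.OneToMany;
import javax.persistence.Table;

// File: Item.java
@Entity
@Table(name = "ITEMS")
public class Item {
    @Id
    @Column(name = "ITEM_ID")
    protected Long itemId;

    // Inverse side: points at the owning field Bid.item.
    @OneToMany(mappedBy = "item")
    protected Set<Bid> bids;
}

// File: Bid.java
@Entity
@Table(name = "BIDS")
public class Bid {
    @Id
    @Column(name = "BID_ID")   // assumed column name
    protected Long bidId;

    // Owning side: BIDS.BID_ITEM_ID is a foreign key to ITEMS.ITEM_ID.
    @ManyToOne
    @JoinColumn(name = "BID_ITEM_ID", referencedColumnName = "ITEM_ID")
    protected Item item;
}
```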

The persistence provider will know to look it up appropriately when resolving Bid entities. In general, you have to use the mappedBy element wherever a bidirectional relationship is required. If you do not specify the mappedBy element with the @OneToMany annotation, the persistence provider will treat the relationship as a unidirectional relationship. Obviously, mappedBy cannot be used if the one-to-many relationship is unidirectional, since there would be no owning-side field to refer to.

Unfortunately, JPA does not support unidirectional one-to-many relationships using a foreign key on the target table and you cannot use the following mapping if you have a unidirectional one-to-many relationship between Item and Bid:

@OneToMany(cascade=CascadeType.ALL)
@JoinColumn(name="ITEM_ID", referencedColumnName="BID_ITEM_ID")
protected Set<Bid> bids;

Although many persistence providers will support this mapping, this support is not standardized and you have to use a join or intersection table using the @JoinTable annotation, similar to the many-to-many relationship that we discuss in section 8.3.3. Unidirectional one-to-many relationships are scarce, and we’ll leave this for you to explore on your own. We recommend that you convert your relationship to bidirectional, thereby avoiding the complexities involved in maintaining another table.

Also, the @ManyToOne annotation does not support the mappedBy element since it is always on the side of the relationship that holds the foreign key, and the inverse side can never be the relationship owner.

A final point to remember is that a foreign key can refer back to the primary key of the same table it resides in. There is nothing stopping a @JoinColumn annotation from specifying such a relationship. For example, the many-to-one relationship between subcategories and parent categories could be expressed as shown in listing 8.10.

Listing 8.10. Many-to-one self-referencing relationship mapping

In listing 8.10, the Category entity refers to its parent through the PARENT_ID foreign key pointing to the primary key value of another record in the CATEGORY table. Since multiple subcategories can exist under a single parent, the @JoinColumn annotation specifies a many-to-one relationship.
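A self-referencing mapping of this kind might be sketched as follows. The sketch assumes the javax.persistence API; the parentCategory field name and the CATEGORY_ID primary key column are assumptions, since only PARENT_ID and the CATEGORY table are named in the text.

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.JoinColumn;
import javax.persistence.ManyToOne;
import javax.persistence.Table;

@Entity
@Table(name = "CATEGORY")
public class Category {
    @Id
    @Column(name = "CATEGORY_ID")   // assumed primary key column
    protected Long categoryId;

    // CATEGORY.PARENT_ID points at the primary key of another CATEGORY row.
    @ManyToOne
    @JoinColumn(name = "PARENT_ID", referencedColumnName = "CATEGORY_ID")
    protected Category parentCategory;   // assumed field name
}
```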

We’ll conclude this section on mapping domain relationships by dealing with the most complex relationship mapping: many-to-many.

8.3.3. Many-to-many

As we mentioned in section 8.1.1, a many-to-many relationship in the database world is implemented by breaking it down into two one-to-many relationships stored in an association or join table. In other words, an association or join table allows you to indirectly match up primary keys from either side of the relationship by storing arbitrary pairs of foreign keys in a row. This scheme is shown in figure 8.7.

Figure 8.7. Many-to-many relationships are modeled in the database world using join tables. A join table essentially pairs foreign keys pointing to primary keys on either side of the relationship.

The @JoinTable mapping in EJB 3 models this technique. To see how this is done, take a look at the code in listing 8.11 for mapping the many-to-many relationship between the Item and Category entities. Recall that a Category can contain multiple Items and an Item can belong to multiple Category entities.

Listing 8.11. Many-to-many relationship mapping

The @JoinTable annotation’s name element specifies the association or join table, which is named CATEGORIES_ITEMS in our case. The CATEGORIES_ITEMS table contains only two columns: CI_CATEGORY_ID and CI_ITEM_ID. The CI_CATEGORY_ID column is a foreign key reference to the primary key of the CATEGORIES table, while the CI_ITEM_ID column is a foreign key reference to the primary key of the ITEMS table. The joinColumns and inverseJoinColumns elements indicate this. Each of the two elements describes a join condition on either side of the many-to-many relationship. The joinColumns element describes the “owning” relationship between the Category and Item entities, and the inverseJoinColumns element describes the “subordinate” relationship between them. Note the distinction of the owning side of the relationship is purely arbitrary.

Just as we used the mappedBy element to reduce redundant mapping for one-to-many relationships, we are using the mappedBy element on the Item.categories field to point to the @JoinTable definition in Category.items. We can specify more than one join column with the @JoinColumns annotation if we have more than one column that constitutes the foreign key (again, this is an unlikely situation that should be avoided in a clean design). From the perspective of the Category entity, the persistence provider will determine what Item entities go in the items collection by setting the available CATEGORY_ID primary key against the combined joins defined in the @JoinTable annotation, figuring out what CI_ITEM_ID foreign keys match, and retrieving the matching records from the ITEMS table.
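The many-to-many mapping described in this section can be sketched as shown below. It is an illustration built from the table and column names given in the text, assuming the javax.persistence API; the id field names are assumptions, and each public class would live in its own source file.

```java
import java.util.Set;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.JoinColumn;
import javax.persistence.JoinTable;
import javax.persistence.ManyToMany;
import javax.persistence.Table;

// File: Category.java
@Entity
@Table(name = "CATEGORIES")
public class Category {
    @Id
    @Column(name = "CATEGORY_ID")
    protected Long categoryId;

    // Owning side: defines the join table and both foreign keys in it.
    @ManyToMany
    @JoinTable(name = "CATEGORIES_ITEMS",
        joinColumns = @JoinColumn(name = "CI_CATEGORY_ID",
                                  referencedColumnName = "CATEGORY_ID"),
        inverseJoinColumns = @JoinColumn(name = "CI_ITEM_ID",
                                         referencedColumnName = "ITEM_ID"))
    protected Set<Item> items;
}

// File: Item.java
@Entity
@Table(name = "ITEMS")
public class Item {
    @Id
    @Column(name = "ITEM_ID")
    protected Long itemId;

    // Inverse side: reuses the @JoinTable definition on Category.items.
    @ManyToMany(mappedBy = "items")
    protected Set<Category> categories;
}
```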

The flow of logic is essentially reversed for populating Item.categories. While saving the relationship into the database, the persistence provider might need to update all three of the ITEMS, CATEGORIES, and CATEGORIES_ITEMS tables as necessary. For a typical change in relational data, the ITEMS and CATEGORIES tables will remain unchanged while the foreign key references in the CATEGORIES_ITEMS table will change. This might be the case if we move an item from one category to the other, for example. Because of the inherent complexity of many-to-many mappings, the mappedBy element of the @ManyToMany annotation shines in terms of reducing redundancy.

If you have a unidirectional many-to-many relationship, then the only difference is that the target entity has no relationship field pointing back, and so there is no mappedBy element.

We have now finished discussing entity relationship mapping and will tackle mapping OO inheritance next before concluding the chapter.

8.4. Mapping inheritance

In section 8.1.1 we mentioned the difficulties in mapping OO inheritance into relational databases. We also alluded to the three strategies used to solve this problem: putting all classes in the OO hierarchy in the same table, using joined tables for the super- and subclasses, or using completely separate tables for each class. We’ll explore how each strategy is actually implemented in this section.

Recall from section 8.2.7 that we could easily utilize different strategies for generating sequences more or less by changing configuration parameters for the @GeneratedValue annotation. The @Inheritance annotation used to map OO inheritance tries to follow the same philosophy of isolating strategy-specific settings into the configuration. We’ll explore inheritance mapping using the three strategies offered through the @Inheritance annotation by implementing a familiar example.

As we mentioned earlier, the ActionBazaar system has several user types, including sellers and bidders. We have also introduced the idea of creating a User superclass common to all user types. In this scheme of things, the User entity would hold data and behavior common to all users, while subclasses like Bidder and Seller would hold data and behavior specific to user types. Figure 8.8 shows a simplified class diagram for this OO hierarchy.

Figure 8.8. The ActionBazaar user hierarchy. Each user type like Bidder and Seller inherits from the common User superclass. The empty arrow signifies there may be some other subclasses of the User class.

In this section you’ll learn how all the entities in the hierarchy in figure 8.8 can be mapped to database tables using different types of inheritance mapping strategies supported by JPA:

  • Single table
  • Joined tables
  • Table per class

You’ll also learn about polymorphic relationships.

8.4.1. Single-table strategy

In the single-table strategy, all classes in the inheritance hierarchy are mapped to a single table. This means that the single table will contain a superset of all data stored in the class hierarchy. Different objects in the OO hierarchy are identified using a special column called a discriminator column. In effect, the discriminator column contains a value unique to the object type in a given row. The best way to understand this scheme is to see it implemented. For the ActionBazaar schema, assume that all user types, including Bidders and Sellers, are mapped into the USERS table. Figure 8.9 shows how the table might look.

Figure 8.9. Storing all ActionBazaar user types using a single table

As figure 8.9 depicts, the USERS table contains data common to all users (such as USER_ID and USERNAME), Bidder-specific data (such as BID_FREQUENCY) and Seller-specific data (such as CREDIT_WORTH). Records 1 and 2 contain Bidder records while record 3 contains a Seller record. This is indicated by the B and S values in the USER_TYPE column. As you can imagine, the USER_TYPE discriminator column can contain values corresponding to other user types, such as A for admin or C for CSR. The persistence provider maps each user type to the table by storing persistent data into relevant mapped columns, setting the USER_TYPE value correctly and leaving the rest of the values NULL.
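The role of the discriminator column can be illustrated with a small plain-Java model of how one row of such a table could be turned back into an object of the right type. This is not how any particular provider is implemented: SingleTableSketch, fromRow, the simplified User/Bidder/Seller classes, and the use of Double for BID_FREQUENCY and CREDIT_WORTH are all assumptions made for the sake of the example.

```java
import java.util.Map;

/** Toy model of reading a single-table row by its discriminator; not a JPA API. */
class SingleTableSketch {
    static class User { String username; }
    static class Bidder extends User { Double bidFrequency; }
    static class Seller extends User { Double creditWorth; }

    /** Builds the subclass indicated by USER_TYPE; unmapped columns are simply ignored. */
    static User fromRow(Map<String, Object> row) {
        String type = (String) row.get("USER_TYPE");
        User user;
        if ("B".equals(type)) {
            Bidder bidder = new Bidder();
            bidder.bidFrequency = (Double) row.get("BID_FREQUENCY");
            user = bidder;
        } else if ("S".equals(type)) {
            Seller seller = new Seller();
            seller.creditWorth = (Double) row.get("CREDIT_WORTH");
            user = seller;
        } else {
            throw new IllegalArgumentException("Unknown USER_TYPE: " + type);
        }
        // Columns common to all user types are read regardless of the subtype.
        user.username = (String) row.get("USERNAME");
        return user;
    }
}
```

A Seller row carries a NULL in BID_FREQUENCY, and the model never reads it, which mirrors how the unused columns stay NULL in the table.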

The next step to understanding the single-table strategy is to analyze the actual mapping implementation. Listing 8.12 shows the mapping for the User, Bidder, and Seller entities.

Listing 8.12. Inheritance mapping using a single table

The inheritance strategy and discriminator column have to be specified on the root entity of the OO hierarchy. In listing 8.12, we specify the strategy to be InheritanceType.SINGLE_TABLE in the @Inheritance annotation on the User entity. The @Table annotation on the User entity specifies the name of the single table used for inheritance mapping, USERS. The @DiscriminatorColumn annotation specifies the details of the discriminator column. The name element specifies the name of the discriminator, USER_TYPE. The discriminatorType element specifies the data type of the discriminator column, which is String, and the length element specifies the size of the column, 1. Both subclasses of User, Bidder and Seller, specify a discriminator value using the @DiscriminatorValue annotation. The Seller class specifies its discriminator value to be S. This means that when the persistence provider saves a Seller object into the USERS table, it will set the value of the USER_TYPE column to S. Similarly, Seller entities would only be reconstituted from rows with a discriminator value of S. Likewise, the Bidder subclass specifies its discriminator value to be B. Every subclass of User is expected to specify an appropriate discriminator value that doesn’t conflict with other subclasses. If you don’t specify a discriminator value for a subclass, the value is assumed to be the name of the subclass (such as Seller for the Seller entity).
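A single-table mapping with these settings might be sketched as follows. The sketch is derived from the description above and assumes the javax.persistence API; the field names and the Double types chosen for BID_FREQUENCY and CREDIT_WORTH are assumptions, and each public class would go in its own file.

```java
import javax.persistence.Column;
import javax.persistence.DiscriminatorColumn;
import javax.persistence.DiscriminatorType;
import javax.persistence.DiscriminatorValue;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Inheritance;
import javax.persistence.InheritanceType;
import javax.persistence.Table;

// File: User.java (root of the hierarchy: strategy and discriminator go here)
@Entity
@Table(name = "USERS")
@Inheritance(strategy = InheritanceType.SINGLE_TABLE)
@DiscriminatorColumn(name = "USER_TYPE",
                     discriminatorType = DiscriminatorType.STRING,
                     length = 1)
public class User {
    @Id
    @Column(name = "USER_ID")
    protected Long userId;
}

// File: Seller.java (rows with USER_TYPE = 'S')
@Entity
@DiscriminatorValue("S")
public class Seller extends User {
    @Column(name = "CREDIT_WORTH")
    protected Double creditWorth;    // assumed type
}

// File: Bidder.java (rows with USER_TYPE = 'B')
@Entity
@DiscriminatorValue("B")
public class Bidder extends User {
    @Column(name = "BID_FREQUENCY")
    protected Double bidFrequency;   // assumed type
}
```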

Single table is the default inheritance strategy for EJB 3. Although this strategy is simple to use, it has one great disadvantage that might be apparent from figure 8.9. It does not fully utilize the power of the relational database in terms of using primary/foreign keys and results in a large number of NULL column values.

To understand why, consider the Seller record (number 3) in figure 8.9. The BID_FREQUENCY value is set to NULL for this record since it is not a Bidder record and the Seller entity does not map this column. Conversely, none of the Bidder records ever populate the CREDIT_WORTH column. It is not that hard to imagine the quantity of redundant NULL-valued columns in the USERS table if there is a significant number of users (such as a few thousand).


Note

For the very same reason, the strategy also limits the ability to enforce data integrity constraints. For example, if you want to enforce a column constraint that BID_FREQUENCY cannot be NULL for a Bidder, you would not be able to do so with a simple NOT NULL constraint, since the same column also appears in Seller records, for which the value must be NULL. Typically, such constraints are enforced through alternative mechanisms such as database triggers for the single-table strategy.


The second inheritance strategy avoids the pitfalls we’ve described and fully utilizes database relationships.

8.4.2. Joined-tables strategy

The joined-tables inheritance strategy uses one-to-one relationships to model OO inheritance. In effect, the joined-tables strategy involves creating separate tables for each entity in the OO hierarchy and relating direct descendants in the hierarchy with one-to-one relationships. For a better grasp, take a look at figure 8.10, which shows how the data might look using this strategy.

Figure 8.10. Modeling inheritance using joined tables. Each entity in the OO hierarchy corresponds to a separate table and parent-child relationships are modeled using one-to-one mapping.

In the joined-tables strategy, the parent of the hierarchy contains only columns common to its children. In our example, the USERS table contains columns common to all ActionBazaar user types (such as USERNAME). The child table in the hierarchy contains columns specific to the entity subtype. Here, both the BIDDERS and SELLERS tables contain columns specific to the Bidder and Seller entities, respectively (for example, the SELLERS table contains the CREDIT_WORTH column). The parent-child OO hierarchy chain is implemented using one-to-one relationships. For example, the USERS and SELLERS tables are related through the USER_ID foreign key in the SELLERS table pointing to the primary key of the USERS table. A similar relationship exists between the BIDDERS and USERS tables. The discriminator column in the USERS table is still used, primarily as a way of easily differentiating data types in the hierarchy. Listing 8.13 shows how the mapping strategy is implemented in EJB 3.

Listing 8.13. Inheritance mapping using joined tables

Listing 8.13 uses the @DiscriminatorColumn and @DiscriminatorValue annotations in exactly the same way as the single-table strategy. The @Inheritance annotation’s strategy element is specified as JOINED, however. In addition, the one-to-one relationships between parent and child tables are implemented through the @PrimaryKeyJoinColumn annotations in both the Seller and Bidder entities. In both cases, the name element specifies the USER_ID foreign key. The joined-tables strategy is probably the best mapping choice from a design perspective. From a performance perspective, it is worse than the single-table strategy because it requires the joining of multiple tables for polymorphic queries.
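A rough sketch of the mapping that listing 8.13 describes is shown below. As before, this is a mapping fragment under assumptions: the field names and Java types are illustrative, each class would live in its own source file, and the JPA API must be on the classpath.

```java
import javax.persistence.Column;
import javax.persistence.DiscriminatorColumn;
import javax.persistence.DiscriminatorType;
import javax.persistence.DiscriminatorValue;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Inheritance;
import javax.persistence.InheritanceType;
import javax.persistence.PrimaryKeyJoinColumn;
import javax.persistence.Table;

// --- User.java ---
// Parent table holds only the columns common to all user types.
@Entity
@Table(name = "USERS")
@Inheritance(strategy = InheritanceType.JOINED)
@DiscriminatorColumn(name = "USER_TYPE",
        discriminatorType = DiscriminatorType.STRING, length = 1)
public abstract class User {
    @Id
    @Column(name = "USER_ID")
    protected Long userId;        // assumed field name and type

    @Column(name = "USERNAME")
    protected String username;    // assumed field name and type
}

// --- Seller.java ---
// SELLERS.USER_ID is a foreign key pointing to the primary key of USERS.
@Entity
@Table(name = "SELLERS")
@DiscriminatorValue("S")
@PrimaryKeyJoinColumn(name = "USER_ID")
public class Seller extends User {
    @Column(name = "CREDIT_WORTH")
    protected Double creditWorth; // assumed Java type
}

// --- Bidder.java ---
// BIDDERS.USER_ID is a foreign key pointing to the primary key of USERS.
@Entity
@Table(name = "BIDDERS")
@DiscriminatorValue("B")
@PrimaryKeyJoinColumn(name = "USER_ID")
public class Bidder extends User {
    @Column(name = "BID_FREQUENCY")
    protected Double bidFrequency; // assumed Java type
}
```

Compared with the single-table mapping, the only differences are the JOINED strategy on the root entity and the @Table and @PrimaryKeyJoinColumn annotations on each subclass.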

8.4.3. Table-per-class strategy

Table-per-class is probably the simplest inheritance strategy for a layperson to understand. However, this inheritance strategy is the worst from both a relational and OO standpoint. In this strategy, both the superclass (concrete class) and subclasses are stored in their own table and no relationship exists between any of the tables. To see how this works, take a look at the tables in figure 8.11.

Figure 8.11. The table-per-class inheritance strategy. Super- and subclasses are stored in their own, entirely unrelated tables.

As figure 8.11 shows, entity data are stored in their own tables even if they are inherited from the superclass. This is true even for the USER_ID primary key. As a result, primary keys in all tables must be mutually exclusive for this scheme to work. In addition, inherited columns are duplicated across tables, such as the USERNAME column. Using this inheritance strategy, we define the strategy in the superclass and map tables for all the classes. Listing 8.14 shows how the code might look.

Listing 8.14. Inheritance mapping using the table-per-class strategy

As you can see, the inheritance strategy is specified in the superclass entity, User. However, all of the concrete entities in the OO hierarchy are mapped to separate tables and each has a @Table specification. The greatest disadvantage of using this mapping type is that it does not have good support for polymorphic relations or queries because each subclass is mapped to its own table.
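The mapping that listing 8.14 describes might look roughly like the following fragment. The field names and Java types are assumptions, User is shown as a concrete class because the text describes it as one for this strategy, each class would live in its own source file, and the JPA API must be on the classpath. Keep in mind that a provider is not required to support this strategy.

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Inheritance;
import javax.persistence.InheritanceType;
import javax.persistence.Table;

// --- User.java ---
// The strategy is declared once, on the superclass.
@Entity
@Table(name = "USERS")
@Inheritance(strategy = InheritanceType.TABLE_PER_CLASS)
public class User {
    @Id
    @Column(name = "USER_ID")
    protected Long userId;        // assumed field name and type

    @Column(name = "USERNAME")
    protected String username;    // assumed field name and type
}

// --- Seller.java ---
// SELLERS repeats the inherited USER_ID and USERNAME columns.
@Entity
@Table(name = "SELLERS")
public class Seller extends User {
    @Column(name = "CREDIT_WORTH")
    protected Double creditWorth; // assumed Java type
}

// --- Bidder.java ---
// BIDDERS repeats the inherited USER_ID and USERNAME columns.
@Entity
@Table(name = "BIDDERS")
public class Bidder extends User {
    @Column(name = "BID_FREQUENCY")
    protected Double bidFrequency; // assumed Java type
}
```

There is no discriminator column and no join column here; the tables are unrelated, which is why primary key values must not overlap across them.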

The limitation is that when you want to retrieve entities polymorphically, the persistence provider must either use a SQL UNION or issue a separate SQL statement for each subclass in the hierarchy.

Besides being awkward, this strategy is the hardest for an EJB 3 provider to implement reliably. As a result, implementing this strategy has been made optional for the provider by the EJB 3 specification. We recommend that you avoid this strategy altogether.

This completes our analysis of the three strategies for mapping OO inheritance. Choosing the right strategy is not as straightforward as you might think. Table 8.2 compares each strategy.

Table 8.2. The EJB 3 JPA supports three different inheritance strategies. The table-per-class choice is optional and the worst of these strategies.

| Feature | Single Table | Joined Tables | Table per Class |
| --- | --- | --- | --- |
| Table support | One table for all classes in the entity hierarchy. Mandatory columns may be nullable. The table grows when more subclasses are added. | One table for the parent class, and each subclass has a separate table to store polymorphic properties. Mapped tables are normalized. | One table for each concrete class in the entity hierarchy |
| Uses discriminator column? | Yes | Yes | No |
| SQL generated for retrieval of entity hierarchy | Simple SELECT | SELECT clause joining multiple tables | One SELECT for each subclass or UNION of SELECT |
| SQL for insert and update | Single INSERT or UPDATE for all entities in the hierarchy | Multiple INSERT, UPDATE: one for the root class and one for each involved subclass | One INSERT or UPDATE for every subclass |
| Polymorphic relationship | Good | Good | Poor |
| Polymorphic queries | Good | Good | Poor |
| Support in EJB 3 JPA | Required | Required | Optional |

The single-table strategy is relatively simple and is fairly performance friendly since it avoids joins under the hood. Even inserts and updates in the single-table strategy perform better when compared to the joined-tables strategy. This is because in the joined-tables strategy, both the parent and child tables need to be modified for a given entity subclass. However, as we mentioned, the single-table strategy results in a large number of NULL-valued columns. Moreover, adding a new subclass essentially means updating the unified table each time.

In the joined-tables strategy, adding a subclass means adding a new child table as opposed to altering the parent table (which may or may not be easier to do). By and large, we recommend the joined-tables strategy since it best utilizes the strengths of the relational database, including the ability to use normalization techniques to avoid redundancy. In our opinion, the performance cost associated with joins is relatively insignificant, especially with surrogate keys and proper database indexing.

The table-per-class strategy is probably the worst choice of the three. It is relatively counterintuitive, uses almost no relational database features, and practically magnifies the object-relational impedance mismatch instead of attempting to bridge it. The most important reason to avoid this strategy, however, is that EJB 3 makes it optional for a provider to implement it. As a result, choosing this strategy might make your solution nonportable across implementations.

Besides these inheritance strategies, the EJB 3 JPA allows an entity to inherit from a nonentity class. Such a class is annotated with @MappedSuperclass. Like an embeddable object, a mapped superclass does not have an associated table.
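As a brief illustration of a mapped superclass, consider the following mapping fragment. All class, table, and column names here are hypothetical and are not taken from the ActionBazaar model; each class would live in its own source file, and the JPA API must be on the classpath.

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.MappedSuperclass;
import javax.persistence.Table;

// --- Auditable.java (hypothetical nonentity base class) ---
// Not an entity and has no table of its own; its mappings are inherited.
@MappedSuperclass
public abstract class Auditable {
    @Column(name = "CREATED_BY")
    protected String createdBy;
}

// --- Category.java (hypothetical entity) ---
// The inherited CREATED_BY column is stored in the CATEGORIES table.
@Entity
@Table(name = "CATEGORIES")
public class Category extends Auditable {
    @Id
    @Column(name = "CATEGORY_ID")
    protected Long categoryId;
}
```

Because Auditable is not an entity, it cannot be queried or used as the target of a relationship; only its mapped state is passed on to the entities that extend it.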


Eclipse ORM tool

As we mentioned earlier, EJB 3 makes O/R mapping a lot easier, but not quite painless. This is largely because of the inherent complexity of ORM and the large number of possible combinations to handle. The good news is that a project to create an Eclipse-based EJB 3 mapping tool code-named Dali is under way (www.eclipse.org/dali/). The project is led by Oracle and supported by JBoss and BEA. It will be a part of the Eclipse Web Tools Platform (WTP) project and will support creating and editing EJB 3 O/R mappings using either annotations or XML. Also, two commercial products, Oracle JDeveloper and BEA Workshop, support development of EJB 3 applications.


8.4.4. Mapping polymorphic relationships

In chapter 7 we explained that JPA fully supports inheritance and polymorphism. Now that we have completed our examination of mapping relationships and inheritance, you are probably anxious to learn about polymorphic associations. A relationship between two entities is said to be polymorphic when the actual relationship may refer to instances of a subclass of the associated entity. Assume that there is a bidirectional many-to-one relationship between the ContactInfo and User entities. We discussed earlier that User is an abstract entity and entities such as Bidder, Seller, and so forth inherit from the User class. When we retrieve the relationship field from the ContactInfo entity, the retrieved instance of the User entity will be an instance of its subclass, either Bidder or Seller. The greatest benefit of JPA is that you don’t have to do any extra work to map polymorphic associations; you just define the relationship mapping between the superclass and the associated entity and the association becomes polymorphic.
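The ContactInfo and User relationship described above could be mapped along the following lines. This fragment is a sketch under assumptions: the table name, column names, and field names are hypothetical, the joined-tables strategy is chosen arbitrarily, each class would live in its own source file, and the JPA API must be on the classpath.

```java
import java.util.Set;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Inheritance;
import javax.persistence.InheritanceType;
import javax.persistence.JoinColumn;
import javax.persistence.ManyToOne;
import javax.persistence.OneToMany;
import javax.persistence.Table;

// --- ContactInfo.java ---
@Entity
@Table(name = "CONTACT_INFO")         // hypothetical table name
public class ContactInfo {
    @Id
    @Column(name = "CONTACT_ID")      // hypothetical column name
    protected Long contactId;

    // Declared against the superclass; at runtime the provider returns
    // the actual subclass instance (a Bidder or a Seller).
    @ManyToOne
    @JoinColumn(name = "USER_ID")
    protected User user;
}

// --- User.java (only the parts relevant to the relationship) ---
@Entity
@Table(name = "USERS")
@Inheritance(strategy = InheritanceType.JOINED)
public abstract class User {
    @Id
    @Column(name = "USER_ID")
    protected Long userId;

    // Inverse side of the bidirectional relationship.
    @OneToMany(mappedBy = "user")
    protected Set<ContactInfo> contactInfos;
}
```

Nothing in the relationship mapping mentions Bidder or Seller. Code that reads the user field can test the result with instanceof if it needs subclass-specific behavior.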

8.5. Summary

In this chapter, we explored basic database concepts and introduced the object-relational impedance problem. We reviewed ORM annotations such as @Table and @Column, and mapped some of the entities into database tables.

We also reviewed the various types of primary key generation strategies as well as the mapping of composite primary keys.

In addition, we examined the different types of relationships introduced in the previous chapter and showed you how to use JPA annotations such as @JoinColumn and @PrimaryKeyJoinColumn to map those into database tables. You learned that many-to-many and unidirectional one-to-many relationships require association tables. We hope that the limitation to support unidirectional relationships using a target foreign key constraint will be addressed in a future release.

We showed you the robust OO inheritance mapping features supported by JPA and compared their advantages and disadvantages. Of the three inheritance mapping strategies, using joined tables is the best from a design perspective.

You should note, however, that we avoided some complexities in this chapter. First, we used field-based persistence in all of our code samples to keep them as short and simple as possible. Second, we only mentioned the most commonly used elements for the annotations featured in the chapter. We felt justified in doing so as most of the annotation elements we avoided are rarely used. We do encourage you to check out the full definition of all the annotations in this chapter available online at http://java.sun.com/products/persistence/javadoc-1_0-fr/javax/persistence/package-tree.html.

In the next chapter, you’ll learn how to manipulate the entity and relationships we mapped in this chapter using the EntityManager API.