Tuesday, March 13, 2012

JPA inheritance SINGLE_TABLE JOINED TABLE_PER_CLASS

The advantage of using an O/R mapper such as Hibernate through the JPA specification is that you can develop independently of a particular database system, and as a programmer you do not have to leave the object-oriented world to do persistence.
That is why you will definitely want to use inheritance with JPA.
There are different strategies for realizing inheritance:
The default is SINGLE_TABLE, which applies if the @Inheritance annotation is given without a strategy.
There are 2 other possible strategies:

@Inheritance(strategy=InheritanceType.JOINED) and
@Inheritance(strategy=InheritanceType.TABLE_PER_CLASS).

So what do these strategies mean?
The default (SINGLE_TABLE) maps all classes of the inheritance hierarchy to one database table.
With JOINED, every class of the hierarchy, abstract or concrete, is stored in its own table.
With TABLE_PER_CLASS, every concrete class of the hierarchy gets its own dedicated table.
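
To make the three layouts concrete, here is a minimal plain-Java sketch that only models which tables and columns each strategy would produce. It does not use JPA at all. The hierarchy (an abstract BaseClass with fields id and common, and subclasses A and B with one field each) mirrors the example further down in this post; the class StrategyLayouts and its methods are hypothetical names chosen for illustration, and the dtype column for JOINED follows this post's description (some providers can work without it).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the table layouts; no JPA provider is involved.
class StrategyLayouts {

    // Fields declared by the abstract base class ...
    static final List<String> BASE_FIELDS = Arrays.asList("id", "common");
    // ... and by each concrete subclass.
    static final Map<String, List<String>> SUBCLASS_FIELDS = new LinkedHashMap<>();
    static {
        SUBCLASS_FIELDS.put("A", Arrays.asList("specificA"));
        SUBCLASS_FIELDS.put("B", Arrays.asList("specificB"));
    }

    // SINGLE_TABLE: one table holding a discriminator and every field of every class.
    static Map<String, List<String>> singleTable() {
        List<String> columns = new ArrayList<>();
        columns.add("dtype");
        columns.addAll(BASE_FIELDS);
        for (List<String> fields : SUBCLASS_FIELDS.values()) {
            columns.addAll(fields);
        }
        Map<String, List<String>> tables = new LinkedHashMap<>();
        tables.put("BaseClass", columns);
        return tables;
    }

    // JOINED: one table per class; subclass tables hold only their own fields
    // plus the key they share with the base table.
    static Map<String, List<String>> joined() {
        Map<String, List<String>> tables = new LinkedHashMap<>();
        List<String> base = new ArrayList<>();
        base.add("dtype");
        base.addAll(BASE_FIELDS);
        tables.put("BaseClass", base);
        for (Map.Entry<String, List<String>> e : SUBCLASS_FIELDS.entrySet()) {
            List<String> columns = new ArrayList<>();
            columns.add("id");
            columns.addAll(e.getValue());
            tables.put(e.getKey(), columns);
        }
        return tables;
    }

    // TABLE_PER_CLASS: one table per concrete class, repeating the inherited fields.
    static Map<String, List<String>> tablePerClass() {
        Map<String, List<String>> tables = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : SUBCLASS_FIELDS.entrySet()) {
            List<String> columns = new ArrayList<>(BASE_FIELDS);
            columns.addAll(e.getValue());
            tables.put(e.getKey(), columns);
        }
        return tables;
    }

    public static void main(String[] args) {
        System.out.println("SINGLE_TABLE:    " + singleTable());
        System.out.println("JOINED:          " + joined());
        System.out.println("TABLE_PER_CLASS: " + tablePerClass());
    }
}
```

Running main prints one map per strategy, which makes it easy to see that only TABLE_PER_CLASS has no table for the abstract base class.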

Which strategy should be used in which situation?

The answer to this question depends mainly on 2 aspects:
  • Polymorphism
  • Performance
Everybody wants high performance - but there are situations in which flexibility and expandability are more important than performance: e.g. new features, configurations or subordinate use cases.
There are also situations in which the inheritance hierarchy holds important transaction data and affects core use cases - in that case performance is more important than flexibility or expandability.

SINGLE_TABLE has the advantage of fast data access, because everything is stored in one table. If the concrete subclasses differ greatly and there are many of them, the result is a wide table. That may lead to unfavorable tablespace usage, and fewer rows of the table will fit into the database cache. In this case the SINGLE_TABLE strategy is a bad decision.

Disadvantage: data integrity
With SINGLE_TABLE all columns that belong to only one subclass must be nullable, because all subclasses are stored in this table
and therefore some types will have no values for certain columns.
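
Since the database cannot enforce NOT NULL on such columns, one possible workaround is to check the rule in the Java class instead. The following is only a sketch under assumptions: SubclassASketch is a hypothetical plain-Java stand-in for the entity A shown later in this post, and the JPA annotations are left out so that it compiles without a JPA library.

```java
// Plain-Java stand-in for the entity A; JPA annotations are omitted on purpose.
class SubclassASketch {
    private String specificA;

    void setSpecificA(String specificA) {
        this.specificA = specificA;
    }

    // In a real entity this method could be registered as a lifecycle callback
    // with @PrePersist and @PreUpdate (an assumption, not taken from this post).
    void validate() {
        if (specificA == null) {
            throw new IllegalStateException("specificA must not be null for type A");
        }
    }

    public static void main(String[] args) {
        SubclassASketch a = new SubclassASketch();
        a.setSpecificA("value");
        a.validate(); // passes; without the setter call it would throw
        System.out.println("valid");
    }
}
```

The check then lives in the application rather than in the schema, so other clients writing to the same table are not covered by it.
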
If you know that most, and the most important, data accesses fetch concrete types from the persistence context, TABLE_PER_CLASS is a good choice.
With this strategy polymorphic queries perform very poorly, because a JPA provider such as Hibernate has to generate UNION queries for them, and polymorphic associations are poorly supported because the abstract types have no table of their own in the database.
With JOINED, polymorphic queries are possible; a discriminator column, whose value is set per class with @DiscriminatorValue, distinguishes the subclasses. Those queries are realized with outer joins, which perform much better than UNION selects. Concrete types can be queried with inner joins.
That is where the name of the strategy comes from.
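
The difference between the query shapes can be sketched in plain Java by building the rough SQL a provider would have to issue. This is an illustration only: the class PolymorphicQueryShapes, its methods and the table aliases are hypothetical, the hierarchy (BaseClass with subclasses A and B) is the one from the example further down, and the SQL a real provider such as Hibernate generates will look different in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Builds simplified SQL shapes; it does not talk to a database or to JPA.
class PolymorphicQueryShapes {

    static final List<String> SUBCLASSES = Arrays.asList("A", "B");

    // SINGLE_TABLE: all rows live in one table, so no join or union is needed.
    static String singleTable() {
        return "select * from BaseClass";
    }

    // JOINED, polymorphic: the base table is outer-joined to every subclass table.
    static String joined() {
        StringBuilder sql = new StringBuilder("select * from BaseClass base");
        for (String sub : SUBCLASSES) {
            String alias = sub.toLowerCase(Locale.ROOT);
            sql.append(" left outer join ").append(sub).append(' ').append(alias)
               .append(" on ").append(alias).append(".id = base.id");
        }
        return sql.toString();
    }

    // JOINED, concrete type: an inner join between the subclass table and the base table.
    static String joinedConcrete(String sub) {
        String alias = sub.toLowerCase(Locale.ROOT);
        return "select * from " + sub + " " + alias
                + " inner join BaseClass base on base.id = " + alias + ".id";
    }

    // TABLE_PER_CLASS, polymorphic: one select per concrete table, combined with UNION ALL.
    // Columns a table does not have are padded with null so that the selects line up.
    static String tablePerClass() {
        List<String> selects = new ArrayList<>();
        for (String sub : SUBCLASSES) {
            StringBuilder cols = new StringBuilder("id, common");
            for (String other : SUBCLASSES) {
                String col = "specific" + other;
                cols.append(", ").append(other.equals(sub) ? col : "null as " + col);
            }
            selects.add("select " + cols + " from " + sub);
        }
        return String.join(" union all ", selects);
    }

    public static void main(String[] args) {
        System.out.println(singleTable());
        System.out.println(joined());
        System.out.println(joinedConcrete("A"));
        System.out.println(tablePerClass());
    }
}
```

With every additional subclass the JOINED query gains one more outer join and the TABLE_PER_CLASS query one more select in the union, which is why the size of the hierarchy matters for both.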


Realization
There is only one abstract class. The subclasses inherit from this class and they are all marked with @Entity.
With SINGLE_TABLE and JOINED, the JPA standard uses a discriminator column. It is of type string, its name is dtype by default, and it is used to distinguish the concrete classes from each other. The value stored in this column for each concrete class is set with @DiscriminatorValue; if the annotation is omitted, the default for a string discriminator is the entity name.

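import java.io.Serializable;
import javax.persistence.DiscriminatorValue;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Inheritance;
import javax.persistence.InheritanceType;

// Root of the hierarchy: it must be an entity itself so that the table is mapped
@Entity
@Inheritance(strategy=InheritanceType.SINGLE_TABLE)
public abstract class BaseClass implements Serializable {
    @Id
    @GeneratedValue
    private Long id;
    private String common;
    ...
}

// Rows of this type carry "A" in the dtype column
@Entity
@DiscriminatorValue(value="A")
public class A extends BaseClass {
    private String specificA;
    ...
}

// Rows of this type carry "B" in the dtype column
@Entity
@DiscriminatorValue(value="B")
public class B extends BaseClass {
    private String specificB;
    ...
}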
The following table is the result:
dtype|id|common|specificA|specificB

For an instance of A, the column specificB will always be null.
On the other hand, polymorphic queries (select b from BaseClass b) and queries for a concrete subclass (select a from A a ...) are very fast, because they need neither joins nor unions.

With TABLE_PER_CLASS, polymorphic queries would have to be answered with slow UNION queries.
With JOINED, all queries for concrete subclasses are inner joins,
and polymorphic queries are outer joins.
So you could say JOINED is a trade-off between performance and expandability/flexibility.






