Monday, March 12, 2012

Treatment of IDs / performance and optimization


If an object/relational mapper (ORM) such as Hibernate is used, every managed entity that is to be persisted must have an ID.
There must be something that uniquely identifies an entity.
In JPA, the annotation @Id is used for this purpose.


The identifying attributes can be chosen in one of two ways:

  1. Use of a business (natural) key
  2. Use/generation of a technical (surrogate) key
Alternative 1 is not available for every entity, because not every entity has a suitable business key.
Alternative 2 means that new IDs have to be generated for new objects.


JPA Generators

In JPA, ID generators are selected with the annotation:
@GeneratedValue

There are 4 generation strategies; one of them is selected via the annotation parameter strategy:
  1. IDENTITY
  2. SEQUENCE
  3. TABLE
  4. AUTO

IDENTITY

With IDENTITY, an auto-increment column is used, provided the underlying database supports this strategy.


SEQUENCE

With SEQUENCE, a dedicated database-side generator (a sequence object) is used, which is in charge of incrementing the value.


TABLE

With TABLE, a table named hibernate_sequences is used, which holds the values of the generator.

AUTO

With AUTO, Hibernate decides on the basis of the configured database dialect which generator to use.
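
As an illustration, a minimal mapping sketch might look like the following. The entity Customer and its fields are hypothetical, and the javax.persistence package is assumed (JPA 2.x; newer releases use jakarta.persistence instead). The strategy value can be swapped for any of the four constants listed above.

```java
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;

// Hypothetical entity, used only to show where the annotations go.
@Entity
public class Customer {

    // Technical key; the provider fills it in during persist.
    // GenerationType.IDENTITY, SEQUENCE or TABLE can be used instead of AUTO.
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Long id;

    private String name;

    // JPA requires a no-argument constructor.
    protected Customer() {
    }

    public Customer(String name) {
        this.name = name;
    }

    public Long getId() {
        return id;
    }

    public String getName() {
        return name;
    }
}
```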

JPA Performance

In general, a technical key means poorer performance, because during persist the EntityManager (or the Hibernate Session) has to read back the value assigned by the database.
Databases that do not offer an efficient operation for this through their API suffer from poor insert performance.
If there is no special operation or facility for reading this value efficiently, inserts will be noticeably slower than on other database systems.
The AUTO strategy can ruin performance on certain database systems.
On an Oracle database (and also, e.g., on a DB2 database) a sequence will be used, BUT with an allocationSize of 1.

This cannot be configured with the AUTO strategy, which means that
insert performance is bad, because the next value of the sequence has to be fetched through the database driver for every insert.
By contrast, when a sequence is created manually or with a database tool on DB2/Oracle, the next 20 values are usually cached by default, and these can then be used by the EntityManager to avoid asking the sequence object after every insert operation.

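 -- DB2 syntax: create a sequence that caches 20 values
 CREATE SEQUENCE XY
      START WITH 1
      INCREMENT BY 1
      NO MAXVALUE
      NO CYCLE
      CACHE 20;

 -- Fetch the next value from the sequence during an insert
 INSERT INTO ORDERS (ORDERNO, CUSTNO)
      VALUES (NEXT VALUE FOR XY, 123456);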
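
The effect of the allocation size can be illustrated without a database. The following is a simplified simulation, not Hibernate's actual implementation: SimulatedSequence, BlockIdAllocator and AllocationSizeDemo are hypothetical names, and each call to reserveBlock stands for one round trip to the database.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Stand-in for a database sequence; every call models one round trip. */
class SimulatedSequence {
    private final AtomicLong next = new AtomicLong(1);
    private final AtomicLong roundTrips = new AtomicLong();

    /** Reserves blockSize consecutive values and returns the first one. */
    long reserveBlock(int blockSize) {
        roundTrips.incrementAndGet();
        return next.getAndAdd(blockSize);
    }

    long roundTrips() {
        return roundTrips.get();
    }
}

/** Hands out IDs from a locally held block and refills it when it is used up. */
class BlockIdAllocator {
    private final SimulatedSequence sequence;
    private final int allocationSize;
    private long current;
    private long remaining; // starts at 0, so the first call fetches a block

    BlockIdAllocator(SimulatedSequence sequence, int allocationSize) {
        if (allocationSize < 1) {
            throw new IllegalArgumentException("allocationSize must be >= 1");
        }
        this.sequence = sequence;
        this.allocationSize = allocationSize;
    }

    synchronized long nextId() {
        if (remaining == 0) {
            // Only here is the (simulated) database contacted.
            current = sequence.reserveBlock(allocationSize);
            remaining = allocationSize;
        }
        remaining--;
        return current++;
    }
}

class AllocationSizeDemo {
    /** Returns the number of simulated round trips needed for the given number of inserts. */
    static long roundTripsFor(int inserts, int allocationSize) {
        SimulatedSequence sequence = new SimulatedSequence();
        BlockIdAllocator allocator = new BlockIdAllocator(sequence, allocationSize);
        for (int i = 0; i < inserts; i++) {
            allocator.nextId();
        }
        return sequence.roundTrips();
    }

    public static void main(String[] args) {
        System.out.println("allocationSize 1:  " + roundTripsFor(1000, 1));
        System.out.println("allocationSize 20: " + roundTripsFor(1000, 20));
    }
}
```

In this simulation, 1000 inserts cost 1000 round trips with an allocation size of 1, but only 50 with an allocation size of 20.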


Performance @GeneratedValue

If performance is crucial, the annotation @GeneratedValue can be removed, so that the JPA provider is no longer in charge of generating the ID and setting it during the persist operation.
In that case you must generate the ID yourself, which, depending on the database system and the driver's capabilities, can be considerably faster.

UUID class

Since Java 1.5, the java.util package contains a UUID class which can be used for generating UUIDs.
Where appropriate, one should combine the generated value with a reliably distinct piece of information from the context, so that the ID stays unique even in the unlikely event of a UUID collision.
The benefit is that after calling the persist method, a lookup of the assigned value by JPA / the driver is no longer needed.
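
A minimal sketch of application-assigned IDs, using only java.util.UUID. The classes AssignedIds and OrderRecord, the helper contextualId and the node name used as context are assumptions for illustration; in a JPA entity the resulting value would be set on the @Id field (without @GeneratedValue) before persist is called.

```java
import java.util.UUID;

/** Hypothetical helper that creates IDs in the application instead of the database. */
class AssignedIds {

    /** Returns a random (version 4) UUID in its 36-character string form. */
    static String randomId() {
        return UUID.randomUUID().toString();
    }

    /**
     * Prefixes a random UUID with a distinct piece of context information,
     * for example the name of the node that created the object.
     */
    static String contextualId(String context) {
        return context + "-" + UUID.randomUUID();
    }
}

/** Plain class standing in for an entity; the ID is known before anything is saved. */
class OrderRecord {
    private final String id; // in a JPA entity: annotated with @Id only
    private final int customerNo;

    OrderRecord(int customerNo) {
        this.id = AssignedIds.randomId();
        this.customerNo = customerNo;
    }

    String getId() {
        return id;
    }

    int getCustomerNo() {
        return customerNo;
    }
}

class AssignedIdsDemo {
    public static void main(String[] args) {
        OrderRecord order = new OrderRecord(123456);
        System.out.println("ID assigned before persist: " + order.getId());
        System.out.println("ID with context prefix:     " + AssignedIds.contextualId("node1"));
    }
}
```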

A very good treatment of this topic can be found in the following book

(chapter 4, section "Mapping the Primary Key"):


