Follow @vlad_mihalcea Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

In my previous post I talked about different database identifier strategies. This post will compare the most common surrogate primary key strategies:

IDENTITY

SEQUENCE

TABLE (SEQUENCE)

IDENTITY

The IDENTITY type (included in the SQL:2003 standard) is supported by:

The IDENTITY generator allows an integer/bigint column to be auto-incremented on demand. The increment process happens outside of the current running transaction, so a roll-back may end-up discarding already assigned values (value gaps may happen).

The increment process is very efficient since it uses a database internal lightweight locking mechanism as opposed to the more heavyweight transactional course-grain locks.

The only drawback is that we can’t know the newly assigned value prior to executing the INSERT statement. This restriction is hindering the transactional write-behind strategy adopted by Hibernate. For this reason, Hibernates cannot use JDBC batching when persisting entities that are using the IDENTITY generator.

For the following examples we’ll enable Session Factory JDBC batching:

properties.put("hibernate.order_inserts", "true"); properties.put("hibernate.order_updates", "true"); properties.put("hibernate.jdbc.batch_size", "2");

Let’s define an Entity using the IDENTITY generation strategy:

@Entity(name = "identityIdentifier") public class IdentityIdentifier { @Id @GeneratedValue(strategy = GenerationType.IDENTITY) private Long id; }

Persisting 5 entities:

doInTransaction(new TransactionCallable<Void>() { @Override public Void execute(Session session) { for (int i = 0; i < 5; i++) { session.persist(new IdentityIdentifier()); } session.flush(); return null; } });

Will execute one query after the other (there is no JDBC batching involved):

Query:{[insert into identityIdentifier (id) values (default)][]} Query:{[insert into identityIdentifier (id) values (default)][]} Query:{[insert into identityIdentifier (id) values (default)][]} Query:{[insert into identityIdentifier (id) values (default)][]} Query:{[insert into identityIdentifier (id) values (default)][]}

Aside from disabling JDBC batching, the IDENTITY generator strategy doesn’t work with the TABLE_PER_CLASS inheritance model because there could be multiple subclass entities having the same identifier and a base class query will end up retrieving entities with the same identifier (even if belonging to different types).

SEQUENCE

The SEQUENCE generator (defined in the SQL:2003 standard) is supported by:

A SEQUENCE is a database object that generates incremental integers on each successive request. SEQUENCES are much more flexible than IDENTIFIER columns because:

A SEQUENCE is table free and the same sequence can be assigned to multiple columns or tables

A SEQUENCE may preallocate values to improve performance

A SEQUENCE may define an incremental step, allowing us to benefit from a “pooled” Hilo algorithm

A SEQUENCE doesn’t restrict Hibernate JDBC batching

A SEQUENCE doesn’t restrict Hibernate inheritance models

Let’s define a Entity using the SEQUENCE generation strategy:

@Entity(name = "sequenceIdentifier") public class SequenceIdentifier { @Id @GenericGenerator( name = "sequence", strategy = "sequence", parameters = { @org.hibernate.annotations.Parameter( name = "sequence", value = "sequence" ) }) @GeneratedValue(generator = "sequence") private Long id; }

I used the “sequence” generator because I wanted to use the SequenceGenerator , rather than SequenceHiLoGenerator or SequenceStyleGenerator which Hibernate might choose on our behalf.

Adding 5 entities:

doInTransaction(new TransactionCallable<Void>() { @Override public Void execute(Session session) { for (int i = 0; i < 5; i++) { session.persist(new SequenceIdentifier()); } session.flush(); return null; } });

Generate the following queries:

Query:{[call next value for hibernate_sequence][]} Query:{[call next value for hibernate_sequence][]} Query:{[call next value for hibernate_sequence][]} Query:{[call next value for hibernate_sequence][]} Query:{[call next value for hibernate_sequence][]} Query:{[insert into sequenceIdentifier (id) values (?)][1]} {[insert into sequenceIdentifier (id) values (?)][2]} Query:{[insert into sequenceIdentifier (id) values (?)][3]} {[insert into sequenceIdentifier (id) values (?)][4]} Query:{[insert into sequenceIdentifier (id) values (?)][5]}

This table the inserts are batched, but we now have 5 sequence calls prior to inserting the entities. This can be optimized by using a HILO algorithm.

TABLE (SEQUENCE)

There is another database-independent alternative to generating sequences. One or multiple tables can be used to hold the identifier sequence counter. But it means trading write performance for database portability.

While IDENTITY and SEQUENCES are transaction-less, using a database table mandate ACID, for synchronizing multiple concurrent id generation requests.

This is made possible by using row-level locking which comes at a higher cost than IDENTITY or SEQUENCE generators.

The sequence must be calculated in a separate database transaction and this requires the IsolationDelegate mechanism, which has support for both local (JDBC) and global(JTA) transactions.

For local transactions, it must open a new JDBC connection, therefore putting more pressure on the current connection pooling mechanism.

For global transactions, it requires suspending the current running transaction. After the sequence value is generated, the actual transaction has to be resumed. This process has its own cost, so the overall application performance might be affected.

Let’s define a Entity using the TABLE generation strategy:

@Entity(name = "tableIdentifier") public class TableSequenceIdentifier { @Id @GenericGenerator( name = "table", strategy = "enhanced-table", parameters = { @org.hibernate.annotations.Parameter( name = "table_name", value = "sequence_table" ) }) @GeneratedValue(generator = "table", strategy=GenerationType.TABLE) private Long id; }

I used the newer “enhanced-table” generator because the legacy “table” generator has been deprecated.

Adding 5 entities:

doInTransaction(new TransactionCallable<Void>() { @Override public Void execute(Session session) { for (int i = 0; i < 5; i++) { session.persist(new TableSequenceIdentifier()); } session.flush(); return null; } });

Generate the following queries:

Query:{[select tbl.next_val from sequence_table tbl where tbl.sequence_name=? for update][default]} Query:{[insert into sequence_table (sequence_name, next_val) values (?,?)][default,1]} Query:{[update sequence_table set next_val=? where next_val=? and sequence_name=?][2,1,default]} Query:{[select tbl.next_val from sequence_table tbl where tbl.sequence_name=? for update][default]} Query:{[update sequence_table set next_val=? where next_val=? and sequence_name=?][3,2,default]} Query:{[select tbl.next_val from sequence_table tbl where tbl.sequence_name=? for update][default]} Query:{[update sequence_table set next_val=? where next_val=? and sequence_name=?][4,3,default]} Query:{[select tbl.next_val from sequence_table tbl where tbl.sequence_name=? for update][default]} Query:{[update sequence_table set next_val=? where next_val=? and sequence_name=?][5,4,default]} Query:{[select tbl.next_val from sequence_table tbl where tbl.sequence_name=? for update][default]} Query:{[update sequence_table set next_val=? where next_val=? and sequence_name=?][6,5,default]} Query:{[insert into tableIdentifier (id) values (?)][1]} {[insert into tableIdentifier (id) values (?)][2]} Query:{[insert into tableIdentifier (id) values (?)][3]} {[insert into tableIdentifier (id) values (?)][4]} Query:{[insert into tableIdentifier (id) values (?)][5]}

The table generator allows JDBC batching but it resorts to SELECT FOR UPDATE queries. The row-level locking is definitely less efficient than using a native IDENTITY or SEQUENCE.

Conclusion

So, based on your application requirements you have multiple options to choose from. There isn’t one single winning strategy, each one having both advantages and disadvantages.

Code available on GitHub.

Insert details about how the information is going to be processed DOWNLOAD NOW