
Introduction

In this article, we are going to see how the UUID entity attributes are persisted when using JPA and Hibernate, for both assigned and auto-generated identifiers.

In my previous post, I talked about UUID surrogate keys and the use cases in which they are more appropriate than the more common auto-incrementing identifiers.

A UUID database type

There are several ways to represent a 128-bit UUID, and whenever in doubt, I like to resort to Stack Exchange for expert advice.

Because table identifiers are usually indexed, the more compact the database type, the less space the index will require. From the most efficient to the least, here are our options:

Some databases (PostgreSQL, SQL Server) offer a dedicated UUID storage type.

Otherwise, we can store the bits as a byte array (e.g. RAW(16) in Oracle or the standard BINARY(16) type).

Alternatively, we can use two bigint (64-bit) columns, but a composite identifier is less efficient than a single-column one.

We can store the hex value in a CHAR(36) column (32 hex digits and 4 dashes), but this takes the most space, hence it’s the least efficient alternative.
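To make the size difference between these representations concrete, here is a minimal sketch using only the JDK. The class name UuidStorageForms and its helper methods are illustrative names, not part of JPA or Hibernate, and the most-significant-bits-first byte order is an assumption of this sketch.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

public class UuidStorageForms {

    // Packs the 128 bits into a 16-byte array, most significant bits first
    // (the byte order is an assumption made for this sketch).
    static byte[] toBytes(UUID uuid) {
        return ByteBuffer.allocate(16)
            .putLong(uuid.getMostSignificantBits())
            .putLong(uuid.getLeastSignificantBits())
            .array();
    }

    // Rebuilds the UUID from a 16-byte array written by toBytes.
    static UUID fromBytes(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        return new UUID(buffer.getLong(), buffer.getLong());
    }

    public static void main(String[] args) {
        UUID uuid = UUID.randomUUID();

        // Byte array form, as used by a BINARY(16) or RAW(16) column.
        System.out.println("byte array length: " + toBytes(uuid).length);

        // Two 64-bit values, as used by a pair of bigint columns.
        System.out.println("high bits: " + uuid.getMostSignificantBits());
        System.out.println("low bits:  " + uuid.getLeastSignificantBits());

        // Text form, as used by a CHAR(36) column.
        System.out.println("text length: " + uuid.toString().length());

        System.out.println("round trip ok: " + uuid.equals(fromBytes(toBytes(uuid))));
    }
}
```

The byte array takes 16 bytes, while the text form takes 36 characters, which is why the CHAR(36) column is the least compact choice.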

Hibernate offers many identifier strategies to choose from and for UUID identifiers we have three options:

the assigned generator accompanied by the application logic UUID generation

the hexadecimal “uuid” string generator

the more flexible “uuid2” generator, allowing us to use java.util.UUID, a 16-byte array or a hexadecimal String value

The Hibernate UUID assigned generator

The assigned generator allows the application logic to control the entity identifier generation process. If we simply omit the identifier generator definition, Hibernate assumes an assigned identifier. This example uses a BINARY(16) column type, since the target database is HSQLDB.

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    @Column(columnDefinition = "BINARY(16)")
    private UUID id = UUID.randomUUID();

    private String title;

    public UUID getId() {
        return id;
    }

    public Post setId(UUID id) {
        this.id = id;
        return this;
    }

    public String getTitle() {
        return title;
    }

    public Post setTitle(String title) {
        this.title = title;
        return this;
    }
}

Persisting an Entity:

entityManager.persist( new Post().setTitle("High-Performance Java Persistence") );

Generates exactly one INSERT statement:

INSERT INTO post ( title, id ) VALUES ( 'High-Performance Java Persistence', [72, 101, 87, -123, -35, 18, 65, -21, -84, -90, 83, -104, -112, -41, -62, -54] )

Let’s see what happens when issuing a merge instead:

entityManager.merge( new Post().setTitle("High-Performance Java Persistence") );

We get both a SELECT and an INSERT this time:

SELECT p.id as id1_0_0_, p.title as title2_0_0_
FROM post p
WHERE p.id = [32, -57, 116, 87, 106, 104, 76, -95, -102, 25, -74, 119, 30, -50, -12, -84]

INSERT INTO post ( title, id )
VALUES ( 'High-Performance Java Persistence', [32, -57, 116, 87, 106, 104, 76, -95, -102, 25, -74, 119, 30, -50, -12, -84] )

The persist method takes a transient entity and attaches it to the current persistence context. If an entity with the same identifier is already attached, or if the given entity is detached, we’ll get an exception.

The merge operation will copy the current object state into the existing persisted entity (if any). This operation works for both transient and detached entities, but for transient entities persist is much more efficient than the merge operation.

For assigned identifiers, a merge will always require a select since Hibernate cannot know if there is already a persisted entity having the same identifier. For other identifier generators, Hibernate looks for a null identifier to figure out if the entity is in the transient state.

That’s why the Spring Data SimpleJpaRepository#save(S entity) method is not the best choice for Entities using an assigned identifier:

@Transactional
public <S extends T> S save(S entity) {
    if (entityInformation.isNew(entity)) {
        em.persist(entity);
        return entity;
    } else {
        return em.merge(entity);
    }
}

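Because an assigned identifier is already set when the entity is created, a null-identifier check treats every such entity as already existing, so this method always picks merge over persist. Hence, you get both a SELECT and an INSERT for every newly inserted entity.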
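The branch taken by save can be modeled with a few lines of plain Java. This is a simplified sketch that assumes the "is new" decision depends only on whether the identifier is null. The SaveDecisionSketch class and its methods are hypothetical stand-ins, not Spring Data or Hibernate code.

```java
import java.util.UUID;

public class SaveDecisionSketch {

    // Simplified model of the "is this entity new?" check:
    // a null identifier is taken to mean a transient entity.
    static boolean isNew(Object identifier) {
        return identifier == null;
    }

    // Mirrors the branch in save(): persist for new entities, merge otherwise.
    static String chooseOperation(Object identifier) {
        return isNew(identifier) ? "persist" : "merge";
    }

    public static void main(String[] args) {
        // Assigned identifier: set by the application before saving.
        UUID assignedId = UUID.randomUUID();
        // Generated identifier: still null when the entity is saved.
        UUID generatedId = null;

        System.out.println(chooseOperation(assignedId));  // merge
        System.out.println(chooseOperation(generatedId)); // persist
    }
}
```

Under this model, an entity whose identifier is initialized in the field declaration can never reach the persist branch, which matches the extra SELECT observed above.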

The auto-generated Hibernate UUID identifiers

This time, we won’t assign the identifier ourselves but have Hibernate generate it on our behalf. When a null identifier is encountered, Hibernate assumes a transient entity, for which it generates a new identifier value. As a result, the merge operation won’t require a select query prior to inserting a transient entity.

The UUIDHexGenerator

The UUID hex generator is the oldest UUID identifier generator and it’s registered under the “uuid” type. It generates a 32-character hexadecimal UUID string value (optionally with a separator) having the following pattern: 8{sep}8{sep}4{sep}8{sep}4.

This generator does not comply with IETF RFC 4122, which prescribes the 8-4-4-4-12 digit representation.

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    @GeneratedValue(generator = "uuid")
    @GenericGenerator(name = "uuid", strategy = "uuid")
    @Column(columnDefinition = "CHAR(32)")
    private String id;

    private String title;

    public String getId() {
        return id;
    }

    public Post setId(String id) {
        this.id = id;
        return this;
    }

    public String getTitle() {
        return title;
    }

    public Post setTitle(String title) {
        this.title = title;
        return this;
    }
}

When persisting the Post entity:

entityManager.persist( new Post().setTitle("High-Performance Java Persistence") );

Hibernate generates the following SQL INSERT statement:

INSERT INTO post ( title, id ) VALUES ( 'High-Performance Java Persistence', 8a80cb8172c0e9ff0172c0ea02e40000 )

And, when merging a transient Post entity:

entityManager.merge( new Post().setTitle("High-Performance Java Persistence") );

Hibernate generates a single SQL INSERT statement without needing a SELECT query:

INSERT INTO post ( title, id ) VALUES ( 'High-Performance Java Persistence', 8a80cb8172c0e9ff0172c0ea03030001 )
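To visualize the difference between the two layouts, the following sketch slices the identifier from the first INSERT above into the 8-8-4-8-4 groups and compares that with the group lengths of a java.util.UUID string. The UuidLayouts class and its group helper are illustrative names, and the dash is used only to make the groups visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class UuidLayouts {

    // Splits a hex string into consecutive groups of the given lengths.
    static List<String> group(String hex, int... lengths) {
        List<String> groups = new ArrayList<>();
        int offset = 0;
        for (int length : lengths) {
            groups.add(hex.substring(offset, offset + length));
            offset += length;
        }
        return groups;
    }

    public static void main(String[] args) {
        // Identifier taken from the first INSERT statement above.
        String hexId = "8a80cb8172c0e9ff0172c0ea02e40000";

        // The "uuid" generator layout: 8-8-4-8-4.
        System.out.println(String.join("-", group(hexId, 8, 8, 4, 8, 4)));
        // prints 8a80cb81-72c0e9ff-0172-c0ea02e4-0000

        // The RFC 4122 text layout used by java.util.UUID: 8-4-4-4-12.
        for (String part : UUID.randomUUID().toString().split("-")) {
            System.out.print(part.length() + " ");
        }
        System.out.println();
        // prints 8 4 4 4 12
    }
}
```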

The UUIDGenerator

The newer UUID generator is IETF RFC 4122 compliant (variant 2) and it offers pluggable generation strategies. It’s registered under the uuid2 type and it offers a broader type range to choose from:

java.util.UUID

a 16-byte array

a hexadecimal String value

Because the uuid2 generator is the default strategy used by Hibernate for UUID identifiers, you don’t need to declare it explicitly. If the entity identifier is of the UUID type and is annotated with @GeneratedValue, then the uuid2 generator strategy is going to be used:

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    @GeneratedValue
    @Column(columnDefinition = "BINARY(16)")
    private UUID id;

    private String title;

    public UUID getId() {
        return id;
    }

    public Post setId(UUID id) {
        this.id = id;
        return this;
    }

    public String getTitle() {
        return title;
    }

    public Post setTitle(String title) {
        this.title = title;
        return this;
    }
}

When persisting the Post entity:

entityManager.persist( new Post().setTitle("High-Performance Java Persistence") );

Hibernate generates the following SQL INSERT statement:

INSERT INTO post ( title, id ) VALUES ( 'High-Performance Java Persistence', [90, 17, 87, -73, -69, 81, 77, -47, -102, 110, 74, -4, 85, -74, -24, -95] )

And, when merging a transient Post entity:

entityManager.merge( new Post().setTitle("High-Performance Java Persistence") );

Hibernate generates a single SQL INSERT statement:

INSERT INTO post ( title, id ) VALUES ( 'High-Performance Java Persistence', [-38, 35, 2, -55, 65, -127, 70, -51, -68, -34, 117, 111, -40, 4, -26, 63] )

These SQL INSERT queries use a byte array because that is how we configured the @Id column definition.
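As a quick check, the byte array logged for the persist call above can be decoded back into a java.util.UUID using only the JDK. This sketch assumes the 16 bytes are stored most significant bits first. The DecodeLoggedUuid class name is illustrative.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

public class DecodeLoggedUuid {

    // Rebuilds a UUID from 16 bytes, assuming most significant bits come first.
    static UUID fromBytes(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        return new UUID(buffer.getLong(), buffer.getLong());
    }

    public static void main(String[] args) {
        // Bytes copied from the INSERT statement logged for the persist call.
        byte[] logged = {90, 17, 87, -73, -69, 81, 77, -47, -102, 110, 74, -4, 85, -74, -24, -95};

        UUID id = fromBytes(logged);
        System.out.println(id);           // 5a1157b7-bb51-4dd1-9a6e-4afc55b6e8a1
        System.out.println(id.version()); // 4
        System.out.println(id.variant()); // 2
    }
}
```

Under that byte-order assumption, the decoded value reports variant 2, which is consistent with the RFC 4122 compliance mentioned earlier, and version 4, which denotes a randomly generated UUID.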

Conclusion

While you can use a UUID entity identifier with JPA and Hibernate, it’s not always the right choice. First of all, the UUID requires 128 bits, twice as much as a 64-bit bigint, and this overhead is amplified by Foreign Key columns. Since Primary Key and Foreign Key columns are usually indexed, the extra storage requirement will impact indexes as well.

That’s the reason why numerical entity identifiers are usually a much better option, especially when generated by a database sequence.
