Records are a new preview feature in Java 14, providing a compact syntax for declaring classes that are meant to be plain data holders. In this article, we’re going to see what Records look like under the hood. So buckle up!

Class Representation

Let’s start with a very simple example:

    public record Range(int min, int max) {}

Now let’s compile this code using javac:

    javac --enable-preview -source 14 Range.java

Then, it’s possible to take a peek at the generated bytecode using javap:

    javap Range

This will print the following:

    Compiled from "Range.java"
    public final class Range extends java.lang.Record {
      public Range(int, int);
      public java.lang.String toString();
      public final int hashCode();
      public final boolean equals(java.lang.Object);
      public int min();
      public int max();
    }

Interestingly, similar to Enums, Records are normal Java classes with a few fundamental properties:

They are declared as final classes, so we can’t inherit from them.

They’re already inheriting from another class named java.lang.Record. Therefore, Records can’t extend any other class, as Java does not allow multiple inheritance.

Records can implement other interfaces.

For each component, there is an accessor method, e.g. max and min.

There are auto-generated implementations for toString, equals and hashCode based on all components.

Finally, there is an auto-generated constructor that accepts all components as its arguments.

Also, the java.lang.Record is just an abstract class with a protected no-arg constructor and a few other basic abstract methods:

    public abstract class Record {

        protected Record() {}

        @Override
        public abstract boolean equals(Object obj);

        @Override
        public abstract int hashCode();

        @Override
        public abstract String toString();
    }

Nothing special about this class!
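The properties listed in this section can be checked from plain Java code. The following is a minimal sketch, assuming a JDK where records are a final feature (16 or later) or where preview features are enabled. The Interval interface and the RecordBasicsSketch class are hypothetical names introduced only for this illustration.

```java
import java.lang.reflect.Modifier;

// Hypothetical interface, only here to show that records can implement interfaces.
interface Interval {
    int length();
}

record Range(int min, int max) implements Interval {
    @Override
    public int length() {
        return max - min;
    }
}

final class RecordBasicsSketch {

    // Records are implicitly final, so this reports true.
    static boolean isFinal() {
        return Modifier.isFinal(Range.class.getModifiers());
    }

    // Every record class has java.lang.Record as its direct superclass.
    static Class<?> superclass() {
        return Range.class.getSuperclass();
    }

    public static void main(String[] args) {
        Range range = new Range(0, 42);                        // auto-generated constructor
        System.out.println(range.min() + ".." + range.max());  // accessor methods
        System.out.println(range);                             // auto-generated toString
        System.out.println(range.equals(new Range(0, 42)));    // component-based equals
        System.out.println(isFinal());
        System.out.println(superclass());
    }
}
```

Trying to declare a class that extends Range, or a record with an extends clause, should be rejected by the compiler for the reasons given above.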




The Curious Case of Data Classes

Coming from a Kotlin or Scala background, one may spot some similarities between Records in Java, Data Classes in Kotlin and Case Classes in Scala. On the surface, they all share one very fundamental goal: to facilitate writing data holders.

Despite this fundamental similarity, things are very different at the bytecode level.

Kotlin’s Data Class

For the sake of comparison, let’s see a Kotlin data class equivalent of Range:

    data class Range(val min: Int, val max: Int)

Similar to Records, the Kotlin compiler generates accessor methods, default toString, equals, and hashCode implementations, and a few more functions based on this simple one-liner.

Let’s see how the Kotlin compiler generates the code for, say, toString:

    Compiled from "Range.kt"
      public java.lang.String toString();
        descriptor: ()Ljava/lang/String;
        flags: (0x0001) ACC_PUBLIC
        Code:
          stack=2, locals=1, args_size=1
             0: new           #36
             3: dup
             4: invokespecial #37
             7: ldc           #39
             9: invokevirtual #43
            12: aload_0
            13: getfield      #10
            16: invokevirtual #46
            19: ldc           #48
            21: invokevirtual #43
            24: aload_0
            25: getfield      #16
            28: invokevirtual #46
            31: ldc           #50
            33: invokevirtual #43
            36: invokevirtual #52
            39: areturn

We ran javap -c -v Range to generate this output. Also, here we’re using the simple class names for the sake of brevity.

Anyway, Kotlin uses a StringBuilder to generate the string representation instead of multiple string concatenations (like any decent Java developer!). That is:

At first, it creates a new instance of StringBuilder (index 0, 3, 4).

Then it appends the literal Range(min= string (index 7, 9).

Then it appends the actual min value (index 12, 13, 16).

Then it appends the literal , max= (index 19, 21).

Then it appends the actual max value (index 24, 25, 28).

Then it closes the parentheses by appending the ) literal (index 31, 33).

Finally, it calls toString on the StringBuilder instance and returns the resulting string (index 36, 39).

Since every property adds its own run of append instructions, the more properties a data class has, the longer its bytecode becomes and, consequently, the longer the startup time.
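To make the shape of that bytecode easier to picture, here is a rough Java equivalent of the toString walked through above. It is only a sketch: StringBuilderRange is a hypothetical hand-written class, not something the Kotlin compiler emits, and the comments map each call to the kind of instruction seen in the listing.

```java
// Hypothetical hand-written class that mirrors the shape of the generated toString.
final class StringBuilderRange {
    private final int min;
    private final int max;

    StringBuilderRange(int min, int max) {
        this.min = min;
        this.max = max;
    }

    @Override
    public String toString() {
        return new StringBuilder()      // new, dup, invokespecial
                .append("Range(min=")   // ldc, invokevirtual
                .append(min)            // aload_0, getfield, invokevirtual
                .append(", max=")       // ldc, invokevirtual
                .append(max)            // aload_0, getfield, invokevirtual
                .append(")")            // ldc, invokevirtual
                .toString();            // invokevirtual, areturn
    }
}
```

Adding a third property would add another literal append and another value append, which is why the method body grows with the number of properties.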

Scala’s Case Class

Let’s write the case class equivalent in Scala:

    case class Range(min: Int, max: Int)

At first glance, Scala seems to generate a much simpler toString implementation:

    Compiled from "Range.scala"
      public java.lang.String toString();
        descriptor: ()Ljava/lang/String;
        flags: (0x0001) ACC_PUBLIC
        Code:
          stack=2, locals=1, args_size=1
             0: getstatic     #89
             3: aload_0
             4: invokevirtual #111
             7: areturn

However, the toString calls the scala.runtime.ScalaRunTime._toString static method. That, in turn, calls the productIterator method to iterate through this Product Type. This iterator calls the productElement method, which looks like:

      public java.lang.Object productElement(int);
        descriptor: (I)Ljava/lang/Object;
        flags: (0x0001) ACC_PUBLIC
        Code:
          stack=3, locals=3, args_size=2
             0: iload_1
             1: istore_2
             2: iload_2
             3: tableswitch   {
                           0: 24
                           1: 34
                     default: 44
                }
            24: aload_0
            25: invokevirtual #55
            28: invokestatic  #71
            31: goto          59
            34: aload_0
            35: invokevirtual #58
            38: invokestatic  #71
            41: goto          59
            44: new           #73
            47: dup
            48: iload_1
            49: invokestatic  #71
            52: invokevirtual #76
            55: invokespecial #79
            58: athrow
            59: areturn

This basically switches over all properties of the case class. For instance, if the productIterator wants the first property, it returns min. Likewise, when the productIterator wants the second element, it returns the max value. Otherwise, it throws an instance of IndexOutOfBoundsException to signal an out-of-bounds request.

Again, the more properties a case class has, the more of those switch arms we get. Therefore, the bytecode length is proportional to the number of properties. In other words, this is the same problem as with Kotlin’s data class.
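The switch-based approach can be sketched in Java as well. This is an illustration of the idea only, not the actual Scala runtime code: ProductStyleRange, productArity, and the comma-joined output format are assumptions made for the example.

```java
import java.util.StringJoiner;

// Hypothetical Java rendition of a case-class-like type with a switch-based productElement.
final class ProductStyleRange {
    private final int min;
    private final int max;

    ProductStyleRange(int min, int max) {
        this.min = min;
        this.max = max;
    }

    int productArity() {
        return 2;
    }

    // One switch arm per property, so this method grows with the number of properties.
    Object productElement(int index) {
        switch (index) {
            case 0:
                return min; // boxed to Integer
            case 1:
                return max; // boxed to Integer
            default:
                throw new IndexOutOfBoundsException(String.valueOf(index));
        }
    }

    // toString itself stays small: it only iterates over the elements.
    @Override
    public String toString() {
        StringJoiner joiner = new StringJoiner(",", "Range(", ")");
        for (int i = 0; i < productArity(); i++) {
            joiner.add(String.valueOf(productElement(i)));
        }
        return joiner.toString();
    }
}
```

The toString body is constant in size here, but the cost has simply moved into productElement, which still needs one arm per property.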

Invoke Dynamic

Let’s take an even closer look at the bytecode generated for the Java Records:

    Compiled from "Range.java"
      public java.lang.String toString();
        descriptor: ()Ljava/lang/String;
        flags: (0x0001) ACC_PUBLIC
        Code:
          stack=1, locals=1, args_size=1
             0: aload_0
             1: invokedynamic #18,  0
             6: areturn

Regardless of the number of record components, this will be the bytecode. A simple, polished and elegant solution. But how does this invokedynamic thing work?

Introducing Indy

Invoke Dynamic (also known as Indy) was part of JSR 292, which intended to enhance the JVM support for dynamic languages. Since its first release in Java 7, the invokedynamic opcode, along with its java.lang.invoke luggage, has been used quite extensively by dynamic JVM-based languages like JRuby.

Although Indy was specifically designed to enhance the dynamic language support, it offers much more than that. As a matter of fact, it’s suitable to use wherever a language designer needs any form of dynamicity, from dynamic type acrobatics to dynamic strategies! For instance, the Java 8 Lambda Expressions are actually implemented using invokedynamic, even though Java is a statically typed language!

User-Definable Bytecode

For quite some time, the JVM supported only four method invocation types: invokestatic to call static methods, invokeinterface to call interface methods, invokespecial to call constructors, super() or private methods, and invokevirtual to call instance methods.

Despite their differences, these invocation types share one common trait: we can’t enrich them with our own logic. In contrast, invokedynamic enables us to bootstrap the invocation process in any way we want. Then the JVM takes care of calling the bootstrapped method directly.

How Does Indy Work?

The first time the JVM sees an invokedynamic instruction, it calls a special static method called the Bootstrap Method. The bootstrap method is a piece of Java code that we’ve written to prepare the actual to-be-invoked logic.

Then the bootstrap method returns an instance of java.lang.invoke.CallSite. This CallSite holds a reference to the actual method, i.e. a MethodHandle. From now on, every time the JVM sees this invokedynamic instruction again, it skips the Slow Path and directly calls the underlying executable. The JVM continues to skip the slow path unless something changes.
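Java source code cannot emit an invokedynamic instruction directly, so the following sketch imitates the linkage step by calling a bootstrap method by hand and then invoking through the returned CallSite. The IndySketch class and its greet and callThroughCallSite methods are hypothetical names used only for this illustration; the bootstrap signature follows the usual (Lookup, String, MethodType) convention.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class IndySketch {

    // The "actual to-be-invoked logic" that the call site will point to.
    static String greet(String name) {
        return "Hello, " + name;
    }

    // A bootstrap method: resolves the target once and wraps it in a CallSite.
    static CallSite bootstrap(MethodHandles.Lookup lookup, String name, MethodType type)
            throws NoSuchMethodException, IllegalAccessException {
        MethodHandle target = lookup.findStatic(IndySketch.class, name, type);
        return new ConstantCallSite(target); // the target never changes after linkage
    }

    // Imitates what happens for an invokedynamic instruction: link, then invoke.
    static String callThroughCallSite(String argument) {
        try {
            CallSite site = bootstrap(
                    MethodHandles.lookup(),
                    "greet",
                    MethodType.methodType(String.class, String.class));
            return (String) site.dynamicInvoker().invokeExact(argument);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callThroughCallSite("Indy"));
    }
}
```

In a real invokedynamic call site the JVM performs the bootstrap call only once and reuses the linked target afterwards, whereas this sketch re-runs the bootstrap on every call for simplicity.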

Why Indy?

As opposed to the Reflection APIs, the java.lang.invoke API is quite efficient since the JVM can completely see through all invocations. Therefore, the JVM may apply all sorts of optimizations as long as we avoid the slow path as much as possible!
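For readers who have not used both APIs, this small sketch calls the same method, String.length, once through reflection and once through a MethodHandle. It only shows how the two APIs differ in shape and makes no performance measurement; the InvokeVersusReflection class is a hypothetical name.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

final class InvokeVersusReflection {

    // Reflection: arguments and result travel as Object, so the result is boxed.
    static int lengthViaReflection(String text) {
        try {
            Method length = String.class.getMethod("length");
            return (Integer) length.invoke(text);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // java.lang.invoke: the handle carries an exact type, here (String)int.
    static int lengthViaMethodHandle(String text) {
        try {
            MethodHandle length = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
            return (int) length.invokeExact(text);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthViaReflection("record"));
        System.out.println(lengthViaMethodHandle("record"));
    }
}
```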

In addition to the efficiency argument, the invokedynamic approach is more reliable and less brittle because of its simplicity.

Moreover, the generated bytecode for Java Records is independent of the number of properties. That means less bytecode and faster startup time.

Finally, let’s suppose a new version of Java includes a new and more efficient bootstrap method implementation. With invokedynamic, our app can take advantage of this improvement without recompilation. This way we have some sort of Forward Binary Compatibility. That’s the dynamic strategy we were talking about!

The Object Methods

Now that we are familiar enough with Indy, let’s make sense of the invokedynamic in Records bytecode:

    invokedynamic #18, 0

Look what I found in the Bootstrap Method Table:

    BootstrapMethods:
      0: #41 REF_invokeStatic java/lang/runtime/ObjectMethods.bootstrap:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/TypeDescriptor;Ljava/lang/Class;Ljava/lang/String;[Ljava/lang/invoke/MethodHandle;)Ljava/lang/Object;
        Method arguments:
          #8 Range
          #48 min;max
          #50 REF_getField Range.min:I
          #51 REF_getField Range.max:I

So the bootstrap method for Records is called bootstrap, which resides in the java.lang.runtime.ObjectMethods class. As you can see, this bootstrap method expects the following parameters:

An instance of MethodHandles.Lookup representing the lookup context (the Ljava/lang/invoke/MethodHandles$Lookup part).

The method name (i.e. toString, equals, hashCode, etc.) the bootstrap is going to link. For example, when the value is toString, bootstrap will return a ConstantCallSite (a CallSite that never changes) that points to the actual toString implementation for this particular Record.

The TypeDescriptor for the method (the Ljava/lang/invoke/TypeDescriptor part).

A type token, i.e. Class<?>, representing the Record class type. It’s Class<Range> in this case.

A semicolon-separated list of all component names, i.e. min;max.

One MethodHandle per component. This way the bootstrap method can create a MethodHandle based on the components for this particular method implementation.

The invokedynamic instruction passes all those arguments to the bootstrap method. The bootstrap method, in turn, returns an instance of ConstantCallSite. This ConstantCallSite holds a reference to the requested method implementation, e.g. toString.
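To see those parameters in action, the bootstrap method can also be called by hand. The sketch below assumes a JDK where java.lang.runtime.ObjectMethods is publicly available (16 or later). Note one deliberate difference from what the compiler does: javac passes direct field getters (REF_getField), while this sketch, living outside the record, passes handles to the public accessor methods instead. ObjectMethodsSketch and toStringViaBootstrap are hypothetical names.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.runtime.ObjectMethods;

record Range(int min, int max) {}

final class ObjectMethodsSketch {

    // Builds a toString implementation for Range by calling the bootstrap method directly.
    static String toStringViaBootstrap(Range range) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();

            // One handle per component; accessor handles stand in for the field getters.
            MethodHandle minGetter =
                    lookup.findVirtual(Range.class, "min", MethodType.methodType(int.class));
            MethodHandle maxGetter =
                    lookup.findVirtual(Range.class, "max", MethodType.methodType(int.class));

            Object linked = ObjectMethods.bootstrap(
                    lookup,
                    "toString",                                        // method to link
                    MethodType.methodType(String.class, Range.class),  // (Range)String
                    Range.class,                                       // record type token
                    "min;max",                                         // component names
                    minGetter, maxGetter);

            // With a MethodType as the type descriptor, the result is a CallSite.
            CallSite site = (CallSite) linked;
            return (String) site.dynamicInvoker().invoke(range);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(toStringViaBootstrap(new Range(0, 42)));
    }
}
```

Passing equals or hashCode as the method name, together with the matching method type, links the other two object methods in the same way.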

Reflecting on Records

The java.lang.Class API has been retrofitted to support Records. For example, given a Class<?>, we can check whether it’s a Record or not using the new isRecord method:

    jshell> var r = new Range(0, 42)
    r ==> Range[min=0, max=42]

    jshell> r.getClass().isRecord()
    $5 ==> true

It obviously returns false for non-record types:

    jshell> "Not a record".getClass().isRecord()
    $6 ==> false

There is also a getRecordComponents method that returns an array of RecordComponent in the same order they are defined in the original record. Each java.lang.reflect.RecordComponent represents a record component, or variable, of the current record type. For example, RecordComponent.getName returns the component name:

    jshell> public record User(long id, String username, String fullName) {}
    |  created record User

    jshell> var me = new User(1L, "alidg", "Ali Dehghani")
    me ==> User[id=1, username=alidg, fullName=Ali Dehghani]

    jshell> Stream.of(me.getClass().getRecordComponents()).map(RecordComponent::getName).
       ...> forEach(System.out::println)
    id
    username
    fullName

In the same way, the getType method returns the type token for each component:

    jshell> Stream.of(me.getClass().getRecordComponents()).map(RecordComponent::getType).
       ...> forEach(System.out::println)
    long
    class java.lang.String
    class java.lang.String

It’s even possible to get a handle to accessor methods via getAccessor:

    jshell> var nameComponent = me.getClass().getRecordComponents()[2].getAccessor()
    nameComponent ==> public java.lang.String User.fullName()

    jshell> nameComponent.setAccessible(true)

    jshell> nameComponent.invoke(me)
    $21 ==> "Ali Dehghani"
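Outside of jshell, the same reflection calls can be combined into something practical. The following sketch turns any record into a map of component names to values. It assumes a JDK with records available (16 or later, or 14/15 with preview enabled); RecordReflectionSketch and toMap are hypothetical names, and the User record repeats the one from the session above.

```java
import java.lang.reflect.RecordComponent;
import java.util.LinkedHashMap;
import java.util.Map;

record User(long id, String username, String fullName) {}

final class RecordReflectionSketch {

    // Reads every component through its accessor, in declaration order.
    static Map<String, Object> toMap(Record record) {
        Map<String, Object> values = new LinkedHashMap<>();
        for (RecordComponent component : record.getClass().getRecordComponents()) {
            try {
                values.put(component.getName(), component.getAccessor().invoke(record));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(toMap(new User(1L, "alidg", "Ali Dehghani")));
    }
}
```

No setAccessible call is needed here because the sketch lives in the same package as the record; for records in other packages or modules, access rules may require it.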





Annotating Records

Java allows you to annotate Records, as long as the annotation is applicable to a record or its members. Additionally, there is a new annotation ElementType called RECORD_COMPONENT. Annotations with this target can only be used on record components:

    @Target(ElementType.RECORD_COMPONENT)
    public @interface Param {}
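As a sketch of how such an annotation might be used and read back, the example below puts a record-component annotation on one component and looks it up through RecordComponent. Several things here are assumptions added for illustration: the runtime retention policy (needed for reflection to see the annotation), the value element on Param, and the Query and AnnotationSketch names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.RecordComponent;

@Target(ElementType.RECORD_COMPONENT)
@Retention(RetentionPolicy.RUNTIME) // assumption: required so reflection can see it
@interface Param {
    String value(); // hypothetical element, not part of the original Param
}

// Only the first component carries the annotation.
record Query(@Param("q") String text, int limit) {}

final class AnnotationSketch {

    // Returns the Param value of the named component, or null if it is not annotated.
    static String paramNameOf(Class<? extends Record> type, String componentName) {
        for (RecordComponent component : type.getRecordComponents()) {
            if (component.getName().equals(componentName)) {
                Param param = component.getAnnotation(Param.class);
                return param == null ? null : param.value();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(paramNameOf(Query.class, "text"));
        System.out.println(paramNameOf(Query.class, "limit"));
    }
}
```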





Serialization

Any new Java feature without a nasty relationship with Serialization would be incomplete. This time around, however, the relationship does not sound as unappealing as we’re used to.

Although Records are not by default serializable, it’s possible to make them so just by implementing the java.io.Serializable marker interface.

Serializable records are serialized and deserialized differently than ordinary serializable objects. The updated javadoc for ObjectInputStream states that:

The serialized form of a record object is a sequence of values derived from the record components.

The process by which record objects are serialized or externalized cannot be customized; any class-specific writeObject, readObject, readObjectNoData, writeExternal, and readExternal methods defined by record classes are ignored during serialization and deserialization.

The serialVersionUID of a record class is 0L unless explicitly declared.
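A short round trip shows these rules at work. This is a minimal sketch, assuming a JDK with records available; RecordSerializationSketch, roundTrip, and serialVersionUidOf are hypothetical names, and the Range record here additionally implements Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

record Range(int min, int max) implements Serializable {}

final class RecordSerializationSketch {

    // Serializes the record to a byte array and reads it back.
    static Range roundTrip(Range original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Range) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Looks up the serialVersionUID that the serialization runtime uses for a class.
    static long serialVersionUidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        Range copy = roundTrip(new Range(0, 42));
        System.out.println(copy);
        System.out.println(serialVersionUidOf(Range.class));
    }
}
```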

Conclusion

Java Records are going to provide a new way to encapsulate data holders. Even though currently they’re limited in terms of functionality (compared to what Kotlin or Scala are offering), the implementation is solid.

The first preview of Records will be available in March 2020. In this article, we’ve used the openjdk 14-ea 2020-03-17 build, since Java 14 is yet to be released!




