While there aren’t any outright instantiations visible (<init> calls), we can indeed see that the tDecoder is invoked, along with several macro methods.

This is, of course, expected. The decoding methods are there to be used after all. So, that avenue of investigation is a bust — apart from knowing that “our” decoders are indeed invoked.

Let’s turn to the bytecode of Transcoding (which is quite substantial, so I’ll spare you the overview).

Searching for references to Text, we immediately find this:

This is a synthetic method accepting an Object (1), casting it into Text (2), and retrieving the value of the field (3) in order to initialize Message (4).

The method, in turn, is called here:

The previous method is being invoked at (1). Note one of the final lines, defining the local variable c . We’re now looking at the bytecode of the anonymous function in Decoder.instance for Message , i.e.:

Apparently something here is coercing instantiation.

Attempting a workaround

NOTE: the following section engages in heavily exploratory analysis and is non-essential for the understanding of the gist of this post — feel free to skip to “It gets worse!” if strapped for time. Otherwise, read on.

Tweaking the decoder

On a whim, let’s try to omit the unwrapping decoder for Text :

…​and:

No Text! OK, so is it possible that this has to do with the decoder for Text? Let’s test it by itself:

domain.Message$Text 16 16 (0.0%) 1 (0.0%)

What about just invoking the decoder?

domain.Message$Text 16 16 (0.0%) 1 (0.0%)

Same thing!

Either will not save you

As a sanity check, let’s simulate our deserialization directly, without a decoder:

domain.Message$Text 16 16 (0.0%) 1 (0.0%)

Evidently, even creating an Either will cause Text to instantiate! We can see it in the bytecode here:

The telltale NEW / DUP /load-arg/ INVOKESPECIAL constructor sequence appears at the very end.
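For readers who want to reproduce the effect without reading bytecode, here is a minimal sketch. It assumes a stand-in `Text` value class (mirroring `domain.Message.Text`) and a hypothetical `EitherSketch` object; the widening to `Either[Any, Any]` is only there so we can look at what is physically stored inside the `Right`.

```scala
object EitherSketch {
  // Stand-in for domain.Message.Text (assumed shape: a single String field).
  final case class Text(value: String) extends AnyVal

  // Text fills a type parameter of Either, so it has to be stored as an object.
  val wrapped: Either[String, Text] = Right(Text("blah"))

  // Either is covariant, so widening to Either[Any, Any] needs no cast.
  // Matching on the widened value shows what actually sits inside the Right.
  val storedClassName: String = (wrapped: Either[Any, Any]) match {
    case Right(stored) => stored.getClass.getName
    case Left(other)   => other.getClass.getName
  }
}
```

If the value class were truly erased to its underlying `String`, `storedClassName` would report `java.lang.String`; instead it reports the wrapper class.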

…​and neither will the decoder

It’s also possible that the Decoder itself can coerce instantiation.

Here’s the relevant verification code:

The above may seem intimidating but it’s really not that complex:

The decoder works by processing case classes through a shapeless Generic .

* A Generic is something that converts to and from a type (parameter A , in our case Text ), and its HList representation (the Lazy is there to prevent implicit resolution errors in some scenarios, and has otherwise no bearing on how the decoder is generated).

* An HList is a typed list representing (simplified) the fields of the type. In our case, we just have one field, which means the representation is String :: HNil (HNil being the same as Nil for "normal" lists).

* In our case, A is Text and R is String.

The decoding itself then proceeds in three steps:

* if the required Generic is found (provided both for case classes and value classes out of the box by shapeless)…

* …the “raw” value of the relevant JSON (in our case, the String) is parsed in and used to create an HList of unwrapped :: HNil…

* …and converted via Generic to the target type, i.e. our Text in this case.

A possible emulation of this process would be:
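As a dependency-free sketch of the same steps: the snippet below does not use shapeless at all. `HList`, `HCons`, `HNil` and `MiniGeneric` are hypothetical, heavily simplified stand-ins written just for this illustration, and `Text` is a stand-in for `domain.Message.Text`.

```scala
object GenericEmulationSketch {
  // Stand-in for domain.Message.Text (assumed shape: a single String field).
  final case class Text(value: String) extends AnyVal

  // Hypothetical, simplified stand-ins for shapeless' HList building blocks.
  sealed trait HList
  final case class HCons[H, T <: HList](head: H, tail: T) extends HList
  sealed trait HNil extends HList
  case object HNil extends HNil

  // Hypothetical stand-in for Generic: converts between A and its representation R.
  trait MiniGeneric[A, R] {
    def to(a: A): R
    def from(r: R): A
  }

  // Written by hand here; shapeless would derive the equivalent automatically.
  implicit val textGeneric: MiniGeneric[Text, HCons[String, HNil]] =
    new MiniGeneric[Text, HCons[String, HNil]] {
      def to(a: Text): HCons[String, HNil] = HCons[String, HNil](a.value, HNil)
      def from(r: HCons[String, HNil]): Text = Text(r.head)
    }

  // Wrap the raw value in a one-element HList, then convert it to the target type A.
  def decodeUnwrapped[A, R](raw: R)(implicit gen: MiniGeneric[A, HCons[R, HNil]]): A =
    gen.from(HCons[R, HNil](raw, HNil))

  // A is a type parameter here, so the result travels as an Object.
  val decoded: Text = decodeUnwrapped[Text, String]("blah")
}
```

Note that `from` returns the type parameter `A`, so even this stripped-down version has to hand back a real `Text` object.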

domain.Message$Text 16 16 (0.0%) 1 (0.0%)

Yup, still on our way to minimal problem reduction.

If you’re getting confused by now (“do AnyVals actually work?”), rest assured that e.g. the following:

    import domain.Message.Text

    object MainSanityCheckForRaw extends App {
      val text = Text("blah")
      System.in.read()
      println(text.value)
    }

does not produce any instances of Text . Everything gets translated into static calls.

More workarounds?

“OK” — you now say — “you’ve manually created a decoder for Message itself”:

“What happens if you go in the other direction, and replace it with a semi-auto one?”

Running MainDecode again:

domain.Message 16 16 (0.0%) 1 (0.0%)

domain.Message$Text 16 16 (0.0%) 1 (0.0%)

Same story, probably for the same reasons.

It gets worse!

By now, you may have noticed that the constant element is not any API that we’re using, but something related to the value classes being “packed” into a parametrized type.

We need to confirm the hypothesis by avoiding circe , shapeless etc. completely.

Implicit resolution

First, we’ll create a custom implicit resolution hierarchy. Let’s say that we have a type class for calculating the length of the string representation of a given class, called StringMeasure :

We can now see what happens if we implement an instance for Text , resolving the "base" case via implicits:
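A minimal sketch of what such a type class and its instances could look like. Only the names `StringMeasure` and `Text` come from the text; the `length` and `measure` methods, the instance names, and the choice to measure the wrapped value are assumptions made for illustration.

```scala
object StringMeasureSketch {
  // Stand-in for domain.Message.Text (assumed shape: a single String field).
  final case class Text(value: String) extends AnyVal

  // Type class: how long is the string representation of an A?
  trait StringMeasure[A] {
    def length(a: A): Int
  }

  object StringMeasure {
    // The "base" case.
    implicit val stringMeasure: StringMeasure[String] =
      new StringMeasure[String] {
        def length(a: String): Int = a.length
      }

    // Instance for Text, resolved by delegating to the base case via implicits.
    implicit def textMeasure(implicit base: StringMeasure[String]): StringMeasure[Text] =
      new StringMeasure[Text] {
        def length(a: Text): Int = base.length(a.value)
      }
  }

  // Entry point: A is a type parameter, so the argument is passed as an Object.
  def measure[A](a: A)(implicit m: StringMeasure[A]): Int = m.length(a)
}
```

Calling `StringMeasureSketch.measure(Text("blah"))` yields 4, and getting there means passing `Text` through the type parameter `A`.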

You’ll probably not be very surprised that we get:

domain.Message$Text 16 16 (0.0%) 1 (0.0%)

So, it turns out that any kind of parametrized implicit resolution, even when it targets only the value class in question, causes instantiation.

Simply put, anything that uses implicit resolution to automatically derive generic functionality for its domain will cause value classes to be instantiated.

But do we really need implicit resolution to trigger our now-favorite edge condition?

Culprit — type parameters

As a final example, let’s see what happens when we pass our value class through a simple, parametrized method:

domain.Message$Text 16 16 (0.0%) 1 (0.0%)

And, to be clear, if we switch getThing to def getThing(text: Text): Text = text , the value class doesn’t instantiate.
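Here is a small sketch of the two variants side by side. `Text` is a stand-in for `domain.Message.Text`, and `runtimeClassName` is a hypothetical helper added only to make the effect observable without a heap dump.

```scala
object TypeParameterSketch {
  // Stand-in for domain.Message.Text (assumed shape: a single String field).
  final case class Text(value: String) extends AnyVal

  // Parametrized: T erases to Object, so a Text argument must be a real object.
  def getThing[T](thing: T): T = thing

  // Monomorphic: compiles down to a method on the underlying String.
  def getText(text: Text): Text = text

  // Hypothetical helper: reports the runtime class of whatever arrives
  // through the type parameter.
  def runtimeClassName[T](thing: T): String = (thing: Any).getClass.getName
}
```

Passing `Text("blah")` to `runtimeClassName` reports the wrapper class rather than `java.lang.String`, which is the instantiation showing up from inside the parametrized method.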

Bonus — does specialization help?

MainSanityCheckForDef$Number 16 16 (0.0%) 1 (0.0%)

Nope. So no hope for value classes containing primitives.

Finally — the “WHY”

So we now see that the problem is indeed parametrization. It causes instantiation in circe’s implicit resolution, in creating a shapeless Generic, and so on.

But why?

Well, unfortunately the exact implementation of value classes doesn’t appear to be a subject of the Scala Language Specification (only some minimal conditions are mentioned). So, again, we turn to the documentation referenced at the start. As a reminder:

A value class is actually instantiated when:

1. a value class is treated as another type.

2. a value class is assigned to an array.

3. doing runtime type tests, such as pattern matching.
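To make the three conditions concrete, here is a minimal sketch with one line per case. `Text` is again a stand-in for `domain.Message.Text`; the object and member names are hypothetical.

```scala
object InstantiationCases {
  // Stand-in for domain.Message.Text (assumed shape: a single String field).
  final case class Text(value: String) extends AnyVal

  // 1. Treated as another type: widening to Any needs a real object.
  val asAny: Any = Text("blah")

  // 2. Assigned to an array: the array holds Text instances, not Strings.
  val inArray: Array[Text] = Array(Text("blah"))

  // 3. Runtime type test: the pattern match checks for an actual Text instance.
  def describe(x: Any): String = x match {
    case Text(v) => s"text: $v"
    case _       => "something else"
  }
}
```

In cases 1 and 2, the runtime class of the stored value (or of the array's component type) is the wrapper class, not `java.lang.String`.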

The most likely condition here appears to be the first — the type parameter counts as “another type” (there’s additional proof of this in some specs for Dotty; we’ll get to that at the conclusion of the post).

Unfortunately, the SIP fails to provide relevant examples, so I’m not 100% certain what actually triggers the conditions.

My guess boils down to the following: due to type erasure on the JVM, when generating bytecode for type parameters, you have two options:

* either you create a dedicated copy of every relevant method for each concrete type (as in specialization),

* or you must pass everything as java.lang.Object.

Since we’re not doing the former, the latter must happen. A java.lang.Object reference has to point at an actual object, so our value class gets instantiated.
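The erasure part of this guess can be checked with plain Java reflection. The sketch below assumes the same stand-in `Text` value class and two methods named after the earlier example; `firstParameterType` is a hypothetical helper.

```scala
object ErasedSignatures {
  // Stand-in for domain.Message.Text (assumed shape: a single String field).
  final case class Text(value: String) extends AnyVal

  def getThing[T](thing: T): T = thing
  def getText(text: Text): Text = text

  // Hypothetical helper: looks up a method of this object by name and
  // returns the erased JVM type of its first parameter.
  def firstParameterType(methodName: String): Class[_] =
    getClass.getMethods
      .find(_.getName == methodName)
      .map(_.getParameterTypes.head)
      .getOrElse(sys.error(s"no method named $methodName"))
}
```

On the JVM, `firstParameterType("getThing")` is `java.lang.Object`, while `firstParameterType("getText")` is `java.lang.String`: the monomorphic method works directly on the underlying value, whereas the parametrized one can only accept an object reference.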

What now?

You may be wondering if it’s time to panic now. And the answer is, of course — not really.

If you’re running this kind of domain encoding and not seeing any performance problems, you should be absolutely fine for now.

Instantiation is quite cheap on modern VMs, the biggest problem being potentially longer GC pauses on traffic spikes.

Double-checking

Of course, if you have potential performance bottlenecks, it would pay off to check your code. Unfortunately, I can’t offer you anything other than old-fashioned, manual heap analysis via jvisualvm or other tools.

Theoretically, it would be possible to create a plugin that checks whether value classes are instantiated during tests, probably using sbt-jmh or similar as a base. Apart from actually coaxing JMH or another benchmark/profiler into providing the right information, you would probably also need to devise some heuristic to detect value classes by their structure, as they otherwise appear identical to normal classes: the only differences are the caller patterns and the additional static methods.

Reducing noise

For existing projects — especially if your value classes have few or no methods — consider removing the AnyVal qualifier, and check whether there’s any performance hit.

After all, you’re not fooling the compiler. Why fool yourself? Or that poor maintainer who inherits the codebase after you (and who may well be you)?

For new projects

If you are, however, creating a new project, consider that, again:

Anything that uses parametrized resolution, like:

* automatic generation of JSON encoders/decoders,

* any sort of generic converters,

* indeed anything that generates ADT hierarchies, with implicits or otherwise,

will instantiate your value classes (barring some very specific compiler optimization like e.g. here).

This is something to put in the minus column when weighing the viability of value classes for domain modelling.

The remark from the previous section about dropping the AnyVal qualifier applies here as well, whenever you’re using value classes solely as DTOs/domain objects (with no member methods etc.). You don’t want to foster cargo-cult programming, do you?

As I wrote at the very start, value classes comprise only one of the “slightly-better-typed” current standards for modelling, the other being tagged types. These are also not ideal (e.g. if the tag implementation is covariant, you can get away with inserting raw types in the relevant fields), but they also have their advantages. A good primer on using tagged types was written by Marcin Rzeźnicki and can be found on the Iterators blog here.

Dotty FTW

Finally, to end on a brighter note, Dotty is considering the introduction of Opaque Types. This appears to be intended as a replacement of value classes.

One of the stated main goals of opaque types is indeed avoiding the instantiation issues that plague value classes. In fact, this section of the SIP enumerates virtually identical instantiation cases to the ones we discussed here!

The SIP is still pending, so let’s keep our fingers crossed that it will make it into Scala 3 or later. In the meantime, I hope this post has left you with some new information on domain class modelling in Scala. Happy coding!