Zinc (sbt) friendly code

How to make your code compile faster with Zinc

In April I gave a presentation (50 Shades of Scala Compiler) at the ScalaUA conference. In the abstract, I promised to give some hints on how to write compiler-friendly code. The talk lasted only 40 minutes and Scala compilers have many, many shades (I almost forgot about Dotty!), so not all the hints fit into the slides.

I don’t like false promises, so you can find the first batch of hints (related to incremental compilers) below.

TL;DR;

If you find tools and their internals boring, here is a condensed list of my advice. All points are explained below.

Don’t use names if you don’t need them

Zinc (by default since sbt version 0.13) uses a name hashing algorithm.

Let’s consider the following code:
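As a minimal sketch (the contents of both files are an assumption made for illustration; only the names Foo, Bar, foo and bar come from the text), the two sources could look like this:

```scala
// Foo.scala
object Foo {
  def foo: Int = 1          // used by Bar
  def bar: String = "bar"   // not used by Bar
}

// Bar.scala
object Bar {
  // The only name taken from Foo here is `foo`.
  def useFoo: Int = Foo.foo + 1
}
```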

If Bar.scala depends on foo from Foo.scala, then changing bar in Foo.scala will not cause Bar.scala to be recompiled (even though Bar uses Foo).

How does Zinc know this?

It is all about names and hashes. Zinc computes a hash for each name defined in your class. The hash is generated from types of all members with a given name such as methods, values, types, nested classes etc. Zinc also keeps track of all names used in your class (names of methods, values, classes, types and much more). To decide if a given class needs to be recompiled, we only need to check if any of the names used have changed since the last compilation.
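The idea can be illustrated with a toy model. This is only a sketch of the principle described above, not Zinc’s actual implementation: the object NameHashingSketch, its functions, and the representation of members as (name, signature) string pairs are all hypothetical.

```scala
object NameHashingSketch {
  // Hypothetical representation: name -> hash of the signatures of all members with that name.
  type NameHashes = Map[String, Int]

  // Group members by name and hash the sorted signatures sharing that name.
  def hashNames(members: Seq[(String, String)]): NameHashes =
    members.groupBy(_._1).map { case (name, sigs) =>
      name -> sigs.map(_._2).sorted.hashCode
    }

  // A name has changed if it was added, removed, or its hash differs.
  def changedNames(before: NameHashes, after: NameHashes): Set[String] =
    (before.keySet ++ after.keySet).filter(n => before.get(n) != after.get(n))

  // A dependent class is recompiled only if it uses at least one changed name.
  def needsRecompile(usedNames: Set[String], changed: Set[String]): Boolean =
    usedNames.exists(changed)
}
```

In this model, changing the signature of bar marks only the name bar as changed, so a class whose used names are just Set("foo") is left alone, while a class that uses bar is recompiled.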

This makes Zinc much faster in terms of files recompiled for a given change (fewer files are recompiled, therefore compilation is faster), but also introduces new problems.

Generally, the following sections are focused on reducing the number of names used in order to speed up incremental compilations.

Less is more

Fewer classes/traits/objects per source file means more time saved. Scalac cannot compile anything smaller than a whole source file. Even if Zinc knows that only a one-line object needs to be recompiled, it still has to compile the whole source (and all the implicits, macros and other nasty stuff inside).

The solution is as simple as possible: split your sources! If incremental compilation is not enough to convince you, be aware that splitting should also help with compilation time in general and may even result in fewer conflicts during merges.

Types, we need more types!

If you don’t provide an explicit type, Scalac will infer one for you that is as precise as possible. Developers love this, but later on complain that compilations (even incremental ones) are long. What is the problem with methods without explicit types? The precise type generated by Scalac changes (sometimes often), even when we don’t want it to. Need an example? Here it is:
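A minimal sketch (the object Settings and its members are hypothetical, not the original listing):

```scala
object Settings {
  // No annotation: scalac infers Some[Int], the most precise type available.
  // Changing the body to Option(30) or None silently changes the public signature.
  def timeout = Some(30)

  // Annotated: the signature stays Option[Int] however the body is rewritten,
  // as long as the body still conforms to Option[Int].
  def retries: Option[Int] = Some(3)
}
```

With timeout, an innocent refactoring of the body changes the type seen by every caller. With retries, the same refactoring leaves the public signature untouched.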

You may say that everything is fine as long as your code compiles. I agree, if you don’t care how long compilation takes. Every time a type is changed, all usages need to be recompiled (and, because of the incremental compiler’s heuristics, often even more). Moreover, providing explicit types for public members is generally considered good practice, as it makes code more readable (especially without IDE support, e.g. during code review) and easier to maintain (especially in terms of binary compatibility).

In short, add return types to all public (or even all non-private) methods.

Do you want to improve compilation time? Adding return types to public members decreased compilation time in the IntelliJ Scala plugin by 17% (pull request).

Imports, wildcards and other nastiness

Most of us don’t care about imports. We only need them for code to compile. Even IntelliJ collapses imports by default.

Let me show that if you care about your build performance (leaving maintenance concerns for another blog post), correct handling of imports is crucial.

Unused imports

You may reason that unused imports affect Scalac performance (e.g. more implicits to check). Scalac performance aside, they really do affect incremental compilation performance. Even if a given import is not used at the moment, once the thing behind the import (e.g. an object) changes, it may affect the current file (e.g. by providing a new implicit that will be used instead of a lower-priority one).

Luckily, removing unused imports is quite easy. The simplest solution is to add the -Ywarn-unused-import flag to the compiler and -Xfatal-warnings on pull request validation builds. You will then be warned in every build that has unused imports. These warnings become errors on a CI build, which forces you to clean them up before the pull request is merged.

Why not set -Xfatal-warnings for all builds? Being forced to comment out imports when you just want to check whether, e.g., using Vector will make your code faster becomes really annoying in the long run.
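In sbt, such a setup could look roughly like the sketch below. It assumes that the CI server sets an environment variable named CI (an assumption, adjust it to your build server) and uses -Xfatal-warnings, which is how scalac spells the fatal-warnings option.

```scala
// build.sbt (sketch)

// Always warn about unused imports.
scalacOptions += "-Ywarn-unused-import"

// Turn warnings into errors only on CI builds.
// The CI environment variable is an assumption, not something sbt defines.
scalacOptions ++= {
  if (sys.env.contains("CI")) Seq("-Xfatal-warnings") else Seq.empty
}
```

Local builds then only print the warnings, so quick experiments are not blocked by a leftover import.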

Of course there are tools that can manage imports for you: scalafix or your IDE (IntelliJ, Scala IDE or ENSIME).

Wildcards

As a tooling developer I should say: don’t use wildcards at all! I can’t (and don’t want to), since most fantastic Scala libraries start with import fantastic.lib._ .

Why are wildcard imports so hard for incremental compilers?

The first step to understanding this evil is a simple test. Replace any wildcard import in your code with explicit imports of all members of that package. Then compile the code with -Ywarn-unused-import. I wish there were a compiler flag that would tell you how many imports from a given wildcard are used :)

This is only the tip of the iceberg. With wildcard imports, an incremental compiler needs to become an ahead-of-time one. Why? Because you not only import all names that exist at the moment of compilation, but you also import everything that will exist in that package in the future. For an incremental compiler, the old Python joke import jetboard from the.future has a new, bitter taste.

What can we do then? There is no silver bullet here but I can give you some hints:

Make your packages (or generally import scopes) small. If your package has 5 sources instead of 20, it will make your code much, much easier to learn.

Make more use of the private or private[your_package] keywords. I really wish everything in Scala were private by default (maybe someone should create a tool for that?).

Use wildcard imports only from libraries (or pieces of code that are changed rarely). Wildcards are dangerous only when imported things change.
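A small sketch of these hints, using an object as the import scope so that it fits in a single file (Geometry, Report and their members are hypothetical names):

```scala
object Geometry {
  // Private helper: it cannot be imported, so it never leaks through a wildcard.
  private def square(x: Double): Double = x * x

  def circleArea(r: Double): Double = math.Pi * square(r)
  def squareArea(side: Double): Double = square(side)
}

object Report {
  // Explicit import instead of `import Geometry._`:
  // members added to Geometry later are not pulled into this scope.
  import Geometry.circleArea

  def unitCircleArea: Double = circleArea(1.0)
}
```

The same reasoning applies to packages: the smaller the public surface of an import scope, the less a wildcard import of it can drag in.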

Implicits

Implicits are probably the hardest part of Scala from the perspective of an incremental compiler. To support implicits effectively and correctly, we would need to add relations to all potential members of implicit scope. How many connections is that? Way too many to be effectively handled.

This is why Zinc has to use a simpler approach for handling implicits. In short, if you use anything from class Foo and any implicit name from it changes, then your source is recompiled.

What does this mean? If you want smooth and incremental compilation, don’t mix implicits and normal code in a single class. Unless you are fine with tons of sources recompiled for no reason!
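One possible layout, as a sketch with hypothetical names (User, UserService, UserInstances, Demo), keeps implicit definitions in a dedicated object, ideally in its own source file:

```scala
final case class User(name: String)

// Plain logic only: no implicit members here, so users of UserService
// are not affected by changes to implicits.
object UserService {
  def greet(user: User): String = s"Hello, ${user.name}"
}

// Implicits only: a change here affects just the code that uses UserInstances.
object UserInstances {
  implicit val userOrdering: Ordering[User] = Ordering.by((u: User) => u.name)
}

object Demo {
  import UserInstances.userOrdering

  // `sorted` picks up the implicit Ordering[User] imported above.
  def sortedUsers(users: List[User]): List[User] = users.sorted
}
```

Code that only calls UserService.greet never touches UserInstances, so editing an implicit there gives Zinc no reason to recompile it.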

Macros

The only way macros can be supported in an incremental compiler is naive brute force: if a macro is recompiled, then all places where this macro is used are also recompiled (however, the Scala Center is working on a third option). This sometimes means that macros are responsible for some of the longest incremental compilations.

How can we live with that?

Either take incremental compilation of macros into your own hands and turn the recompileOnMacroDef flag off (in sbt/Zinc), or try to remove all cases in which macros are recompiled. How to do so? First of all, place macros in dedicated files (containing only macro definitions) and clean up imports. In general, reduce the number of things imported and used in sources containing macros.

Kudos and further reading

Many, many thanks to the VirtusLab team and Jorge Vicente Cantero for all their feedback and excellent advice.

If you find incremental compilation interesting, you can learn more from the links below.