As a C++ project grows and matures, the following line is inevitably spoken: “The build is too slow”. It doesn’t really matter how long the build actually takes; it is just taking longer than it was. Things like this are an inevitability as the project grows in size and scope.

In this post I’ll talk specifically about my recent use of forward declarations to vastly improve build times on one of those projects, and how you can too.



What are forward declarations?

A forward declaration in C++ is when you declare something before its definition. For example:

    class Foo; // a forward declaration for class Foo

    // ...

    class Foo { // the actual definition of class Foo
        int member_one;
        // ...
    };

You can forward declare more than just a class, but in this article I’m only referring to class forward declarations.

When you forward declare a class, the class type is considered to be “incomplete” (the compiler knows about the name, but nothing else). You cannot do much with an incomplete type besides use pointers (or references) to that type, but pointers are all that we will need. (More on that in a bit.)
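As a minimal sketch of what an incomplete type does and does not allow, consider the snippet below. The names `Widget`, `Registry`, `read_value`, and `touch` are invented for illustration and do not come from any real code base.

```cpp
#include <cassert>

class Widget;  // forward declaration: Widget is an incomplete type from here on

// Allowed while Widget is incomplete: pointers, references, function declarations.
struct Registry {
    Widget* first = nullptr;  // the size of a pointer does not depend on Widget
};
int read_value(const Widget* w);  // declaration only; no definition of Widget needed
void touch(Widget& w);

// Not allowed until Widget is defined (each line would fail to compile here):
//   Widget w;                  // needs the size of Widget
//   sizeof(Widget);            // same problem
//   registry.first->value;     // needs the members of Widget

class Widget {  // the definition completes the type
public:
    int value = 0;
};

// With the full definition visible, members can be used.
int read_value(const Widget* w) {
    assert(w != nullptr);
    return w->value;
}
void touch(Widget& w) { w.value += 1; }
```

In a real project the part above the definition would live in a header, and the part below it in a .cpp file that includes the full definition.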

How do forward declarations help the build time?

When the compiler is creating your class, it doesn’t actually care about very much. Its goal is ultimately to determine the class’ layout in memory, and to do that, it needs to know the size of your class’ data members. For example:

    struct Foo {
        int a;
        int b;
    };

Our class Foo has two integer members. When the compiler creates a layout for this class, it will allocate approximately sizeof(int) + sizeof(int) bytes of contiguous space for it. (Padding and custom-alignment directives notwithstanding.)

When Foo has a dependency on Bar, then the compiler needs to know the size of Bar as it compiles Foo:

    struct Bar {
        int a;
    };

    struct Foo {
        int a;
        int b;
        Bar c;
    };

In the code above, when the compiler reaches Foo, it already knows what the size and alignment of Bar is. (“Alignment” is a property of a type that dictates the addresses at which the compiler may place it, which in turn affects how much padding a class needs. A thorough discussion is outside the scope of this article, but The Lost Art of C Structure Packing gives it a good treatment.)
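One rough way to see these layout rules in action is to let the compiler check them. The sketch below reuses Foo and Bar from above; `Padded` and `padding_bytes` are names invented for illustration, and the exact sizes depend on the platform and compiler.

```cpp
#include <cstddef>  // offsetof, std::size_t

struct Bar { int a; };
struct Foo { int a; int b; Bar c; };

// Foo is at least as large as its members put together.
static_assert(sizeof(Foo) >= sizeof(int) + sizeof(int) + sizeof(Bar),
              "Foo must hold two ints and a Bar");

// A hypothetical struct where padding is likely: a char followed by an int.
struct Padded {
    char tag;
    int value;
};

// The int member has to start at an address that satisfies int's alignment.
static_assert(offsetof(Padded, value) % alignof(int) == 0,
              "value must be aligned for int");
static_assert(sizeof(Padded) >= sizeof(char) + sizeof(int),
              "Padded must hold a char and an int");

// Bytes added beyond the raw member sizes (the value is implementation-specific).
constexpr std::size_t padding_bytes = sizeof(Padded) - (sizeof(char) + sizeof(int));
```

None of these checks could be written if Bar were only forward declared, which is exactly why a by-value member forces the include.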

If we reversed the order like so:

    struct Foo {
        int a;
        int b;
        Bar c;
    };

    struct Bar {
        int a;
    };

We would end up with a compiler error, because the compiler cannot determine a layout for Foo without first knowing the layout of Bar. If Bar were in its own header file, we would need to include it in Foo’s header file:

    #include "Bar.h"

    struct Foo {
        int a;
        int b;
        Bar c;
    };

So now Foo.h has a dependency on Bar.h.

And what if we complicate Bar to have another member, Baz?

    #include "Baz.h"

    struct Bar {
        int a;
        Baz b;
    };

Now Bar.h depends on Baz.h. Foo.h directly depends on Bar.h, and indirectly on Baz.h. You can see the beginnings of a “dependency graph” forming here. As your codebase grows, you can imagine how large these dependency graphs might get.

Why is this a bad thing? The C++ compiler takes a simplistic approach to handling these dependency graphs — during the “pre-processing” stage of compilation it just copy-pastes one header into another, collapsing the graph into one gargantuan source file. Just check the documentation for what “#include” actually does!

The Foo class might not actually care at all about the Baz class; the Bar class (which has a Baz member) may only use it internally. So at compilation time, Foo is paying for the compiler to parse something it doesn’t even care about! This violates one of the core tenets of C++: “Only pay for what you use”.

What’s worse, if Baz.h changes, then the compiler must recompile Foo! Not only does it take longer to compile Foo, but we must also compile it more often. Good grief.

Dependency breaking

We’ve decided that we don’t like how Foo.h depends on Baz.h through Bar.h, so we decide to solve the problem with a little forward declaration. If Bar forward-declares Baz, and then uses a pointer to Baz, then the compiler no longer needs to know anything about the size and layout of Baz when creating a layout for Bar:

    // Bar.h
    class Baz;

    struct Bar {
        int a;
        Baz* b;
    };

    // Bar.cpp
    #include "Bar.h"
    #include "Baz.h"
    // ... (use our pointer to Baz)

This works because the size of a pointer does not depend on the size of the type it points to. That means we don’t need to know the full definition of Baz until we try to access one of its members. Bar still depends on Baz, since Bar.cpp includes Baz.h, but the dependency is kept out of Bar.h.
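To make the split concrete, here is a single-file sketch that simulates the header and source files with comment banners. The contents of Baz (a single `value` member) and the helper `read_baz` are assumptions made up for illustration.

```cpp
#include <cassert>

// ---- Bar.h (sketch): only a forward declaration of Baz is visible ----
class Baz;

struct Bar {
    int a;
    Baz* b;  // a pointer: its size does not depend on the definition of Baz
};

// The layout of Bar is already fixed here, while Baz is still incomplete.
static_assert(sizeof(Bar) >= sizeof(int) + sizeof(Baz*),
              "Bar holds an int and a pointer");

// ---- Foo.h (sketch): includes Bar.h and never sees Baz.h ----
struct Foo {
    int a;
    int b;
    Bar c;
};

// ---- Baz.h (hypothetical contents) ----
class Baz {
public:
    int value = 42;
};

// ---- Bar.cpp (sketch): includes both Bar.h and Baz.h, so Baz is complete ----
int read_baz(const Bar& bar) {
    assert(bar.b != nullptr);  // the pointer may be unset; check before dereferencing
    return bar.b->value;       // member access requires the full definition
}
```

Everything above the Baz.h banner compiles without knowing what Baz contains, which is the whole point of the technique.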

This causes something interesting to happen to Foo.h:

    // Foo.h
    #include "Bar.h"

    struct Foo {
        int a;
        int b;
        Bar c;
    };

    // Foo.cpp
    #include "Foo.h"
    // ...

Nowhere in the included files for Foo.h will we find Baz.h. This means that:

if Baz.h changes, Bar.cpp will recompile but Foo.cpp will not

the preprocessed source file for Foo will not include the contents of Baz.h

Now that there’s less work for the preprocessor and compiler to do for Foo, it goes faster. It takes up less memory. It needs to be rebuilt less often! With forward declarations we’ve improved both full rebuilds and incremental rebuilds.

What are the downsides to forward declarations?

The Google style guide recommends against using forward declarations, and for good reasons:

If someone forward declares something from namespace std, then your code exhibits undefined behavior (but will likely work).

Forward declarations can easily become redundant when an API is changed such that it’s unavoidable to know the full size and alignment of a dependent type. You may end up with both a #include and a forward declaration in your header in this case.

There are some rare cases where your code may behave differently. You may not be bringing in additional function overloads or template specializations that you previously relied upon, or you lose inheritance information which can cause a different overload or specialization to be called in the first place.
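The last point is easier to see with code. In the sketch below (all names are hypothetical), the same call expression picks a different overload depending on whether the compiler has seen the full definition of D or only its forward declaration.

```cpp
#include <cassert>
#include <string>

struct B {};
struct D;  // forward declaration: the compiler does not yet know that D derives from B

std::string f(B*)    { return "f(B*)"; }
std::string f(void*) { return "f(void*)"; }

// D is incomplete here, so the D* -> B* conversion is unknown
// and only f(void*) is viable.
std::string call_before_definition(D* d) { return f(d); }

struct D : B {};  // what including the real header would have provided

// D is complete here, so D* -> B* is available and is the better conversion.
std::string call_after_definition(D* d) { return f(d); }

// The same call expression resolves differently depending on what was visible.
void check_overloads() {
    D d;
    assert(call_before_definition(&d) == "f(void*)");
    assert(call_after_definition(&d) == "f(B*)");
}
```

Both versions compile without any diagnostic, which is what makes this kind of change in behavior easy to miss.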

Also, notice that when we transformed a class member into a pointer, we likely had to start dealing with heap-allocated memory for each instance of Bar. If Bar needs to access its Baz pointer very often, the small overhead of a pointer indirection can add up. It’s also not very cache-friendly: the members of Bar are no longer together in memory, which could cause a cache miss when trying to access a member of Baz at runtime (very expensive).
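One possible way to manage that heap allocation, following the Bar and Baz example from earlier, is an owning `std::unique_ptr`. This is a sketch under assumptions, not a description of the code base discussed later: the contents of Baz, the `baz_value` accessor, and the choice of `std::unique_ptr` are all illustrative. The file split is again simulated with comment banners.

```cpp
#include <cassert>
#include <memory>

// ---- Bar.h (sketch) ----
class Baz;  // still only a forward declaration

class Bar {
public:
    Bar();
    ~Bar();  // declared here, defined where Baz is complete
    int baz_value() const;

private:
    std::unique_ptr<Baz> b_;  // owning pointer: one heap allocation per Bar
};

// ---- Baz.h (hypothetical contents) ----
class Baz {
public:
    int value = 7;
};

// ---- Bar.cpp (sketch): Baz is complete from here on ----
Bar::Bar() : b_(std::make_unique<Baz>()) {}
Bar::~Bar() = default;  // destroying the unique_ptr requires the complete Baz

int Bar::baz_value() const {
    assert(b_ != nullptr);
    return b_->value;  // extra indirection: the Baz object lives elsewhere on the heap
}
```

The destructor is defined out of line on purpose: if it were generated inside the header, `std::unique_ptr` would have to delete an incomplete type and the code would not compile.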

Like any other technique, forward declarations must be used carefully.

The Modules TS may present another safer alternative to improving build times by removing the need for the preprocessor to paste in entire headers over and over again for different translation units.

Real world results with forward declarations

I was recently tasked with the onerous job of “improve the build” for a mid-sized code base.

How did I decide to start with forward declarations?

For starters, forward declarations are low-hanging fruit as far as improving build time goes. It’s much easier to routinely go through the code adding forward declarations than it is to change an interface, pull files out into new libraries, or build façades. In short, they are a lazy programmer’s best friend.

Also, having had some experience with the code base, I knew that the code had:

many classes that only used pointers to our types

many unnecessary includes (for historical reasons, laziness, or naiveté…)

plenty of automated unit tests to ensure I didn’t accidentally break something

I also had a bit of a hint that our header dependencies were a little bloated when I found that Visual Studio’s built-in dependency graph generator consistently crashed when I tried to run it on our code base. Still, I didn’t have any real proof that forward declarations would actually improve anything at all. Just an intuition. So we decided to be Agile about it.

I took the top 5-10 headers that were most often included, and I made it a challenge to replace them with forward declarations wherever I could. If doing this improved things at all, then I could go ahead and take a more comprehensive approach.

What we found was that we could do a full rebuild of our C++ 10% faster! Along the way I gained even more confidence that a more comprehensive approach would yield additional gains.

What you’ll discover after your initial go at things

Before I continue, here are some of the pain points you’ll discover when you want to set about replacing your headers with forward declarations:

other random files will start breaking from missing includes (that they used to indirectly have).

This is frustrating, but on a positive note, it forces your codebase to follow a best practice — a translation unit should be self-contained. Never should you rely on an indirect include because future refactoring efforts will needlessly break your code and cause headaches for other programmers.

Because you will likely have to fix unrelated code…

work like this ends up touching way more files than you initially thought.

This is a bit of a nightmare for your code reviewers. Everyone groans when they see hundreds of changed files in a code review. The solution to this is to communicate to your reviewers what’s going on ahead of time; they don’t need to look at every single file in the review. Perhaps a random sampling, or just some of the more important headers.

Attacking even more code

After I showed the team our 10% speedup on rebuild, they were as hungry as I was to see more. I got the go-ahead to spend a week touching as many headers as I could for forward declaration work. At the end of it all, I had gone through perhaps two-thirds of all header files. The result? An additional 30% faster compile time for a total of 40% faster C++ compile times! (Plus, Visual Studio stopped crashing when generating the dependency graph.)

This result was quite surprising; I had thought we would already start to see diminishing returns after the first go. I must admit that I didn’t have the luxury of being purely scientific: I was also removing unnecessary headers along the way, but I will assert that the work was predominantly forward declarations.

Conclusions

Best practices have their flip-sides. The Google style guide (and a few of my coworkers) made some good points against the usage of forward declarations, but the real world results of the technique are undeniable. All the developers are happier; the build => test => run cycle is faster for them. The automated builds are faster. The compiler’s memory usage is down.

The point about memory usage becomes more important for large parallel builds. In fact, we were occasionally running out of memory, and this work has abated those issues (for the time being; forward declarations are really just a band-aid on an architectural issue).

Time is money, and in a larger project with many well-paid people, saving even a small amount of time has an economy-of-scale effect: probably thousands or hundreds of thousands of dollars saved in development time.