Bazel is the open source variant of Google’s internal Blaze build system. It aims to provide fast and deterministic build/test, scaling to extremely large code bases with billions of lines of code, for example Google’s monorepo. Beyond the basics of performance and correctness, some examples of the advanced features that Bazel brings to the table include:

Build and test sandboxing. By taking advantage of Linux namespaces to isolate build and test execution (sandbox-exec on Mac OS), the build system can ensure that there are no missing dependencies. This is also known as build/test hermeticity.

Remote caching and build/test execution.

A safe, parallelizable, Python-like language for extensions (Skylark).

Parallel execution of many instances of the same test for deflaking.

IDE and tooling integrations, for example aspects and the event protocol.

As a Googler, I’ve experienced first hand the benefits of working with Blaze and find the internal build experience to be close to magic; it works seamlessly and allows the developer to focus on their core mission of developing great software, mostly without the hassles and complexity that I’ve encountered in other build systems such as GNU Make, SCons, Xcode, CMake and autoconf/automake.

In the OSS Envoy project, we on-boarded Bazel in the Spring of 2017, converting the project to Bazel from its extant CMake based build system. There were a number of motivating factors behind this shift, including providing a scalable build/test environment that would grow with the project. Embracing Bazel also had the added benefit of providing better integration with Google’s internal build infrastructure.

A key issue in Bazel, which does not exist in Blaze, is the management of external dependencies. A monorepo by definition provides complete encapsulation of all dependencies in a single repository and build system. Third party dependencies must be converted to Blaze in order to build. In the OSS world, projects typically depend on other project repositories, together with their build artifacts, via a combination of implicit and explicit external dependencies. These are often captured in requirements manifests, git submodules, setup scripts, README.md instructions, vendoring, forking, Docker images, etc. A project written in Bazel might depend on 5 other projects, which in turn might each depend on a number of other projects, each using a completely different non-Bazel canonical build system.

While converting Envoy to Bazel, we evaluated and implemented a number of strategies for managing our external dependencies. These dependencies were mostly C and C++ and had a mixture of upstream supported build system types (autoconf/automake, CMake, shell scripts, Bazel). While there is official Bazel documentation for managing external dependencies, it is language independent and does not deal in depth with the specific challenges that exist for C/C++ dependency management. It also does not provide comprehensive guidance for the common situation in which dependencies do not have a native Bazel build system. We hope that the experiences documented below provide for a useful account of how a medium sized C++ OSS project can deal in practice with the external dependency management problem in Bazel.

Envoy’s external dependencies

Envoy’s many and varied external dependencies are shown below, together with their upstream build systems:

The first thing to notice about this picture is that it’s actually quite confusing, with the many varied build systems of our external dependencies and the fact that some projects have multiple supported upstream build systems.

For dependencies that have native upstream Bazel support, there is standard Bazel support for inclusion in the Envoy build, e.g. via the git_repository rule (see also “depending on other Bazel projects” in the official docs). A target //:bar in an external repository:

git_repository(
    name = "foo",
    remote = "https://github.com/something/foo.git",
    commit = "4374be38e9a75ff5957c3922adb155d32086fe14",
)

can be referred to as @foo//:bar inside Envoy targets. If all our external dependencies had native Bazel support, this article would be short and end here.
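As a concrete sketch, a consuming target inside the Envoy repository might reference the external target like this (the cc_library name and source files here are hypothetical, purely for illustration):

```python
# BUILD file inside the consuming repository (target names are hypothetical).
cc_library(
    name = "my_lib",
    srcs = ["my_lib.cc"],
    hdrs = ["my_lib.h"],
    # References the target //:bar in the external repository @foo.
    deps = ["@foo//:bar"],
)
```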

Unfortunately, the reality is that we depend on other projects that have a mixture of build systems. Below we discuss a number of approaches to integrating non-Bazel external dependencies. Our fitness function when evaluating each strategy is the following:

We need to be able to support scaling the Envoy build to new external dependencies without requiring Envoy developers to become Bazel experts. That is, avoid gating new external dependencies on the few Bazel experts that exist in the Envoy developer community.

We need a maintainable solution that requires minimal touch after initial introduction of an external dependency, even when we bump the version of an external dep.

We need to be able to trust the build artifacts, which includes the ability to execute tests on an external dependency build artifact or know that the external dependency’s standard test suite has been run against the build inputs we use.

TL;DR: Envoy developers want to focus on the high value tasks they are engaged in, e.g. improving load balancing, not becoming build system experts. This is a general concern in OSS, where, unlike large corporate environments, there often does not exist a dedicated build team.

Adding a Bazel build system for the external dependency

The standard recommendation for non-Bazel dependencies is to rewrite the dependency’s build system in Bazel, i.e. supply one or more BUILD files for the dependency. This might range from a simple exercise (see the Bazel trivials in the diagram above) to a complete overhaul of the build system for larger dependencies. It also implies not just supplying BUILD files for direct dependencies but all the non-Bazel dependencies in the transitive dependency set.

There are two ways to introduce Bazel rules for an external dependency:

1. Upstreaming support for the dependency, i.e. placing BUILD files in the upstream project’s repository. This may be successful for supportive projects, but might also not be practical. For example, the upstream project might want to have only a single supported build system that remains non-Bazel, e.g. CMake. Hence, it might not agree to maintain the Bazel BUILD files or to gate its CI on their correctness. While Bazel is awesome sauce to its community base, it is a fringe build system today for C++, a perception that will likely evolve over the coming years.

2. Injecting BUILD files into an external dependency during import, using rules such as new_git_repository. This strategy relies on Envoy developers writing and maintaining a parallel Bazel build system for all the transitive dependencies where missing.

Writing a BUILD file often involves squinting at what the existing non-Bazel build system does to learn how it constructs flags to invoke the compiler, which files need to be copied into deliverables, where the dependencies live, etc. This isn’t hard for a header-only library, but would not scale to a dependency the size of Chromium, as an extreme example.

An additional concern with maintaining a non-canonical build is that it’s hard to know if the resulting artifacts are correct, in particular as the canonical build system changes. Bazel’s requirements for test hermeticity are much stricter than those of most other build systems, so porting not only the build but also the test targets can become a significant engineering challenge; this took us ~1 month for Envoy alone. Testing can be critically important in practice, as even slight changes to the build flags can result in surprise bugs. A great example is a Protobuf issue that we discovered when mismatched flags were present in the build.

Google in fact opts for something close to solution (2) above when importing into its monorepo. Anyone involved in introducing a third party dependency to the monorepo must become intimately familiar with the canonical build system of the project and Blaze to be able to make this work.

For OSS Envoy, we made a pragmatic tradeoff. We did not want to become Bazel maintainers for all our external dependencies and didn’t want to block new features on putting together a Bazel build system for the respective external deps. Those dependencies that could be trivially converted to Bazel, e.g. a header only library such as rapidjson, have BUILD files maintained in the Envoy repository and injected via new_git_repository rules. We required a better approach for more complex dependencies, such as c-ares or LuaJIT.
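As a sketch of what such a trivially injected BUILD file can look like, a header-only library along the lines of rapidjson needs little more than a cc_library exporting its headers. The paths below are assumptions for illustration, not necessarily the library’s actual layout:

```python
# BUILD file injected via new_git_repository (paths are illustrative).
cc_library(
    name = "rapidjson",
    hdrs = glob(["include/rapidjson/**/*.h"]),
    # Lets dependents write #include "rapidjson/document.h".
    includes = ["include"],
    visibility = ["//visibility:public"],
)
```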

Wrapping the canonical build system in Bazel genrules

Bazel provides genrules for describing an arbitrary shell action. With some work, the build for an external dependency can be wrapped with a genrule. A simplified example is:

FOO_GENRULE_BUILD = """
genrule(
    name = "bar_genrule",
    srcs = glob(["src/**.cc", "src/**.h"]),
    outs = ["bar.a", "bar_0.h", "bar_1.h", ..., "bar_N.h"],
    cmd = "./configure && make",
)

cc_library(
    name = "bar",
    srcs = ["bar.a"],
    hdrs = ["bar_0.h", "bar_1.h", ..., "bar_N.h"],
)
"""

new_git_repository(
    name = "foo",
    remote = "https://github.com/something/foo.git",
    commit = "4374be38e9a75ff5957c3922adb155d32086fe14",
    build_file_content = FOO_GENRULE_BUILD,
)

It’s possible to write a Skylark framework for this to reduce boilerplate, allowing the Envoy developer to specify a shell script that encapsulates the entirety of the build. A complete early example of this for Envoy is provided in #716.
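A minimal sketch of such a framework is the macro below. The names and parameters are hypothetical and not the actual #716 implementation: the developer supplies a build command plus the expected artifacts, and the macro emits the genrule and the wrapping cc_library:

```python
# foo.bzl -- hypothetical boilerplate-reducing macro, not Envoy's actual code.
def external_cc_library(name, build_cmd, lib, hdrs):
    """Wraps a foreign build system invocation in a genrule + cc_library."""
    native.genrule(
        name = name + "_build",
        # All files of the external repository act as build inputs.
        srcs = native.glob(["**"]),
        outs = [lib] + hdrs,
        cmd = build_cmd,
    )
    native.cc_library(
        name = name,
        srcs = [lib],
        hdrs = hdrs,
    )
```

A dependency’s BUILD file then reduces to a single call, e.g. `external_cc_library(name = "bar", build_cmd = "./configure && make", lib = "bar.a", hdrs = ["bar.h"])`.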

This works well and can even be hermetic, provided all the dependencies needed to invoke the underlying CMake or autoconf/automake build can be captured in Bazel targets as inputs. We know we are using a particular version of a dependency and the exact same build setup that is used upstream in its CI for that version, so we don’t need to build and re-execute tests.

In practice, this approach works extremely well for an OSS developer flow, since it decouples the Bazel hackery from the work needed to describe a new dependency, which is typically as simple as writing a shell script. This abstracts Bazel away from developers, allowing them to focus on their own goals rather than the build system’s.

There are some drawbacks to the genrule approach. The most significant stem from the fact that we now have two or more build systems in play, and propagating the build and configure flags needed for cross-compilation is extra work. Similarly, flags for build configurations such as TSAN require extra plumbing for the non-Bazel-built dependencies, since Bazel does not automatically set CFLAGS or CXXFLAGS to match its internal copts. Under pure Bazel builds, adding an additional compiler or linker option globally across the build is much simpler.
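One conceivable way to do that extra plumbing, sketched here with assumed target names rather than Envoy’s actual rules, is to key the genrule command off a config_setting so the wrapped build receives matching sanitizer flags:

```python
# Hypothetical sketch: forwarding a TSAN configuration into a wrapped build,
# selected via `bazel build --define tsan=1 ...`.
config_setting(
    name = "tsan_build",
    values = {"define": "tsan=1"},
)

genrule(
    name = "bar_genrule",
    srcs = glob(["src/**"]),
    outs = ["bar.a"],
    # The wrapped build runs outside Bazel's C++ toolchain, so flags that
    # Bazel would normally inject must be passed by hand.
    cmd = select({
        ":tsan_build": "CFLAGS='-fsanitize=thread' CXXFLAGS='-fsanitize=thread' ./configure && make",
        "//conditions:default": "./configure && make",
    }),
)
```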

Another notable build performance concern is that there are now multiple independently coordinating job servers. Any build system is responsible for scheduling the execution of various jobs, such as compiler or linker invocations, aiming to maximize concurrency without overwhelming system resources with the number of simultaneous jobs. With genrule wrapping, we have Bazel’s job scheduler plus N make job servers if we have N external make dependencies. In #716, we avoided the problem of N make job servers by placing all N external dependencies under a single top-level genrule invocation of make and then recursively invoking the build system of each external dependency from the top-level make. This allowed a single top-level make job server to coordinate all external dependencies, reducing the number of job servers to two (Bazel and top-level make). Even so, two job servers is one too many for optimal resource scheduling.

Wrapping the canonical build system in Bazel repository rules

A drawback of the genrule approach is that all output artifacts, including header files, need to be explicitly enumerated. This becomes a pain point when this list becomes large and has churn between versions. To mitigate this, in #716, we provided an offline tool to automatically determine the generated files for a dependency and then create respective BUILD files for each external dependency. The generator tool would then be run as a separate step each time an external dependency was modified.

An optimization on this genrule wrapper technique is to use Bazel repository_rules instead of genrule. This allows the build of the external dependencies to happen during the Bazel loading phase. A Bazel build has three phases; understanding these turns out to be useful below:

1. Loading: this is where the BUILD and .bzl files are loaded. Remote repositories, e.g. as specified in git_repository, are fetched in this phase. At the end of this phase, the source filesystem state becomes the immutable input to any rules or actions in the following phases.

2. Analysis: this is where the rules, e.g. genrule, are evaluated, resulting in a set of actions that relate via an action graph.

3. Execution: actions are executed as described by the action graph. This is when Bazel will schedule and execute compilation, link and test jobs.

By building and copying files in the loading phase, there is no need to explicitly enumerate outputs from a repository rule, as they are on the filesystem by the time a cc_library target with globbed inputs is analyzed.

FOO_BUILD_SH = """
#!/bin/bash
git clone https://github.com/something/foo.git
cd foo
git checkout 4374be38e9a75ff5957c3922adb155d32086fe14
./configure
make
"""

FOO_BUILD = """
cc_library(
    name = "bar",
    srcs = ["bar.a"],
    hdrs = glob(["*.h"]),
)
"""

def _foo_repository_impl(ctxt):
    ctxt.file("foo_build.sh", content=FOO_BUILD_SH)
    ctxt.file("BUILD", content=FOO_BUILD, executable=False)
    ctxt.execute(["./foo_build.sh"])

foo_repository = repository_rule(
    implementation = _foo_repository_impl,
    environ = ["CC", "CXX", "LD_LIBRARY_PATH"],
    local = True,
    ...
)

foo_repository(
    name = "foo",
)

Repository rules (and genrules) can be made non-hermetic with local = True, trading off some of the Bazel wins from hermeticity for simplicity of dependency management. For example, if we require some_funky_dvcs during the build process, it does not have to be a Bazel dependency; we just ask users to have it installed prior to the build, and a non-hermetic network fetch can be performed with some_funky_dvcs. This makes it hard to harness advanced Bazel features such as remote execution or caching during the build of the external deps, but that is not a first class concern for almost all Envoy developers today. Non-hermetic repository rules are how the Bazel C++ toolchain is discovered on the local system under the hood, where by necessity non-hermetic actions that peek into the local environment are required.

A property of the repository_rule approach is that repository rules execute sequentially. Hence, only one job server runs at a time; the recursive make runs during Bazel’s loading phase, prior to Bazel’s parallel execution of its native build during the execution phase. This is a two-edged sword; while providing tighter control over job scheduling across cores, it can also limit parallelism, since there can be no overlap between the build of the external dependencies and the build of Envoy’s Bazel native targets.

We adopted the repository_rule based approach in #747. We were initially perhaps over-enthusiastic in our application of this technique, since we moved all external dependencies to build under this rule. Unfortunately, this reduced the amount of build parallelism and ignored the native Bazel support in some of our external dependencies, such as Protobuf, because we didn’t have an effective way to plumb native Bazel dependency artifacts into the non-Bazel build rules. This was fundamentally due to the native Bazel dependency artifacts not being available during the loading phase, when the repository rule is executed.

In #1682, thanks to some really great improvements provided by John Millikin at Stripe, we started to decouple from the single repository_rule recursive make wrapper some of those dependencies that could benefit, and have continued to refine our application of native build system wrapping via genrule and repository_rule techniques. Below, we refer to both these techniques simply as wrapping.

Prebuilts

When converting to Bazel, a hard requirement from Lyft and other consumers of Envoy was to preserve the capability of Envoy to consume all external dependencies as prebuilt libraries. There were two reasons behind this:

1. Earlier in the Envoy project, we used hosted Travis CI, where we were hitting the limits on build times. Prebuilding all external dependencies and placing them in a Docker image that could be consumed in CI runs was an effective way to reduce the CI execution time.

2. Some companies prefer to use blessed versions of external dependencies and manage their own internal distributions of these components as binaries. This might be done for security reasons, e.g. to allow for zero day patching.

The above wrapping techniques fit this approach well, since the wrapping rules generate .a and .h files as intermediaries, which can then be consumed by cc_library targets. Using a prebuilt version as a replacement for the wrapped rule outputs was as simple as populating the local source filesystem with the prebuilt artifacts at the correct locations, causing Bazel to ignore the wrapping rule.
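One way such an override can be sketched, reusing the FOO_BUILD/FOO_BUILD_SH strings from the earlier repository rule example and a hypothetical FOO_PREBUILT_DIR environment variable, is to prefer prebuilt artifacts when they are present:

```python
# Hypothetical variation on the earlier repository rule implementation: use
# prebuilt artifacts if FOO_PREBUILT_DIR is set, otherwise run the wrapped
# build. FOO_PREBUILT_DIR is an assumed convention, not an Envoy mechanism.
def _foo_repository_impl(ctxt):
    ctxt.file("BUILD", content=FOO_BUILD, executable=False)
    prebuilt_dir = ctxt.os.environ.get("FOO_PREBUILT_DIR")
    if prebuilt_dir:
        # Populate the repository with the prebuilt .a/.h artifacts; the
        # cc_library globs pick these up and no build is performed.
        ctxt.execute(["cp", "-r", prebuilt_dir + "/.", "."])
    else:
        ctxt.file("foo_build.sh", content=FOO_BUILD_SH)
        ctxt.execute(["./foo_build.sh"])
```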

A pure Bazel build does not provide this prebuilt capability cleanly. It is possible to deal with (1) above by first running bazel fetch and then in a somewhat hacky way copy out the contents of the Bazel internal cache to a Docker image for later use. We do this today in our CI, but we are aware that we are making use of undocumented Bazel behavior. This does not satisfy the concerns in (2) however.

How do other languages and build systems manage the external dependency problem?

Managed languages with architecture/environment independent build artifacts, like Java’s JAR files, can easily consume non-Bazel artifacts as binaries; see for example the maven_jar rule. The analogous C++ scenario would be consuming prebuilt library binaries from some central repository. However, to support this, the central repository would require library binaries for each OS, architecture and kernel configuration we would need to support. This is unlikely to be feasible in general.

An alternative approach to managing external dependencies with non-native Bazel builds would be to automatically translate the dependency’s build system to Bazel. There is precedent for this kind of transformation, for example Chromium’s Gyp -> Ninja translator. This is probably an under explored area in Bazel today. Bazel’s strict hermetic requirements for build and test will make general purpose translation from weaker build systems challenging.

What are the best practices for Bazel C++?

In a perfect Bazel world, all external dependencies would have upstream BUILD files. This will likely never be true in the non-monorepo OSS world, since build tooling diversity is inherent.

It’s clearly a no-brainer to maintain a parallel Bazel build system for a dependency where the BUILD files are trivial, e.g. header only libraries. It’s conceivable that a contributed BUILD file repository such as https://github.com/bazelment/trunk might provide an opportunity for community maintenance of more significant BUILD files if broadly adopted.

Today with Bazel, a project such as Envoy needs to decide if it’s more important to enable developer velocity via wrapping the native build system, or whether the advantages of a pure Bazel build system outweigh this.

A pure Bazel build system can be quite compelling. Building the entire dependency tree with consistent flags simplifies adoption of sanitizers across the entire code base as well as cross-compilation, and can even expose upstream bugs in dependencies whose upstream builds are not Bazel native. It may also simplify the inclusion of a project such as Envoy in other Bazel based projects as a dependency itself; this concern was certainly important to Envoy during earlier integration with Istio.

However, if the balance falls in favor of developer velocity, writing and maintaining the missing BUILD files for every external dependency in the transitive set will be a less reasonable approach. Wrapping or some hybrid approach becomes attractive.

Conclusion

The choice of C++ external dependency management technique is a decision that needs to be made on a per-project basis, taking into account the OSS project’s goals and developer flow. For example, if build correctness isn’t a huge concern, it might be reasonable to embrace an approach where BUILD files that do not include test capabilities are acceptable.

We have filed an issue with the Bazel project to take some of the above lessons learned about C++ external dependency management and standardize them in Bazel. Bazel’s ability to interoperate with prebuilts and external C++ components is an area that could stand to be improved.

Until the Bazel C++ community develops more experience and we have zero futz off-the-shelf support for both pure Bazel external dependencies and wrapped native builds, it is clear that there is no single best practice yet for Bazel and C++ external dependencies. We have embraced a pragmatic mixture of techniques in Envoy’s build system for external dependency management and it is likely that other similar projects will benefit from doing the same.

Update (2020–06–23)

A lot has changed in the Bazel world since the original publication of this article 2.5 years ago. The most notable from a C++ external dependency perspective is the introduction of rules_foreign_cc. These Bazel rules provide a very convenient and low futz way to consume both cmake and autoconf dependencies in a Bazel C++ project. Envoy now uses this approach for the bulk of its C++ dependencies, obviating the need for tricks like repository rules. We have had significant success with this, kudos to irengrig in the Bazel build team who worked on this. These Bazel rules should be considered the preferred option when integrating non-Bazel external dependencies, with the techniques described in the article above only used when this does not cover a use case.
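As a rough sketch of what consuming a cmake dependency with rules_foreign_cc looks like (rule and attribute names have varied across rules_foreign_cc versions, and the repository and target names below are illustrative assumptions):

```python
# BUILD sketch using rules_foreign_cc; verify attribute names against the
# version of rules_foreign_cc in use, as the API has changed over time.
load("@rules_foreign_cc//foreign_cc:defs.bzl", "cmake")

cmake(
    name = "cares",
    # A filegroup exporting all files of the fetched external repository
    # (assumed to be defined in that repository's injected BUILD file).
    lib_source = "@com_github_c_ares_c_ares//:all_srcs",
    out_static_libs = ["libcares.a"],
)
```

Dependents can then simply list `":cares"` in their `deps`, and rules_foreign_cc takes care of invoking CMake with a toolchain derived from Bazel’s C++ configuration.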

Acknowledgements: The above article was informed by joint work on Envoy’s build system with mattklein123, John Millikin, Lizan Zhou, Piotr Sikora and the Envoy developer community. Many thanks for the insightful discussions along the way. Thanks also to Josh Marantz and Trevor Schroeder for helpful feedback on earlier drafts of this article.

Disclaimer: The opinions stated here are my own, not those of my company (Google).