This blog post is a slightly edited version of the live tran­script of the talk I gave at Rust­Fest 2019 in Barcelona on Novem­ber 10th, 2019. As it’s a tran­script some parts of it are a bit repet­i­tive or badly word­ed, but I hope the message behind the talk will be conveyed by this post anyway.

The original transcript was provided by the RustFest organizers, and it’s released under the Creative Commons Attribution-ShareAlike 4.0 International License.

You can also down­load the slides or watch the record­ing of the talk.

Hi, every­one. In this talk I am going to shed a bit of light on how the Rust release process works and why we do it that way. As they said, I’m Pietro, a member of the Rust Release team and the co-lead of the Infrastruc­ture Team. I’m actu­ally a member of other teams: I do a lot of stuff in the project.

I think everyone is aware by now that we got a release out a few days ago, with some features everyone had been awaiting for a long time, but that’s not the only release. Six weeks earlier, in September, we released 1.38, which changed a hundred thousand lines of code. Users reported just 5 regressions after the release came out, and only two of them broke valid code: the other ones were just performance regressions or worse error messages.

Six weeks before that, there was another release, 1.37, which changed tens of thousands of lines of code, and we got just 3 regressions reported. Unfortunately all of them broke valid code, but it’s a very small number. Even before that, we got 1.36 out in July with just 4 regressions reported. I wanted to explain a little bit why we do releases this fast, which creates a lot of problems for us, and how we prevent regressions so that very few are reported after the release is out.

So why do we have this sched­ule? The ques­tion is inter­est­ing because it’s really unusual in the compiler world. I collected some stats on some of the most popu­lar languages. While there are some efforts to shorten the release cycles (Python recently announced that they are going to switch to a yearly sched­ule), Rust is the only compiler except for browsers that’s sort of popu­lar and has a six-week release cycle. In the compiler world that’s pretty fast, but there is a simple reason why we do that: we have no pres­sure to ship things.

If a feature is not ready, we have issues, we can just delay it by a few weeks, and nobody is going to care if it’s going to get stabilised today or in a month and a half. And we actu­ally do that a lot. The most obvi­ous exam­ple is a few weeks ago, when we decided that async/await wasn’t ready enough to be shipped into Rust 1.38, because it turns out it wasn’t actu­ally stabilised when the beta freeze happened and there were block­ing issues so we would have to rush the feature and back­port the stabi­liza­tion, and that’s some­thing that we would not love to do.

We actu­ally tried long release cycles, espe­cially with the edition, and it turns out they don’t work for us. The stable edition came out in early Decem­ber and in Septem­ber we still had ques­tions on how to make the module system work. We had a proposal in early Septem­ber which was not imple­mented yet, and that’s what actu­ally was released, but the proposal had no time to bake on night­ly, users didn’t have much time to test it. It broke a lot of our inter­nal processes.

We actually did something I’m still not comfortable with: we landed a change in the behaviour of the module system directly on the beta branch, two weeks before the stable release came out. If we had made a mistake there we would have had no way to roll it back before the next edition, and we don’t even know if we are going to do a 2021 edition yet. This PR broke almost all the policies we have, but we had to do it, otherwise we would not have been able to ship a working edition, and thankfully it ended great.

The 2018 edition works. I’m not aware of any huge mistakes we made, but if we had made them it would’ve been really bad, because we would have had to wait a long time to fix them and we would be stuck in the 2018 edition with a broken feature set for backward compatibility reasons.

So with such fast release cycles, how can we actually prevent regressions from reaching the stable channel? Of course, the first answer is the compiler’s test suite, because rustc has a lot of tests. We have thousands of them that test both the compiled output and the error messages, and the tests run a lot: we have 60 CI builders that run for each PR, taking three to four hours. So we actually do a lot of testing, but that’s not enough, because a test suite can’t really express everything that the Rust language can do.

So we use the compiler to build the compiler itself: for every release we use the previous one to build the compiler. On nightly we use beta, on beta we use stable and on stable we use the previous stable. That allows us to catch some corner cases, as the compiler codebase uses a lot of unstable features and it’s also a bit old, so there are a lot of quirks in it. But still, that can’t actually catch everything.
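The bootstrap chain described above can be sketched as a small lookup. This is only an illustration of the rule as stated in the talk, not code from the rustc build system: the `Channel` and `Bootstrap` types and the `bootstrap_compiler` function are hypothetical names.

```rust
/// Release channel of the compiler being built.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Channel {
    Nightly,
    Beta,
    Stable,
}

/// Compiler used to build a given channel (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Bootstrap {
    Beta,
    Stable,
    PreviousStable,
}

/// Maps each channel to the compiler that builds it, following the talk:
/// nightly is built with beta, beta with stable, stable with the previous stable.
fn bootstrap_compiler(target: Channel) -> Bootstrap {
    match target {
        Channel::Nightly => Bootstrap::Beta,
        Channel::Beta => Bootstrap::Stable,
        Channel::Stable => Bootstrap::PreviousStable,
    }
}

fn main() {
    for channel in [Channel::Nightly, Channel::Beta, Channel::Stable] {
        println!("{:?} is built with {:?}", channel, bootstrap_compiler(channel));
    }
}
```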

We get bug reports from you all. We get them mostly from night­ly, not from beta, because people don’t actu­ally use beta. Asking our users to test beta is some­thing we can’t really do: with such a fast cycle you don’t have time to test every­thing manu­ally with the new compiler every six weeks. Languages with long release cycles can afford to say “Hey, test the new beta release”, but we can’t, and even when we ask, people don’t really do that.

So we had an idea. Why don’t we test our users’ code ourselves? This is an idea that seems really bad and seems to waste a lot of money but it actu­ally works and it’s Crater.

Crater is a project that Brian Anderson started and that I now maintain. It runs experiments on all the source code available on crates.io and all the Rust repositories on GitHub that have a Cargo.lock, so if you create a “Hello World” repo on GitHub, or an Advent of Code solutions repository, that’s actually tested for every release to catch regressions.

For each project we run cargo test two times, once with stable and once with beta, and if cargo test passes on stable but fails on beta then that’s a regression, and we get a nice colourful report we can inspect.
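The comparison between the two runs can be sketched roughly as follows. This is a minimal illustration of the idea, not Crater’s actual code: the `TestResult` and `Comparison` types, the `compare` and `regressions` functions, and the crate names are all hypothetical.

```rust
/// Outcome of running `cargo test` for one crate on one toolchain.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TestResult {
    Pass,
    Fail,
}

/// How a crate's result changed between stable and beta (hypothetical categories).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Comparison {
    Regressed,
    Fixed,
    SamePass,
    SameFail,
}

/// A crate regressed only if it passed on stable and fails on beta.
fn compare(stable: TestResult, beta: TestResult) -> Comparison {
    match (stable, beta) {
        (TestResult::Pass, TestResult::Fail) => Comparison::Regressed,
        (TestResult::Fail, TestResult::Pass) => Comparison::Fixed,
        (TestResult::Pass, TestResult::Pass) => Comparison::SamePass,
        (TestResult::Fail, TestResult::Fail) => Comparison::SameFail,
    }
}

/// Returns the names of the crates that regressed, in input order.
/// Each entry is (crate name, result on stable, result on beta).
fn regressions<'a>(results: &[(&'a str, TestResult, TestResult)]) -> Vec<&'a str> {
    results
        .iter()
        .filter(|(_, stable, beta)| compare(*stable, *beta) == Comparison::Regressed)
        .map(|(name, _, _)| *name)
        .collect()
}

fn main() {
    // Made-up results, for illustration only.
    let results = [
        ("crate-a", TestResult::Pass, TestResult::Pass),
        ("crate-b", TestResult::Pass, TestResult::Fail),
        ("crate-c", TestResult::Fail, TestResult::Fail),
    ];
    for name in regressions(&results) {
        println!("regressed: {}", name);
    }
}
```

Crates that already fail on stable are not counted: they tell us nothing about the new release, which is why only the pass-then-fail case is treated as a regression here.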

This is the actual report for 1.39 and we got just 46 crates that failed and those are regres­sions nobody reported before. The Release Team manu­ally goes through each (I hope we didn’t break any of yours), manu­ally checks the log and then files issues. The Compiler Team looks at the issues, fixes them and ships the fix to you all.

1.39 went pretty well. This is 1.38 and we had 600 crates that were broken, so if we didn’t have Crater there is a good chance your project would­n’t compile anymore when you updat­ed, and this would break the trust you have in the stable channel.

We know it’s not perfect. We don’t test any kind of private code, because of course we don’t have access to your source code. We also only test crates.io and GitHub, and not other repository hosts such as GitLab, mostly because nobody has got around to writing scrapers yet. Also, not every crate can be built in a sandboxed environment (of course we have a sandbox: we can’t just run any code without protection, because it turns out people on the Internet are bad).

Crater is not really some­thing we can scale forever in the long term because it uses a lot of compute resources already, which thank­fully are spon­sored, but if the usage of Rust skyrock­ets we are going to reach a point where it’s not econom­i­cally feasi­ble to run Crater in a timely fash­ion anymore.

Those are real problems, but for now it works great. It allows us to catch tens of regressions that often affect hundreds of crates, and it’s the real reason why we can afford to make such fast releases. This is my personal opinion, but I know it’s shared by other members of the Release Team: I wouldn’t be comfortable making releases every six weeks without Crater, because they would be so buggy I wouldn’t use them myself.

So to recap, the fast release cycles that we have allow the team not to burn out and to simply ignore dead­li­nes, and that’s great espe­cially for a commu­nity of mostly volun­teers. And Crater is the real reason why we can afford to do that. It’s a tool that wastes a lot of money but actu­ally gets us great results.

I’m going to be around the conference today, so if you have any questions, or you want to implement support for other open source repositories, reach out to me. I’m happy to talk to you all. Thanks!

Questions from the audience

You were hint­ing that maybe the edition idea wasn’t such a success for us. Would you think that jeop­ar­dises a possi­ble 2021 edition of the language?

The main issue wasn’t really the edition itself; it was that we started working on it really late, so we went way over time with implementing the features. This is my personal opinion, not the official opinion of the project, but if we make another edition I want explicit phases, with a date after which we won’t accept any more changes, and I want us to actually enforce that, because we nearly burnt out most of the team with the edition. There were people who spent months just fixing regressions and bugs, and that’s not sustainable, especially because most of the contributors to the compiler are volunteers.

For private repositories, of course, you cannot run Crater. How could somebody who has a private repository, a private crate setup, run Crater, or is that possible now?

Some­one could just test on beta and create bug reports if they fail to compile. We have some ideas on how to create a Crater for enter­prises but it’s just a plan, an idea, and at the moment we don’t have enough devel­op­ment resources to actu­ally do the imple­men­ta­tion, test and docu­men­ta­tion work that such a project would require.

A lot of crates have pecu­liar require­ments about their envi­ron­ments. Can you talk about how Crater handles that and specif­i­cally is it possi­ble to customise the envi­ron­ment in which my crates are built on Crater?

So the envi­ron­ment does­n’t have any kind of network access for obvi­ous secu­rity reasons, so you can’t install the depen­den­cies your­self but the build envi­ron­ment runs inside Dock­er. We have these big Docker images, 4GB, which have most of the system depen­den­cies used in the ecosys­tem installed. You can easily check whether your crate works or not with docs.rs: since recently it uses the same build code as Crater, so if it builds on docs.rs it builds on Crater as well. And if it does­n’t build, you can file an issue, prob­a­bly the docs.rs issue tracker is the best place, and if there are Ubuntu 18.04 pack­ages avail­able we are just going to install them on the build envi­ron­ment, and then your pack­age will work.
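As a rough sketch of what running a build inside a container without network access can look like, the function below assembles the arguments for a `docker run` call that uses Docker’s `--network none` option. This is not Crater’s actual invocation: the `sandboxed_docker_args` function and the `example/build-env` image name are hypothetical, and a real sandbox would need more restrictions than this.

```rust
/// Builds the arguments for a `docker run` invocation with networking disabled.
/// `image` is the container image to use and `command` is what runs inside it.
fn sandboxed_docker_args(image: &str, command: &[&str]) -> Vec<String> {
    let mut args = vec![
        "run".to_string(),
        // Remove the container once the command finishes.
        "--rm".to_string(),
        // Disable networking inside the container.
        "--network".to_string(),
        "none".to_string(),
        image.to_string(),
    ];
    // Everything after the image name is the command executed in the container.
    args.extend(command.iter().map(|part| part.to_string()));
    args
}

fn main() {
    // "example/build-env" is a placeholder image name, not a real image.
    let args = sandboxed_docker_args("example/build-env", &["cargo", "test"]);
    println!("docker {}", args.join(" "));
}
```

Because the container has no network, every system dependency has to be present in the image already, which is why the images end up being so large.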

How long does it take to run Crater on all of the crates?

Okay, so that actually varies a lot, because we are making constant changes to the build environment, the virtual machines and such. I think at the moment running cargo test on the entire ecosystem takes a week, and running cargo check, which we do for some PRs, takes three days. If there is a pull request that we know is risky and could break code, we usually run Crater beforehand just on that, and in those cases we usually use cargo check because it is faster. The times really vary, mostly because we make a lot of changes to the virtual machines.

Is it possi­ble to supply the Crater run with more runners to speed up the process?

I think we could. At the moment we are in a sweet spot: we have enough experiments that we fill up the servers, we don’t have any idle time, and the queue is not that long. If we had more servers, the end result would be that the servers sit idle for a good part of the time, so we would just be wasting resources. We actually have more sponsorship offers from corporations, so if we reach a point where the queue is not sustainable anymore we are going to get agents from them before asking the community. Also, Crater is really heavy on resources: at the moment I think we have 24 cores, 48GB of RAM and 4 terabytes of disk space, so it’s not something where you can throw in some small virtual machine and get meaningful results out of it.