Introducing Rust in an Enterprise Environment

Discovery

Sometime in 2013, I stumbled across a new programming language called Rust on the internet. Taking a look at the language, I was impressed by its high-level features. At that time I was a backend Scala developer with a .NET background. When examining Rust, I found most of the features I used every day, like pattern matching, the “newtype pattern” and a Scala-like Iterator API. It also offered something I had really been missing: no nulls and no exceptions. Since Rust was also a low-level language without a garbage collector, I was convinced that I should keep following its progress.
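To make these features a bit more concrete, here is a minimal sketch using only the standard library. All names in it (CustomerId, LookupError, find_name and so on) are hypothetical and chosen purely for illustration; none of them come from an actual Zalando code base.

```rust
use std::collections::HashMap;

// Newtype pattern: a distinct type wrapping a plain u32, so a customer id
// cannot be mixed up with any other number. (Hypothetical example type.)
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct CustomerId(u32);

// Hypothetical error type: every way a lookup can fail is listed explicitly.
#[derive(Debug, PartialEq)]
enum LookupError {
    InvalidId(String),
    NotFound(CustomerId),
}

// No exceptions: a failed parse is part of the return type.
fn parse_id(raw: &str) -> Result<CustomerId, LookupError> {
    raw.trim()
        .parse::<u32>()
        .map(CustomerId)
        .map_err(|_| LookupError::InvalidId(raw.to_string()))
}

// No nulls: a missing entry is an Option, converted into an error here.
fn find_name(db: &HashMap<CustomerId, String>, raw: &str) -> Result<String, LookupError> {
    let id = parse_id(raw)?;
    db.get(&id).cloned().ok_or(LookupError::NotFound(id))
}

// Pattern matching: the compiler checks that every case is handled.
fn describe(result: &Result<String, LookupError>) -> String {
    match result {
        Ok(name) => format!("found {}", name),
        Err(LookupError::InvalidId(raw)) => format!("'{}' is not a valid id", raw),
        Err(LookupError::NotFound(CustomerId(n))) => format!("no customer with id {}", n),
    }
}

// Iterator API: keep only the successful lookups and upper-case the names.
fn found_names(db: &HashMap<CustomerId, String>, inputs: &[&str]) -> Vec<String> {
    inputs
        .iter()
        .filter_map(|&raw| find_name(db, raw).ok())
        .map(|name| name.to_uppercase())
        .collect()
}

// Small in-memory stand-in for a real data store.
fn sample_db() -> HashMap<CustomerId, String> {
    let mut db = HashMap::new();
    db.insert(CustomerId(1), "Ada".to_string());
    db.insert(CustomerId(2), "Grace".to_string());
    db
}

fn main() {
    let db = sample_db();
    for raw in ["1", "7", "abc"] {
        println!("{}", describe(&find_name(&db, raw)));
    }
    println!("{:?}", found_names(&db, &["2", "x", "1"]));
}
```

In this sketch a missing customer or a malformed id can only reach the caller as a value of type Result, so the compiler forces both cases to be handled where the result is used.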

Early Prototyping

In 2016 I joined Zalando as a Scala developer. After half a year we were thinking about introducing a new application for a simple task. Somehow the question of which technology to use came up, and Rust was suddenly mentioned. We quickly built a prototype, and implementing it was quite easy. It also turned out that implementing a domain model was painless, especially regarding serialization, thanks to Rust’s high-level abstractions. In the end we no longer needed the application, but Rust had nevertheless proved to be a valid candidate for solving our problems.

The Experiment

A short time later, we had some problems with our main service, a Scala web service that sits at a critical position within Zalando. Under high load, the application consumed large amounts of memory and sometimes even crashed, running out of memory with the GC at almost 100% CPU load. This forced us to massively overscale the application. So we asked ourselves what would happen if we rewrote the application in Rust. We did just that, and it took only a few days to reimplement it. Load tests revealed that the Rust application had much better latencies and consumed less memory and less CPU than the Scala application under the same load. Even better, it could handle more load without crashing. Of course, it is always easier to rewrite an application than to write it from scratch.

We added some more features and then considered taking it live. This was where we faced the first challenge. Our lead reminded us that Rust was not an “official” technology within Zalando at the time and that taking the application live would be a serious risk. That was of course correct. Our lead asked us to collect the requirements for safely taking such an application live.

Afterwards, we approached Zalando’s Technologists Guild and presented our results during a Tech Stand Up. Together with the Technologists Guild, we came to the conclusion that Rust should stay in the “Assess” state on our Tech Radar until we had gathered more experience. We also collected requirements for deploying a Rust application, but unfortunately things came to a halt since we had to focus on other topics.

What we did do in the meantime was start implementing some tooling in Rust.

Justifying Rust

In the middle of 2017 we needed to implement a new service. By that time we already had a Rust Study Group running and the Rust ecosystem had evolved further. Since we knew that we couldn’t just start a service in Rust on our own, we asked our lead whether we could do it with Rust. It was a simple streaming application doing some REST calls and writing data to Redis.

Again, he had serious concerns. We would need really good reasons to use Rust over Scala, which was still our main technology stack. He also doubted whether the tooling was ready for production use, and the question of how to onboard new team members to such a technology would have to be answered too. There were of course more questions, and the bar was high, but that was completely understandable from a lead’s perspective.

In the following weeks, we collected reasons for using Rust. We started to analyze the problems we had with our current applications and figured out how those problems could be avoided with Rust. Of course there was also the performance argument, but that was definitely not the most important reason. The main reasons were Rust’s safety and productivity features. But there was one more thing: with Rust we were able to use resources efficiently, and there was already a plan to move to Kubernetes. Being able to run small pods on Kubernetes could be a real cost saver.

There was a lot of communication with our lead, and we got valuable feedback on the topics where we needed a bit more reasoning. Still, things were moving slowly and the end of the year was near. At that point we had serious doubts that we would ever use Rust for production systems.

When Things Become Real

At the end of 2017 it was announced that the teams would be restructured due to changing requirements. We were a team of six developers and would be reduced to four. Along with this announcement came another revelation: our lead said that from now on we would be a “Rust Team”. That was really unexpected, and I have to admit that I did not really know how to respond.

Since we were planning to replace our old system with a new one, we almost immediately started to implement the first service we needed. It was a rather simple CRUD service, which was a good opportunity to onboard some of the team members to Rust. The service was ready to be used more quickly than expected, even though it was not yet fully finished. Since we needed more applications to reach our goal, we started to implement the smallest applications in parallel, gradually increasing the difficulty level for the team up to the final service, which fully utilizes non-blocking IO.

In the end we managed to reach our goal in time while introducing a new technology along the way. Currently we have two REST services, a streaming application and multiple batch applications written in Rust, all running on Kubernetes. The new applications have been live, serving data for two countries throughout 2018, and are expected to serve even more countries in the near future. The resource usage of our applications is far below that of our former Scala services, which reduces costs remarkably.

Conclusion

With Rust, one can build microservices taking the word “micro” literally. Rust gives the developer an “if it compiles, it runs” experience, which allows them to focus on business logic. Refactoring and even reengineering can be done quite fearlessly. The compiler is very helpful and even suggests solutions. A newcomer coming from Scala or C# already knows concepts like closures and the Iterator API, which makes things a lot easier. And then there is the borrow checker. Given enough support, newcomers can learn to handle it while still being productive. But one still has to be a bit resistant to pain when it comes to compile times and the lack of an easy-to-use “corporate” version of crates.io. When starting a project, it is beneficial to have an experienced Rust developer on board rather than starting from scratch. We are still waiting for futures and async/await to be stabilized and for the web ecosystem to become more mature, since it is currently a challenge to choose an appropriate web framework or toolkit.

For us, Rust has so far been a story of success and it is likely that it will stay like this.