# Rust2018

The Rust team encouraged people to write blog posts reflecting on Rust in 2017 and proposing goals and directions for 2018. Here’s mine.

## Rust in 2017

I don’t have anything particularly insightful to say on reflection. In 2017, Firefox shipped significant Rust code. A crate that I wrote, encoding_rs, was part of what shipped in Firefox. Which was nice.

Rust is awesome. I think it’s the most important development in systems programming in maybe the last 20 years. It gets Unicode right. The string types make sense to me. Slices are cool. It’s great that the correctness of lifetime management is moved from the programmer to the compiler. (After all, the concept of lifetime is extremely relevant to C and C++, but with those languages, it’s the programmer’s responsibility.) I don’t understand why people complain about the syntax, which is closer to the C-ish mainstream than e.g. Python’s syntax and Ruby’s syntax.

## Commitment to What Already Works

I don’t have a wish list of Rust features that don’t already exist. Rather, I mainly wish that in 2018, Rust committed to certain things that already work in nightly Rust and have worked for a couple of years now.

## `simd`-Style SIMD

Back in 2015, Huon Wilson started working on SIMD support for Rust. The work stagnated and lost its author as a champion of the cause within the Rust team when he left to work on Swift.

I have used the `simd` crate in `encoding_rs` and find the `simd` crate’s way of exposing SIMD superior to the way SIMD is exposed in C. Yet, I feel that the design is under-appreciated in the Rust community, and it bothers me to see a push in the direction of matching C more closely as opposed to committing to the (in my opinion better) design that Rust already has.

I do use vendor-specific SIMD operations in `encoding_rs`, and I agree that Rust needs to expose vendor intrinsics like C does, to provide access to the full SIMD feature set of a given ISA for operations not covered by the cross-ISA operations that LLVM provides and the `simd` crate exposes. Even so, I think it’s a problem to treat vendor intrinsics as the first step of what is going to appear on non-nightly Rust. Vendor intrinsics are lower-level than the functionality provided by the `simd` crate, which superficially suggests that they make sense as the first step. However, doing the lower-level part first risks a lot of wasted effort ecosystem-wide: people will try to piece-wise re-create abstractions that LLVM already has and that the `simd` crate exposes, or, alternatively, will publish less portable code than the availability of the `simd` crate’s functionality would naturally lead to. To avoid that wasted effort and churn, and to avoid the crate ecosystem becoming more Intel-coupled than it has to be (I expect vendor-specificity to steer towards the incumbent), I think it’s important that the feature set of the `simd` crate appears on non-nightly Rust before or at the same time as vendor intrinsics.

Specifically, I think the first step of SIMD support on non-nightly Rust should include the following things that already exist in nightly Rust. Even though the list might look long, I’m not talking about vaporware but about things that already exist and work on nightly Rust.

- Lane-aware types, such as `u8x16` and `u16x8`. (As opposed to the lane-unaware `__m128i` as seen in Intel intrinsics in C.)
- Basic arithmetic trait implementations (`Add`, etc.) mapping to portable cross-ISA LLVM vector operations.
- Basic bitwise trait implementations (`BitAnd`, etc.) mapping to portable cross-ISA LLVM vector operations.
- Lane-aware boolean vector types (named `bool16ix8` in the `simd` crate but could be bikeshedded to `b16x8`) that signal via the Rust type system that for each lane all bits are either one or zero.
- Basic lane-wise comparisons, such as `eq()` and `lt()`, that map to cross-ISA LLVM vector operations and yield boolean vectors.
- Operations `any()` and `all()` that efficiently check whether any lane is true or all lanes are true, given the precondition signaled by the definition of boolean vectors (that all bits on a given lane are either one or zero). (The entire purpose of boolean vectors is to signal the precondition that allows these operations to be efficient. E.g. in the SSE2 case, the implementation would look at every 8th bit, and trusting the precondition to be true makes it OK not to actually look at the rest of the bits.)
- Safe reinterpretation of a boolean vector as an integer vector of the same lane configuration, and of an integer vector as an integer vector with different signedness of the same lane configuration.
- Operations for extracting or setting the value of a particular lane.
- Vector shuffles that take two vectors (which can be the same vector twice if a second one isn’t needed) and compile-time-constant array lane indices, and yield a vector containing the indicated lanes from the input vectors. This is not presently part of the `simd` crate but is exposed by rustc in nightly Rust. Also, unlike the other features on this list, the performance of the generated code for this operation depends heavily on the particular shuffle indices and on the quality of implementation of the compiler back end. Still, this is a very useful operation when the programmer is willing to inspect whether the compiler back end can handle the requested shuffle sensibly. Type system-wise, an array of integers that have to be compile-time constants is an oddity, but rustc already deals with it.

It’s worth pointing out that boolean vectors in particular are messy to retrofit into a design that doesn’t anticipate them, because they need to appear as return types of every operation that has the property that the operation sets all bits of each lane either to one or zero. This is a data point against an incremental approach, especially when the non-incremental design already exists and works on nightly. (Well, to be precise, works on x86, x86_64 and aarch64. `any()` and `all()` are broken on armv7+neon at the moment.)

## Rust `bool` in FFI is C `_Bool`

FFI code is in practice written with the assumption that the representation of Rust `bool` is the same as the representation of C `_Bool`. For example, various bits of FFI code in Firefox depend on this assumption. Not only is code written with this assumption, but the code works, too. And not just on nightly Rust but on non-nightly Rust, too.

In that light, it’s rather surprising that despite non-nightly Rust allowing `bool` in FFI, there hasn’t been an explicit decision that it’s guaranteed to match the representation of C `_Bool`.

I wish Rust committed to `bool` working in FFI the same as C `_Bool` (and, more to the point, pointers to `bool` working the same as pointers to `_Bool`), at least on systems that use the System V ABI or the Microsoft ABI (i.e. all actually relevant non-embedded systems). After all, breaking it isn’t practical anyway.

## Non-Nightly Benchmarking

The Rust-side features that enable `cargo bench` should work on non-nightly Rust. It’s a useful feature, and despite concerns about the inelegance of the reserved crate name `test`, the design of the feature has been de facto stable for a couple of years. It’s time to let go of the possibility of tweaking it for elegance and just let users use it on non-nightly Rust.

## Debug Info for Code Expanded from Macros

Different use cases lead to different wishes of how debug info is attached to code that is expanded from macros (whether expanded code is attributed to its source or to the macro invocation site). The default caters to one use case. For another use case, nightly has `-Z debug-macros`, which is under threat of being removed. Instead of having to make the trade-off so that one behavior fits all macros, I wrote an RFC about allowing the trade-off to be made on a per-macro basis. In this sense, this wish list item involves a bit of new compiler work and isn’t just a matter of rubberstamping an existing nightly feature as-is. My understanding is that the RFC is non-controversial at this point, so my wish for 2018 is that someone who knows their way around the compiler code finds the time to implement it.

## Tools for Understanding the Binaries

While the above items are mainly about rubberstamping something that already exists and is very clearly about Rust, this section involves bigger wishes of things that don’t already exist and that wouldn’t be very Rust-specific, either.

### GUI for `rr replay`

`rr` is an awesome debugging tool that works with Rust thanks to gdb working with Rust. As part of the process of developing Rust code, `rr` is particularly useful for understanding the failures that `cargo fuzz` finds.

Unfortunately, since rr wants to be the one launching gdb, it doesn’t Just Work with gdb front ends that also want to be the ones launching gdb.

My first tooling wish for 2018 is an rr GUI front end that I could launch with a simple command in place of rr replay that would show a window with my source code, the runtime stack and the local variables. The local variable view should have disclosure triangles for navigating inside structs and enums. As with plain gdb, the source discovery should work without having to put the source code into a “project” ahead of time. So for the UI design, something close to Eclipse’s gdb front end would work except for the part of creating an Eclipse project containing the source code ahead of time. A checkbox for stepping over the Rust standard library when stepping would be great.

But what’s #Rust2018 about this? Mostly just the object view understanding Rust enums and stepping understanding what’s part of the Rust standard library.

### Tool for Understanding What LLVM Did with a Given Function

Sometimes I write a performance-sensitive function and the benchmark results are unintuitive. I want to understand why. At present, I’m working on making Gecko’s text node manipulation (hopefully) faster by SIMD-accelerating the relevant operations using Rust (again, I find Rust’s `simd` crate nicer to work with than what C++ inherits from C). The operations I thought were the easiest wins actually were not (especially on aarch64), and I wanted to understand why. I wish the tooling for doing so was better.

My second tooling wish for 2018 is an assembly visualizer designed for the programmer seeking to understand their own code as opposed to being designed for assembly experts who are reverse engineering someone else’s malware.

(On Linux,) I’d like to be able to designate a function in my code, have it compiled with release-mode optimizations and debug info as if it were `#[inline(never)]`, and have the result shown in a view with four columns: arrows for jumps, the assembly listing, a succinct English description of each instruction, and snippets of the source code that the assembly is attributed to.

If I start by marking the function `#[inline(never)]` manually and come up with an executable to link my crate into, I can see the arrows for jumps and the assembly listing in Hopper. Alternatively, `objdump -S` gives me an interleaving (as opposed to a side-by-side view) of the assembly and the source. I’m not aware of any tool that explains each instruction or at least hyperlinks each instruction to its documentation.

Existing tools that draw the arrows for jumps are disassemblers. They are so focused on the use case of reverse-engineering someone else’s sourceless binary that they don’t support source attachment when debug information and the source code are available!

Additionally, the framing that disassemblers are for use by people who do reverse engineering professionally leads to an assumption that the user is proficient at reading assembly language. This isn’t necessarily true for the use case of understanding what the compiler did with the user’s own code. It isn’t in any way an abnormal situation to be writing the kind of code whose performance diagnostics make occasional assembly reading relevant without ever having to write assembly language to gain or maintain proficiency. Also, computer science education teaches concepts and isn’t supposed to be vocational training, so one might well understand the concepts of assembly language and have written a tiny bit of MIPS assembly or a (MIPS-targeting) toy compiler years ago without knowing the meaning of all the x86_64, armv7 and aarch64 assembly mnemonics by heart. In general, I feel that the use case of reading assembly for the purpose of understanding what the compiler did with one’s own code is under-addressed both in tooling and in tutorials/books.