If I have one skill, it’s starting a whimsical project and then immediately getting side-tracked by the dozens of yak shaves it ends up spawning.

Anyway, last time I promised to talk about my highly advanced robot:

However, I’ve since realized there are way too many topics here, so instead let’s play “choose your own adventure” — email me if you want more details on any of the following and I’ll write a blog post or reply with a pile of notes:

This robot consists of two DC motors driven by an L298N driver via a Raspberry Pi Zero W; the Pi is powered via the adapter it came with and the motors via an adjustable supply at 12 V.

The robot body is a piece of scrap OSB I had lying around, with the motors attached via metal brackets and the wheels CNC-machined with a hexagonal slot to fit the couplers that came with the brackets

Raspberry Pi control code was written in Rust, cross-compiled from my desktop (because good luck compiling Rust on the Pi’s baby CPU)

Remote controlled via a spacemouse connected to the desktop, with a Rust program forwarding UDP packets to the Pi over wifi
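To give a flavor of the desktop-to-Pi forwarding, here is a minimal sketch using only the standard library's UdpSocket. The DriveCommand type, its 4-byte big-endian layout, and the helper names are hypothetical choices made for illustration; they are not the actual wire format of the robot.

```rust
use std::net::UdpSocket;

/// Hypothetical command: signed speeds for the left and right motors.
#[derive(Debug, Clone, Copy, PartialEq)]
struct DriveCommand {
    left: i16,
    right: i16,
}

impl DriveCommand {
    /// Pack the command into 4 bytes (big-endian), small enough for one datagram.
    fn to_bytes(self) -> [u8; 4] {
        let l = self.left.to_be_bytes();
        let r = self.right.to_be_bytes();
        [l[0], l[1], r[0], r[1]]
    }

    /// Parse a datagram payload; anything that is not exactly 4 bytes is rejected.
    fn from_bytes(b: &[u8]) -> Option<Self> {
        if b.len() != 4 {
            return None;
        }
        Some(DriveCommand {
            left: i16::from_be_bytes([b[0], b[1]]),
            right: i16::from_be_bytes([b[2], b[3]]),
        })
    }
}

/// Desktop side: send one command to the robot's address (e.g. "192.168.1.50:9000").
fn send_command(socket: &UdpSocket, robot_addr: &str, cmd: DriveCommand) -> std::io::Result<usize> {
    socket.send_to(&cmd.to_bytes(), robot_addr)
}

/// Robot side: block until a datagram arrives and try to parse it.
fn recv_command(socket: &UdpSocket) -> std::io::Result<Option<DriveCommand>> {
    let mut buf = [0u8; 16];
    let (n, _from) = socket.recv_from(&mut buf)?;
    Ok(DriveCommand::from_bytes(&buf[..n]))
}
```

UDP suits this kind of control loop because a lost or late command is simply superseded by the next one, so there is nothing worth retransmitting.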

Of course, as soon as I got all of this working I decided that the Raspberry Pi was cliché and things like “wifi”, “ssh”, and “an operating system” were making me soft, so I immediately started over with a microcontroller:

The stm32g4 has 6 built-in op-amps and 7 comparators (which I can use for the lighthouse positioning system) and 128kB of RAM (aesthetically superior to the Pi’s 512MB)

Wireless control via HC-05 serial-over-bluetooth modules instead of wifi; you want this old “bluetooth classic” module, which will show up on your computer as /dev/tty.thing-you-can-write-to (the newer “bluetooth low energy” modules are too fancy for such salt-of-the-computing-earth usability)

Since the robot now only gets bytes instead of packets (no operating system, remember?), I’m leveraging constant overhead byte stuffing and decoding the raw stream using one of the g4’s UART devices
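For the curious, here is a rough sketch of how constant overhead byte stuffing (COBS) turns a raw byte stream back into packets. It uses std and Vec so it can run on a desktop; firmware would want fixed-size buffers instead. The function names and the FrameReader type are made up for this illustration and are not taken from the robot's code.

```rust
/// COBS-encode `input` and append the 0x00 frame delimiter.
/// The encoded bytes before the delimiter never contain a zero.
fn cobs_encode(input: &[u8]) -> Vec<u8> {
    let mut out = vec![0u8]; // placeholder for the first code byte
    let mut code_idx = 0; // where the current block's code byte lives
    let mut code: u8 = 1; // block length so far, counting the code byte

    for &b in input {
        if b != 0 {
            out.push(b);
            code += 1;
        }
        // Close the block on a zero byte, or when it reaches the 254-byte limit.
        if b == 0 || code == 0xFF {
            out[code_idx] = code;
            code_idx = out.len();
            out.push(0); // placeholder for the next code byte
            code = 1;
        }
    }
    out[code_idx] = code;
    out.push(0); // frame delimiter
    out
}

/// Decode one frame (without its trailing 0x00). Returns None if malformed.
fn cobs_decode(frame: &[u8]) -> Option<Vec<u8>> {
    if frame.is_empty() {
        return None;
    }
    let mut out = Vec::new();
    let mut i = 0;
    while i < frame.len() {
        let code = frame[i] as usize;
        let end = i + code;
        if code == 0 || end > frame.len() {
            return None;
        }
        for &b in &frame[i + 1..end] {
            if b == 0 {
                return None; // a zero inside a frame means corruption
            }
            out.push(b);
        }
        i = end;
        // Every block except a full 0xFF block, and except the last one,
        // stands for its data followed by a zero byte.
        if code != 0xFF && i < frame.len() {
            out.push(0);
        }
    }
    Some(out)
}

/// Feed bytes one at a time (as a UART interrupt would); yields a packet
/// whenever a delimiter arrives and the buffered frame decodes cleanly.
struct FrameReader {
    buf: Vec<u8>,
}

impl FrameReader {
    fn new() -> Self {
        FrameReader { buf: Vec::new() }
    }

    fn feed(&mut self, byte: u8) -> Option<Vec<u8>> {
        if byte == 0 {
            let frame = std::mem::take(&mut self.buf);
            cobs_decode(&frame)
        } else {
            self.buf.push(byte);
            None
        }
    }
}
```

The appeal for a bare UART link is that 0x00 only ever appears as a delimiter, so after a dropped or corrupted byte the receiver resynchronizes at the next zero.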

Nema 17 stepper motors replaced the DC motors — not because I need the precise position control of counting steps (I’m closing the feedback loop via the lighthouse), but because $15 stepper motors have less backlash than $15 motor gearboxes

That’s the gist; write me if you want more details.

Thoughts on embedded Rust

After a few months, I have some thoughts about all this:

RTFM is an amazing framework that makes concurrency safe, fast, and easy (compared to C, anyway); it leverages hardware priority levels to prove (at compile-time) that resources can be shared without runtime overhead (i.e., mutexes), includes a low-overhead task scheduler, and eliminates the hard-to-debug concurrency/interrupt memory corruption nonsense that would absolutely be plaguing me if I were coding C.

Cargo is great: It’s easy to split code up in packages, run on both desktop (for debug/testing) and microcontroller, and also grab handy open source packages.

Shout-out in particular to bitvec, a package for easy manipulation of bits. Seriously, I’m decoding one-bit-at-a-time from timed flashes of light, and nowhere does my code include inscrutable bitwise shifts or boolean operations; this is amazing for code readability (and my sanity).

Let’s talk about abstraction. The lowest-level interfaces (short of typing in the memory offsets by hand) are the device-specific “peripheral access crates”, which are automatically generated from manufacturer-provided documentation and expose all device registers in a uniform way.

For example, the code:

device.GPIOA.moder.modify(|_, w| w.moder6().bits(0b10));

means “write binary value 10 to moder6 field of GPIOA’s moder register”. What the heck does that mean? It puts pin A6 into “alternate function” mode, though the crate documentation won’t tell you that (it’s autogenerated).

The upside of this PAC API is that, after you read and understand the 2,000+ page microcontroller reference manual, you fully understand the API.

On the downside, well, aside from being a bit verbose, the PAC API doesn’t prevent mistakes like writing the wrong bit value (0b01, say) or accidentally trying to later use pin A6 for some other purpose.

So the wise Rust folks have written “HAL” (hardware abstraction layer) crates on top of the PACs. Not only does the HAL reify values as enums (Mode::AlternateFunction instead of 0b10), it also leverages Rust’s super-fancy type system to catch mistakes at compile-time. E.g., the function call that sets A6 to alternate function mode might “consume” (take ownership of) the pin, which means the compiler will error out if you try to use that pin elsewhere.
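Here is a desktop-runnable sketch of that "consume the pin" idea. All names (PA6, Input, AlternateFunction, PwmOutput) are invented for illustration and do not match any particular HAL crate, and the register writes are left as comments.

```rust
use std::marker::PhantomData;

// Marker types for pin modes; they exist only at compile time.
struct Input;
struct AlternateFunction;

/// Pin A6, tagged with its current mode in the type.
struct PA6<MODE> {
    _mode: PhantomData<MODE>,
}

impl PA6<Input> {
    /// Stand-in for however a real HAL hands out pins.
    fn new() -> Self {
        PA6 { _mode: PhantomData }
    }

    /// Takes `self` by value: the old `PA6<Input>` is gone after this call.
    fn into_alternate_function(self) -> PA6<AlternateFunction> {
        // A real HAL would write 0b10 to the moder6 field here.
        PA6 { _mode: PhantomData }
    }
}

/// A hypothetical peripheral driver that owns its output pin.
struct PwmOutput {
    _pin: PA6<AlternateFunction>,
    duty_percent: u8,
}

impl PwmOutput {
    /// Only accepts a pin that is already in alternate function mode.
    fn new(pin: PA6<AlternateFunction>) -> Self {
        PwmOutput { _pin: pin, duty_percent: 0 }
    }

    fn set_duty(&mut self, percent: u8) {
        self.duty_percent = percent.min(100);
    }

    fn duty(&self) -> u8 {
        self.duty_percent
    }
}

// Both of these would be compile errors:
//
//   let pin = PA6::<Input>::new();
//   let pwm = PwmOutput::new(pin);        // wrong mode: expected PA6<AlternateFunction>
//
//   let af = PA6::<Input>::new().into_alternate_function();
//   let pwm = PwmOutput::new(af);
//   let again = PwmOutput::new(af);       // use of moved value: `af`
```

Because the mode markers are zero-sized, the bookkeeping lives entirely in the type checker and costs nothing at runtime.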

A lot of this is quite clever and powerful; check out this blog post overview or the Embedded Rust Book.

There are still a few rough edges / things that tripped me up though:

My editor autocomplete (powered by RLS) doesn’t work with the closure-based API: Pressing tab at the end of device.GPIOA.moder.modify(|_, w| w.mod does nothing. This is particularly painful since pretty much all of the register names and values are inscrutable acronyms. I’m with Ben Kuhn that autocomplete is an interface, and I wish it worked better in typical embedded Rust crates.

The PAC reifies every peripheral as its own type. So even if, e.g., timer 1 and timer 2 have identical functionality (same register names and field values), you can’t easily write a function that accepts both:

fn setup_timer(t: ???) { t.arr().modify(|_, w| w.arr().bits(500)); }

because the timers have different concrete types. You could use a generic type, but that ends up leading to a world of pain (RTFM resources cannot be generic; extra typing (pun!) everywhere). The best option is to use macros:

macro_rules! setup_timer { ($t:expr) => { $t.arr().modify(|_, w| w.arr().bits(500)); } }

which is how most of the HAL crates are written (example). This isn’t as bad as macros in C (you don’t need to worry about hygiene; things will be type checked), but it is new syntax to learn, and autocomplete definitely isn’t going to work now.
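To see the macro trick work without any hardware, here is a small mock. Tim1 and Tim2 are stand-in structs invented for this sketch (real PAC timers expose registers through closures, not a plain set_arr method); the point is only that two unrelated types with the same method names can share one macro body.

```rust
// Two distinct types with identical-looking APIs, like PAC timer peripherals.
struct Tim1 {
    arr: u32,
}

struct Tim2 {
    arr: u32,
}

impl Tim1 {
    fn set_arr(&mut self, value: u32) {
        self.arr = value;
    }
}

impl Tim2 {
    fn set_arr(&mut self, value: u32) {
        self.arr = value;
    }
}

// A plain function would need one concrete parameter type (or a trait);
// the macro is expanded per call site and type checked after expansion.
macro_rules! setup_timer {
    ($t:expr) => {
        $t.set_arr(500);
    };
}

// The generic alternative needs a trait plus an impl for every timer type:
trait AutoReload {
    fn set_auto_reload(&mut self, value: u32);
}

impl AutoReload for Tim1 {
    fn set_auto_reload(&mut self, value: u32) {
        self.set_arr(value);
    }
}

fn setup_timer_generic<T: AutoReload>(t: &mut T) {
    t.set_auto_reload(500);
}
```

With this in place, setup_timer!(tim1) and setup_timer!(tim2) both compile, while setup_timer_generic only accepts Tim2 after someone writes another trait impl, which is a small taste of the extra typing mentioned above.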

The PAC write method resets non-written fields to their default (“reset”) values. So if you have a register FOO with fields that default to zero (unset) and you run:

device.FOO.write(|w| w.a().set_bit());
device.FOO.write(|w| w.b().set_bit());

at the end FOO will have field b set and field a unset. I’m sure this makes sense in some context (any write to FOO must be a full byte and emitting code that writes default values leads to fewer machine instructions than read followed by write, etc.) but it tripped me up several times when I added a second write (to enable some completely unrelated flag later in my code) or changed the order of existing write calls. So now I only use modify and just ignore the first argument to the closure, which makes things even more verbose.
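The write-versus-modify difference is easy to reproduce with a toy register. This is a simplified model, not the generated PAC API: the Reg type and the A/B bit constants are made up, and the closures take and return a raw u8 instead of the writer proxies a real PAC passes in.

```rust
/// Value the hypothetical register FOO holds after reset.
const RESET_VALUE: u8 = 0b0000_0000;
/// Bit masks for the two made-up fields.
const A: u8 = 0b0000_0001;
const B: u8 = 0b0000_0010;

/// Toy stand-in for a memory-mapped register.
struct Reg {
    bits: u8,
}

impl Reg {
    fn new() -> Self {
        Reg { bits: RESET_VALUE }
    }

    /// Like PAC `write`: the closure starts from the RESET value,
    /// so anything it does not set is overwritten with the default.
    fn write(&mut self, f: impl FnOnce(u8) -> u8) {
        self.bits = f(RESET_VALUE);
    }

    /// Like PAC `modify`: the closure starts from the CURRENT value,
    /// so untouched fields keep whatever they held before.
    fn modify(&mut self, f: impl FnOnce(u8) -> u8) {
        self.bits = f(self.bits);
    }
}

/// Two consecutive writes: the second one silently clears field a.
fn two_writes() -> u8 {
    let mut foo = Reg::new();
    foo.write(|w| w | A);
    foo.write(|w| w | B);
    foo.bits
}

/// Two consecutive modifies: both fields end up set.
fn two_modifies() -> u8 {
    let mut foo = Reg::new();
    foo.modify(|w| w | A);
    foo.modify(|w| w | B);
    foo.bits
}
```

In this model two_writes() leaves only B set, while two_modifies() leaves both A and B set, at the cost of a read before each write.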

All-in-all, though, I’m still pretty jazzed about the possibilities from using Rust on embedded hardware.

Yak shaves

I keep a running list of yak shaves on my website; the following are just the latest, robot-inspired ones:

Unified hierarchical search: Nothing highlights Preview.app’s shortcomings like searching for 3-letter acronyms in 2,000 page PDFs. I’m developing a hierarchical search / notes mechanism around PDF tables of contents and may release an app on this theme (or perhaps roll it into Finda). Let me know if you also dream of, uh, computers searching text good.

No-code hardware initialization: Probably half of my robot “programming time” has actually been me cross-referencing (via PDF!) obscure register names, values, and peripheral bus interconnection tables. But 90% of my peripheral-related code is just initialization, which I don’t actually want or need to think about as code. I just want the hardware to fire the right interrupts to run what I actually care about: my application code that decodes data, blinks the lights, spins the motors, whatever. The hardware initialization code might as well be write-only generated goop. Chip manufacturers actually provide GUIs like CubeMX to do this, but I don’t think they go far enough. What I really want is a proper solver: My g4 chip has 4 UARTs, 20 timers, 6 op-amps, and 7 comparators; I want the maximum number of op-amps wired into comparators into timers and just a single UART, and you tell me which peripherals and internal buses I can use without pin conflicts. This probably isn’t even an SMT problem, it’s likely SAT. Make a good single-page web app that generates C / Rust / Assembly and let the highly-targeted firmware engineer advertising money roll in.

Inferring performance bounds from assembly: ARM assembly is soo much simpler (peep this tutorial) than the x86 assembly I was looking at this time last year. How difficult would it be to calculate time upper-bounds for a given function or interrupt given its generated assembly? Yes, this is impossible in the general case because loops, the halting problem, etc., but how far could one get making a “performance linter” to warn you that what you thought would take a microsecond actually might take a millisecond? I’m sure this kind of thing exists already — please email me your favorite papers / projects.

Misc. stuff

Until next time my friends!

Kevin