Why Writing Firmware Is Kinda Like Software Exploitation

We’ve been away too long. We’re slacking…well, not really. (Lawler did an excellent post recently on Power Analysis attacks, but we’re still slowly catching up on the blog…)

We’ve just been busy…we’ve been doing quite a bit of embedded reverse engineering and vulnerability research consultation recently (in addition to our normal infosec stuff). That work has been awesome. We’ve been lucky to have it. But the stuff that we’ve really been having fun with lately is writing firmware, specifically for RF-enabled microprocessors and ARM cores.

We’ve been working on a product called Tally.

In addition to our consulting load, my first employee (as of last January) Chris (an Electrical Engineer and ASIC Designer) and I have been spending many late nights poring over CAD designs and product spec sheets, desoldering chips, ripping firmware out of other products (and reversing it to see ‘how they did it’), and writing and debugging the firmware for our product.

(side note: We’ve now managed to turn some of this useless knowledge into a hopefully somewhat useful infosec course for vuln researchers.)

I haven’t really talked about our project much (‘cept for the occasional tangentially related tweets, and a few talks last year at conferences) because it’s been an arduous process and a really ambitious project (for me, just starting out doing hardware stuff). It’s taken a lot of my restraint to not post more photos, schematics, and other intellectual property (IP) to Twitter and Facebook. I didn’t want to talk about it too much because I don’t really know if we can pull it off as a consumer product.

Well, we reached a huge milestone at the end of last year (2012) and wanted to share a bit now, over a year later: we’ve completely written the full alpha versions of our firmware along with the Mac-based and Android-based control/interface software for our hardware device. (We even have a video animation and a website with a bit about it.) We’ve also completed the “from scratch” circuit board designs for our product (every resistor, every copper trace, and every component placed from scratch). We created and own all the IP from the circuit board and firmware up through to the client software. A few months back we completed the design review with an outside design firm and shipped our CAD drawings and BOM to the factory for fabrication and assembly! And a month or so back we got our first alpha hardware in our hands. (Note: I wrote this part of this blogpost in early 2013 ;-)

It is surreal to hold in your hand an electronics device that you created. Every time it boots our firmware, or is detected by our phones and computers over USB and you get the command console…it is a really indescribable feeling. We can’t wait to have friends, colleagues, and family start using them more. We think people will really find the product useful. We’re really excited to share more about it as soon as we can in the coming months…but that’s not what this post is about.

Working on “solved problems” sucks:

If you are like me then you never liked working on “solved problems”. You prefer to do something interesting and new. You like to trample on untrodden snow….see something you haven’t seen before. This is why I have always been attracted to reverse engineering and exploit development. Even though I am not very good at either, I’d always aimed to get better at them both because (to me, early on) they seemed to represent a supreme comprehension of computing technologies.

When I started poking around at this “Hardware Hacking” stuff, I had no idea that within the year, I would be designing my very own 4+ layer circuit board, designing in CAD systems, writing my own firmware, 3D modeling and 3D printing cases and components, or designing my own RF protocol and implementing it in firmware on my own circuit board. All this just seemed too far fetched….But it goes to show that if I can do it, virtually anyone can do it…it’s not rocket science. In fact, it is all very fundamentally simple. Even more so if you have good help, smart friends, or can hire smart people.

With all this development I have learned heaps…In doing this work, I have pulled from seemingly disparate parts of my career thus far. Writing the firmware was C coding, which I borrowed from my time as a software developer. The ARM core that interfaces with our RF device was all programmed in C and assembly, which pulled skills from our recent ARM exploitation course. I hand-built a few different kernels for a single-board computer (SBC) we were testing for interfacing with our device, which gave me flashbacks to my early twenties working as a sysadmin building custom kernels (and our own custom Linux distribution) for our datacenter Cobalts. I pulled on recent hardware vulnerability research to debug my firmware and write simple Python test harnesses from the PC. Even hand-wiring the connections to get my firmware to speak to me via UART (serial) gave me high school flashbacks of hand-building null-modem serial cables to console into PBXs and switches in wiring closets. When I started 3D printing different enclosures and cases for the device I had many recollections of building deathmatch levels for Duke Nukem and Quake (the first one).

But the newest stuff I’ve learned has been closer to electrical engineering. I’ve learned seemingly random stuff: like how the acidity and layout of a circuit board can change its RF properties. The physics behind sensors that power technologies we unknowingly use every day, like Hall effect sensors, accelerometers, gyroscopes, reed switches, humidity sensors, and so many more. I learned how to etch my own circuit boards with a laser printer. I learned about desoldering components like surface-mounted (SMD) flash and MCUs to access device firmware. I learned about JTAG debugging oddities with IDA on ARM cores. Even esoteric stuff like the very real effects of temperature on the oscillators that power small microprocessors:

(Side Story: We had a case while working on Tally where two Tally devices were falling out of sync when they chatted with each other via RF. One Tally device was on our workbench under a warm lamp, the other in our cooler break room. Turns out, the wrong initialization functions were executing on the workbench Tally device, because the oscillator chip, which is used for timing, was MUCH warmer, causing the crystal inside to behave differently! Temperature of the device changed the execution path of the firmware! WTF!)

Crazy stuff…

When I am doing something with limited resources and there are very few people or resources I can consult for help or guidance, this is when I feel most challenged. The downside (of course) is that when you get stuck, you’re really stuck. All programmers, exploit developers, and reverse engineers know the existential depression you experience during tough projects…or worse, failure. …but if you do have a success, the victory is all yours (or your team’s) and you and your team feel as if you 0wn the world (and depending on the finding, maybe you can ;-).

Well, I have found that writing firmware and developing embedded systems has challenged me in this way and really increased my understanding of how computers work in general. As a security researcher or software exploitation person, you can read about software interrupts (for example), but it really all falls into place when you actually implement interrupt handlers by hand in your firmware to process button presses or asynchronously process incoming RF packets. It makes you develop a respect for those nameless folks who write the firmware that we all obliviously rely upon every day…like our microwave ovens, elevators, digital cameras, televisions, and cars. It is all really humbling. Hardware is software made flesh…so things happen on a different time-scale with embedded devices. I’ve noticed that things are much more async, and much less procedural.
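To make the interrupt-handler idea a bit more concrete, here is a minimal host-side sketch of the general pattern (not Tally’s actual firmware): the handlers do as little as possible, stashing a byte or setting a flag, and the main loop picks the work up later. Every name here (rf_rx_isr, button_isr, rf_read_byte, button_take_event) is hypothetical, and on a real MCU the handlers would be attached to interrupt vectors with compiler-specific syntax rather than called like ordinary functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout: a host-side sketch, not Tally code. */
#define RX_BUF_SIZE 16u   /* power of two so indexes wrap with a mask */
static_assert((RX_BUF_SIZE & (RX_BUF_SIZE - 1u)) == 0u,
              "RX_BUF_SIZE must be a power of two");

static volatile uint8_t rx_buf[RX_BUF_SIZE];
static volatile uint8_t rx_head;         /* advanced only by the ISR       */
static volatile uint8_t rx_tail;         /* advanced only by the main loop */
static volatile bool    button_pressed;  /* set by ISR, cleared by main    */

/* Stand-in for a radio receive interrupt: store one byte and get out. */
void rf_rx_isr(uint8_t byte_from_radio)
{
    uint8_t next = (uint8_t)((rx_head + 1u) & (RX_BUF_SIZE - 1u));
    if (next != rx_tail) {               /* buffer full: drop the byte */
        rx_buf[rx_head] = byte_from_radio;
        rx_head = next;
    }
}

/* Stand-in for a button interrupt: just record that it happened. */
void button_isr(void)
{
    button_pressed = true;
}

/* Main-loop side: fetch one pending byte, if there is one. */
bool rf_read_byte(uint8_t *out)
{
    if (rx_tail == rx_head) {
        return false;                    /* nothing waiting */
    }
    *out = rx_buf[rx_tail];
    rx_tail = (uint8_t)((rx_tail + 1u) & (RX_BUF_SIZE - 1u));
    return true;
}

/* Main-loop side: consume the button event. On real hardware the
 * read-and-clear would need interrupts disabled around it. */
bool button_take_event(void)
{
    bool was_pressed = button_pressed;
    button_pressed = false;
    return was_pressed;
}
```

The main loop would then just poll rf_read_byte() and button_take_event() and do the slow work (parsing packets, updating state) outside of interrupt context. The ring buffer holds one byte less than RX_BUF_SIZE, which is the usual price for telling “full” apart from “empty” without a separate counter.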

In my recent embedded/hardware work I have had quite a few extremely illuminating moments, but I would like to share with you a specific one during which it became clear that writing firmware and embedded software really was all “hacky”. When writing firmware, most of its elegance lies in its inelegance and simplicity…much like software exploitation. Writing firmware has shared many similarities with reversing and software exploitation in that way….You use what works, and while the final product may do something really impressive, generally what is under the hood is extremely underwhelming, simplistic, and occasionally…clever. This is why I (and most of us) fell in love with software exploitation and reverse engineering, and it is what I have come to enjoy about writing firmware, building embedded systems, and making things that actually do things in the physical world. Things that even my mom can use.

The Setup:

One of the microprocessors we are using in the product is an MSP430 6137. This is a powerful little chip that costs only a few dollars per unit (this fact still blows my mind). It has tons of IO pins and plenty of built-in flash and RAM (for your program to run from and use for storage). It can speak several serial protocols (some of the other models in the MSP430 family speak USB and Bluetooth natively) and it has a built-in 900 MHz radio modem.

Processors like these can be bought in bulk as is (just a box full of chips). But more commonly (when starting out) you purchase them in some kind of development kit. The development kits come in many flavors these days. There are the consumer/beginner/hobbyist Arduino-like packages. Or you can opt for the more professional and task-specific “Product Evaluation Kits” (which are essentially the same as the hobby kits but a bit more “grown-up”: more functionality and less of the marketing glitz that is used to target the “Maker” crowds). (At Xipiter, we buy LOTS of Product Evaluation Kits…from ARM kits to bigger cellphone evaluation kits. They’re a great way to see what the developers of the stuff we as exploiters/reversers want to target are working with.)

These microprocessors don’t run “operating systems” (although some people have written them). Instead, you use a compiler like MSPGCC or an IDE like CodeComposer to compile your C and/or assembly code into an executable image that is the ONLY executable your processor runs.

[Note: The magic of Arduinos is that they dumbed this process down a bit (which is great to get introduced to this stuff) by integrating the FET (more later on what a FET is) into the actual development board for the AVR and drastically simplifying the IDE and language used to program the device.]

When the processor turns on, it literally starts the instruction pointer at the “beginning” (entry point) of your firmware image (well kinda, it’s more according to the bootstrap loader of the processor).

If you are using the more professional “Product Evaluation Kits”, you load your compiled executables onto the chip (generally directly from the IDE when you click “run”) using a FET (Flash Emulation Tool), which is a separate box that connects your computer to the processor on the evaluation circuit board. The FET generally just connects to your computer via USB and to the “Eval Board” via a 14- or 20-pin JTAG header. This is one major difference between “Arduino-like” hobby kits and the more professional “Product Evaluation Boards”. The product Eval Boards generally don’t include the “FET” functionality on the circuit board because it’s a waste to include all that if you plan to immediately start mass-producing or designing products that use the eval board’s reference design specifications….The “Maker”-friendly Arduino-like hobbyist kits, on the other hand, include all the FET functionality on the circuit board to make it easier for newcomers to get started by plugging it directly into their computer via USB (without going through a FET).

<anti “Maker” rant> We live in exciting times. Hardware is all the rage. But with this comes lots of “noise”. I’ve seen lots of really inspiring/innovative ideas that owe their success to the proliferation of low-cost, powerful, and accessible developer kits like Arduinos, Launchpads, and Raspberry Pis. But I’ve also seen an increase in people making “products” based on Arduinos and Raspberry Pis. The reality is that you simply aren’t going to go to market with a product built from a hobbyist kit. You just aren’t. These are prototyping kits. Their magic is that they made that first step more accessible. The markup on these “prototyping boards” is too much. The overhead is too high, you don’t own the IP to the main circuit design, and you’re at the whim of “hobbyist” suppliers (e.g. how are you gonna ship 1000 units of your product “at cost” if you are at the whim of a re-retailer like SparkFun to ship you Arduinos and BeagleBoards?). This is, of course, contrary to the sophomoric evangelism of “Maker” folks who are usually web programmers that want to declare to their coworkers and friends with self-righteous indignation that they are “going lower level” or “getting into hardware”. There are far too many egotistical “Makers” running around that think their Arduino projects are industry-grade and market ready. It takes more than that. You’re not gonna take on Boston Dynamics with your Lego Mindstorms masterpiece. And throwing money at it doesn’t matter. You have to build your circuits by hand. Or at least you have to own the IP: gerbers, firmware, and all. If you plan to produce in bulk, and at-cost to your consumer, you have to do a bit more work than that. Sorry.</rant>

Anyway…in addition to getting your firmware onto the chip, the FET also allows you to hardware debug your firmware in real-time as it executes on the MCU. All the basic debugger functionality is there: breakpoints, watchpoints, stepping, stepping over, memory views, etc. Sometimes, however, things can get a bit hairy, and in this one case that’s what happened to us.

The Hardware HeisenBug:

There is a lot of code in the Tally firmware. This code handles everything: serial communication (over UART) to the PC, SPI/I2C code to sensors, talking to the RF transceiver, talking to the memory cards…everything. (Oh, and most of this has to happen asynchronously in interrupt handlers, so locking, critical sections, clever exception unwinding, and careful attention to the stack are crucial. Everything we learned from that insanity could fill another blog post, so we’ll skip it for now.) Nonetheless, because of all this code, the beginning is a very delicate time. The boot sequence is very important. It initializes all communication paths, and even the oscillators on the board that are used to determine clock frequencies for most of the communication paths (like the radio).

So as firmware developers, if we are debugging code that runs after these initialization routines complete successfully, we can use our usual techniques like debug printf()s to get output from the device over our serial cable. And indeed that’s what we spend most of our time doing. But what if the bug happens in the initialization routines before we can initialize the UART to get output? Well, then we can probably just use the JTAG connection, right? Well, in this one case we couldn’t. The particular function of interest was this one: “Initialize_UART()”.

This function calls down into another function called UART_init which lived in: Tally_bsp.h

(side note: “BSP” is a term that is very common in embedded system work and usually stands for “Board Support Package”. Generally it is home to values, variables, globals, and code specific to that revision of the board, like bit-masks, memory-mapped IO addresses, and other values that are linked to the physicality of the board, like pin numbers or components in the device.)

In this case, the UART_init() function down in Tally_bsp.h was mostly code I’d written over a year ago that heavily borrowed from one of Texas Instruments’ tutorials for interfacing with the FT-232R (Note: The FT-232R is the controller on Tally that orchestrates all communication with the PC/mobile phone over USB). We needed to see what the return value was for that function…but for some reason none of our watchpoints or breakpoints would trigger in those initialization routines (we suspect this is because of the oscillator, but we never found out why). To make matters even more annoying, when we hard-coded breakpoints into the function and recompiled the firmware with ONLY those changes, the initialization routines would run successfully! WTF!? This was a hardware heisenbug. My immediate first thought was:

“F*ck it, I’ll just leave the breakpoints in there if they’ll make it work…“

but unfortunately this won’t suffice, because when the hardware debugger is detached there will be nothing to catch the exceptions. The firmware simply won’t run like that….To debug this problem, we needed to see what the return value of that UART_init() function was… So what to do? The solution we came up with is the seed for this blogpost because it seemed like a great example of how similar embedded development is to software exploitation. All the tricks and contortions you have to do to make sh*t work…

The Solution (Das Blinken Lights):

I remembered hearing about or seeing a talk by the Magic Lantern guys (who hack DSLRs). I remembered hearing that they used the Infrared Port (which is used for firmware updates on some devices) to exfiltrate stuff from the camera.

So our solution was similar: Use the LEDs to blink values (such as the UART_init() return values) back to us to begin debugging the problem.

After dicking around in “real” C on my desktop, and then in the “MSP compiled C” in CodeComposer, this simple little stub of code basically did all the work.

Bitmasking and LED flashing would (under any other circumstances) seem either trivial or like a bunch of academic fapping. I always used to make fun of stuff like this…but in this case, it was an extremely helpful trick once integrated with the firmware code which looked like this:

By inserting the code in the firmware (where needed) we could have the two LEDs on Tally blink any values we wanted, but specifically the return addresses and values of critical functions that were failing. We converted the values we wanted to binary with a simple loop and bitwise math, and then used the LEDs to blink according to the corresponding bit values. But how to signal effectively? Do you blink once for a “one” and twice for a “zero”? How would we know which blinks were which? Here’s what we did.

We used the right LED (the “Rx” or “receive” LED) to act as a metronome for each bit value, and the left “Tx” (“transmit”) LED to blink or not-blink a value like so:

X — X: Tx on and Rx on is our signal for “one”

O — X: Tx Off, and Rx is on was our signal for a zero

X — O: Tx on and Rx Off was our signal for “moving to the next digit”
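The original listing isn’t reproduced here, so what follows is only a rough sketch of how an encoder for this three-frame scheme could look, written so it runs on a desktop. The names (blink_value, frame_sink_t, log_frame) are hypothetical. It also assumes a 16-bit value, bits sent least significant first, and a “next digit” frame after every 8 bits; the real firmware may have grouped things differently. On the device, the sink would set the two LED pins and wait long enough for a human to read them, instead of appending to a string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The three LED states described above. */
typedef enum {
    FRAME_ZERO = 0,  /* Tx off, Rx on  */
    FRAME_ONE  = 1,  /* Tx on,  Rx on  */
    FRAME_NEXT = 2   /* Tx on,  Rx off */
} blink_frame_t;

/* Hypothetical hook: on hardware this would drive the LEDs and delay. */
typedef void (*frame_sink_t)(blink_frame_t frame, void *ctx);

#define BITS_PER_GROUP 8u  /* assumption: separator after each byte */

/* Walk the value one bit at a time, least significant bit first. */
void blink_value(uint16_t value, frame_sink_t emit, void *ctx)
{
    assert(emit != NULL);
    for (unsigned bit = 0; bit < 16u; bit++) {
        emit(((value >> bit) & 1u) ? FRAME_ONE : FRAME_ZERO, ctx);
        if ((bit + 1u) % BITS_PER_GROUP == 0u) {
            emit(FRAME_NEXT, ctx);       /* "moving to the next digit" */
        }
    }
}

/* Desktop stand-in for the LEDs: record frames as '0', '1' and '|'. */
typedef struct {
    char   text[32];
    size_t len;
} frame_log_t;

void log_frame(blink_frame_t frame, void *ctx)
{
    static const char symbols[] = { '0', '1', '|' };
    frame_log_t *log = ctx;
    if (log->len + 1u < sizeof log->text) {   /* leave room for the NUL */
        log->text[log->len++] = symbols[frame];
        log->text[log->len] = '\0';
    }
}
```

Calling blink_value(0x0005, log_frame, &log) on a zeroed frame_log_t leaves the frames in log.text in the order they would have been blinked. Keeping the LED access behind a callback is just a convenience for testing the bit math on a PC before flashing it.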

We would watch the LED blinks, write down their values backwards (the Least Significant Byte would blink first with the code we wrote), and then use Python to convert what we saw back to a numeric hexadecimal or decimal value we could use to debug the problem.
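We did that conversion in Python; the sketch below shows the same idea in C instead, purely for illustration. It assumes the blinks were jotted down in the order seen as a string of '1' and '0' characters (anything else, such as a separator mark or a space, is skipped), that the first bit seen is the least significant one, and that the value fits in 16 bits. The function name decode_blinks is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Rebuild a value from hand-written blink notes.
 * Returns false if there were no bits at all or more than 16 of them. */
bool decode_blinks(const char *observed, uint16_t *value_out)
{
    assert(observed != NULL && value_out != NULL);

    uint16_t value = 0;
    unsigned bit = 0;                    /* position of the next bit seen */

    for (const char *p = observed; *p != '\0'; p++) {
        if (*p != '0' && *p != '1') {
            continue;                    /* separators carry no data */
        }
        if (bit >= 16u) {
            return false;                /* more bits than fit */
        }
        if (*p == '1') {
            value |= (uint16_t)(1u << bit);
        }
        bit++;
    }

    if (bit == 0u) {
        return false;                    /* nothing was written down */
    }
    *value_out = value;
    return true;
}
```

Because the first blink lands in bit 0, there is no need to reverse the notes by hand first; the loop does the “backwards” part for you.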

We used this technique about four or five times before tunneling down to locate the root problem. The root problem was incorrect “wait” values set for two peripherals down in Tally_bsp.h. We hadn’t changed those values since we modified our board routing (adding components) after revising the product from its earlier PocketDrops versions. The initialization routines weren’t waiting long enough for the crystals in the oscillators to stabilize. This explained why the outcome was different when we inserted breakpoints. The breakpoints gave enough time for the oscillators to stabilize and the components to successfully initialize!
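For anyone who hasn’t hit this class of bug, here is a small desktop simulation of the failure mode, not our actual fix. A hypothetical wait_for_oscillator() polls a “still faulting” hook up to a fixed number of times, and a fake crystal needs a certain number of polls to settle. All the names and numbers are made up for illustration; on real hardware the hook would read whatever oscillator-fault status the chip exposes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook: returns true while the oscillator is not yet stable. */
typedef bool (*osc_fault_fn)(void *ctx);

/* Poll until the fault clears or the budget runs out.
 * Returns true if the oscillator settled within max_polls. */
bool wait_for_oscillator(osc_fault_fn fault_active, void *ctx,
                         uint32_t max_polls, uint32_t *polls_used)
{
    assert(fault_active != NULL);
    for (uint32_t i = 0; i < max_polls; i++) {
        if (!fault_active(ctx)) {
            if (polls_used != NULL) {
                *polls_used = i;         /* polls spent waiting */
            }
            return true;
        }
    }
    if (polls_used != NULL) {
        *polls_used = max_polls;
    }
    return false;                        /* gave up too early */
}

/* Desktop stand-in for a crystal that needs some time to stabilize. */
typedef struct {
    uint32_t remaining;                  /* polls left until stable */
} fake_crystal_t;

bool fake_crystal_fault(void *ctx)
{
    fake_crystal_t *crystal = ctx;
    if (crystal->remaining == 0u) {
        return false;
    }
    crystal->remaining--;
    return true;
}
```

With a crystal that needs 100 polls, a budget of 50 fails and a budget of 500 succeeds. A breakpoint in the middle of init has the same effect as a bigger budget: the crystal keeps settling while the CPU sits halted, so everything “magically” works under the debugger.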

That’s it. Pretty lame on the whole, but kinda neat at the same time.

Conclusion:

Being forced to use tricks like this is what I came to love about reverse engineering and software exploitation. It is this kind of exhilarating low-level tedium that only software exploitation can offer. Any exploiter worth his salt gets a sick little thrill out of simple little stuff like this: whether it be stepping through a busted copy loop to observe the *exact* moment of a critical overwrite in your exploit; working with only 4 break-on-access hardware breakpoints as you debug your ROP payload; or stepping through a tricky part of your shellcode. These are the great quiet moments that made me fall in love with software exploitation…and it’s kinda neat to now see the same kinda things as I am learning more about hardware and firmware design…so I just thought I’d share.

That’s it. Here are some photos of random stuff from working on Tally.

Tally Android app connected to Tally device

Prototyping is the same for hardware as it is software: shit inexplicably everywhere

Tally Android app connected to Tally device

Bare Tally Board

A late night spent reversing in Eagle CAD software with Chris 2Dec2012

Tally bare Boards

Tally Boards being tested

3D printing test cases for our Tally boards.

Another Tally closeup

The LED blinking crap…

int2bin

the problem code