A computer must perform a series of initialization tasks when turned on before it becomes ready for use. Once these tasks, generally referred to as booting, are complete, control passes over to the main system. On many systems, especially on specialized systems such as video game consoles, there is a fixed set of initialization routines for the boot process stored in read-only memory inside the system. This is usually referred to as the boot ROM.

When emulating a system, there are two ways to emulate the boot process. One approach is to initialize the emulated state to reflect the state of an already booted system. But from an accuracy-focused, low-level emulation perspective, starting the emulation from a clean slate and running the boot ROM directly is often more desirable.

There are advantages and disadvantages to both approaches, but the most notable disadvantage to the boot ROM approach is that it necessitates having a copy of the boot ROM itself. Since the boot ROM is actual code and not a system design, it is potentially copyrightable and leads to concerns with distribution. Thus many emulators that use a boot ROM require users to obtain a copy separately from the emulator.

Further, many systems contain protections to prevent the boot ROM from being directly accessed or dumped. All Nintendo handhelds have such protections, which make it difficult to dump their boot ROMs. Yet due to the complexity of these boot ROMs, many emulators actually require them to be provided to run at all.

On some systems, accessing the boot ROM from software is simply not possible without hardware modifications. Such is the case for the Game Boy, which completely removes all traces of the boot ROM from memory before handing any control over to the game. More complex boot ROMs may consist of multiple stages, with the earlier stages being progressively more difficult to dump.

The downside of more complex boot ROMs is that they often contain security vulnerabilities that can allow code execution early in the boot process. The 3DS’s “sighax” and the Wii’s “trucha” bug are two notable examples. Early code execution can be an easy way to dump the entirety of a boot ROM. While it may be possible to mitigate some of these bugs in software, the issues can only be properly fixed via a hardware revision. Depending on when in a system’s lifecycle these issues are publicized, a hardware revision might not occur.

The Game Boy Advance sits in a middle ground between primitive boot ROMs such as the Game Boy’s, which mostly just exist to bring up a logo and run basic checks on the cartridge, and more advanced systems like the 3DS, which contain a full operating system. Before the DSi, no Nintendo handhelds contained operating systems. However, the GBA and DS contain code in ROM that acts as utility functionality for the game software in addition to the boot code. This utility code is what leads to the ROM being referred to as the BIOS instead of the boot ROM. To avoid having to reimplement this functionality, many GBA and DS emulators require copies of the BIOS. It also means that at least some of the BIOS must remain present in memory at all times, so the simple trick of merely removing the boot ROM from memory isn’t possible.

The GBA BIOS is very small, only 16 KiB, so it only contains a small amount of code. A good chunk of it is data, such as the boot logo and sound effect. It also contains routines for copying memory quickly, doing decompression, some basic math operations, some sound functionality that is used by about three games total, and a handful of low-level hardware interactions. Due to the design of the ARM CPU that the GBA uses, it also includes the interrupt vector table.

The interrupt vector table is used by the CPU to know what code to run under specific circumstances: where the boot code is, what happens when an interrupt request occurs, what happens when a trap, system call or other software interrupt occurs, and a handful of other things. In practice, only hardware interrupts (IRQs) and software interrupts (SWIs) occur on the GBA after initial boot or reset. The ARM architecture defines two specific locations in the memory address space where the interrupt vector table can reside, and as such the entirety of the BIOS is located contiguously at the very base of the address space.
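As a quick reference, the vector table on classic ARM cores is a run of fixed 4-byte slots counted from the table base. The sketch below lists the standard offsets; the dictionary and helper names are my own, made up for illustration.

```python
# Standard exception vector offsets on classic ARM cores, relative to the
# base of the vector table (address 0 on the GBA, where the BIOS lives).
ARM_VECTOR_OFFSETS = {
    "reset": 0x00,                  # where execution starts on boot or reset
    "undefined_instruction": 0x04,
    "software_interrupt": 0x08,     # SWI: how games call BIOS functions
    "prefetch_abort": 0x0C,
    "data_abort": 0x10,
    "reserved": 0x14,
    "irq": 0x18,                    # hardware interrupt request
    "fiq": 0x1C,
}


def vector_address(name, table_base=0x00000000):
    """Return the address the CPU jumps to for the named exception."""
    return table_base + ARM_VECTOR_OFFSETS[name]
```

Of these, only the `software_interrupt` and `irq` slots matter for the rest of this discussion.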

Nintendo’s primary approach to preventing unintended access to the BIOS is simple: if the currently executing code is in BIOS, you have full access. Otherwise, you have no access. The former is strictly necessary: the boot code and many of the math functions rely on data lookups to function at all. The latter prevents games from easily being able to dump anything. And since games call the BIOS functions by using software interrupts, they don’t need to know any of the layout of the BIOS; they simply use an identifying number for the function, and the BIOS looks up where it is while running the software interrupt handler. Moreover, all of the routines that are used for copying or accessing memory have checks to prevent them from being used to copy out BIOS memory.
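The access rule can be modelled in a few lines. This is a toy sketch, not a description of the actual bus logic: the function names are invented, and returning `None` is just a placeholder for “no access”, since what a blocked read really yields on hardware isn’t modelled here.

```python
BIOS_SIZE = 16 * 1024  # the BIOS is 16 KiB and sits at the base of the address space


def in_bios(address):
    """True if the address falls inside the BIOS region."""
    return 0 <= address < BIOS_SIZE


def read_bios_byte(bios, pc, address):
    """Toy model of the rule: the read succeeds only if the code doing the
    reading (identified by its program counter) is itself inside the BIOS."""
    if not in_bios(address):
        raise ValueError("address is outside the BIOS region")
    if in_bios(pc):
        return bios[address]   # BIOS code reading BIOS data: full access
    return None                # game code reading BIOS data: no access (placeholder)
```

Everything that follows is about getting a read to happen while the first branch of that check applies.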

Well, all of them except for one of the sound ones. Software interrupt $1F, sometimes known as MidiKey2Freq, is intended to be used for converting a musical note from a MIDI key to an actual frequency. This is useful for transforming song data into a format that is directly applicable to audio mixing code or hardware. However, the first input to this function is a memory location that contains song data, and not a raw value. Nintendo neglected to add protections to this function, and GBA homebrew developer Dark Fader noticed. This approach could only dump one byte at a time and proved to be very slow, but this single oversight in the BIOS code allowed the entire BIOS to be dumped. This was the canonical approach for dumping the GBA BIOS for many years. After all, exploiting the bug was extremely simple.
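To illustrate why an unchecked pointer argument is enough, here is a toy model of this class of bug. It is emphatically not the real routine: `toy_key_to_freq` is a hypothetical stand-in that reads a single MIDI key byte through a pointer and converts it with the usual equal-temperament formula (key 69 is 440 Hz), whereas the real BIOS function has a different signature and different arithmetic. The point is only that a result computed from a byte the caller cannot read directly can be inverted to recover that byte.

```python
import math


def toy_key_to_freq(memory, key_ptr):
    """Hypothetical service routine: reads a MIDI key through a pointer,
    with no range check on the pointer, and returns its pitch in Hz."""
    key = memory[key_ptr]
    return 440.0 * 2.0 ** ((key - 69) / 12)


def leak_byte(memory, address):
    """Recover the byte at `address` by inverting the returned frequency."""
    freq = toy_key_to_freq(memory, address)
    return round(69 + 12 * math.log2(freq / 440.0))


def dump(memory, start, length):
    """Dump a region one byte per call, which is why this style is slow."""
    return bytes(leak_byte(memory, start + i) for i in range(length))
```

Calling `dump(protected, 0, len(protected))` in this model returns the protected bytes without ever reading them directly.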

While this was sufficient to dump the BIOS, it was thought to be the only way to dump the BIOS. All of the other functionality had been searched for vulnerabilities and no others were found. Fast forward to 2016 when I got to thinking about newer techniques in software exploitation, particularly one called return-oriented programming, or ROP for short. In brief, modern CPUs don’t allow you to execute arbitrary memory anymore: just because you can write to it doesn’t mean you can run from it. Only locations of memory that are marked as “executable” by the MMU can be run. ROP bypasses this by cherry-picking the very end of various functions that manipulate the state of the system in specific ways and chaining the end of that function call with a jump to the end of another function call that does another very specific set of operations.

The creation of this chain is done by modifying the call stack directly via a memory bug of some sort (or by creating a new stack and switching to it using a “stack pivot”). The call stack is where the CPU keeps track of which function called the function you’re currently in, so it knows where to go after it’s done with the function, as well as any local variables of the caller that will need to be restored when the caller starts executing again.

With clever crafting, these ROP chains can effect arbitrary state changes, so non-executable memory is no longer an obstacle. A similar, older technique that can be used in conjunction is called a “return-to-library” or “return-to-libc” attack. Instead of jumping to the end of a function, you jump to the beginning of a function that does a set of standard operations before returning to where it left off. One example is being able to put a memory allocation in the middle of a ROP chain. At one point I got to thinking: is it possible to construct a return-to-BIOS attack on the GBA?

I previously mentioned that the way to call a BIOS function is via a software interrupt. When a software interrupt is triggered it jumps directly into the software interrupt vector in the BIOS which handles everything internally to the BIOS. However, while the function is running it is possible that the CPU will receive a hardware interrupt, leading to the CPU running user code in the middle of a BIOS function before it returns execution to the BIOS function.

This means you can easily write code to start a memory copy in BIOS code that will have an interrupt fire before it’s complete. But the stack is also read-write, which means the interrupt code can find out which function was interrupted and any saved local variables it may have, and then modify them. Trivially, a simple program can trawl the stack to look for the state of a copy it triggered, then adjust the source variable to point to the start of BIOS and read out the entire BIOS that way. Due to timing issues this can take a few tries to get right, but it is fairly reliable, and a sample implementation took less than two seconds to properly dump the whole BIOS.
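The stack modification trick can be simulated with a toy machine. Everything here is an assumption made for illustration: the `ToyConsole` class, the memory layout with RAM placed directly after the BIOS, the register names, and the interrupt firing deterministically before the first byte is copied (real hardware needs the timing to line up, as noted above). What the sketch does preserve is the shape of the flaw: the range check runs once up front, the interrupted state is saved somewhere the handler can write to, and the copy resumes with whatever the handler left there.

```python
BIOS_SIZE = 0x4000     # 16 KiB of BIOS at the bottom of the address space
RAM_BASE = BIOS_SIZE   # toy layout: RAM follows the BIOS directly (assumption)


class ToyConsole:
    def __init__(self, bios):
        self.memory = bytearray(bios) + bytearray(2 * BIOS_SIZE)
        self.stack = []            # saved register frames, writable by user code
        self.irq_handler = None    # user-supplied interrupt handler
        self.irq_pending = False

    def bios_copy(self, src, dst, length):
        """Toy BIOS copy routine: range-checked once, then a plain loop."""
        if src < BIOS_SIZE or dst < BIOS_SIZE:
            return False           # refuse to touch BIOS memory
        regs = {"src": src, "dst": dst, "remaining": length}
        while regs["remaining"] > 0:
            if self.irq_pending and self.irq_handler is not None:
                self.irq_pending = False
                self.stack.append(dict(regs))   # save state on interrupt entry
                self.irq_handler(self.stack)    # user code runs mid-copy
                regs = self.stack.pop()         # restore (possibly edited) state
            self.memory[regs["dst"]] = self.memory[regs["src"]]
            regs["src"] += 1
            regs["dst"] += 1
            regs["remaining"] -= 1
        return True


def dump_bios(console):
    out = RAM_BASE + BIOS_SIZE               # destination buffer in RAM

    def handler(stack):
        stack[-1]["src"] = 0                 # repoint the saved source at the BIOS

    console.irq_handler = handler
    console.irq_pending = True
    console.bios_copy(RAM_BASE, out, BIOS_SIZE)  # a perfectly legal request
    return bytes(console.memory[out:out + BIOS_SIZE])
```

In this model a direct `bios_copy(0, ...)` is refused, while `dump_bios` returns the full BIOS image because the check has already passed by the time the handler edits the saved source.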

Further, if you investigate the stack while running in the middle of an interrupt, you can dig deep enough back through the stack to find that it will return to code in the interrupt handler. And since the interrupt handler is in BIOS, this shows it’s actually possible to call code in the BIOS directly, if you know where it is. Since the interrupt handler has a separate stack, it’s pretty trivial to use this stack to find where the interrupt handler is, but that’s about it.

However, another questionable design decision on Nintendo’s part is that the interrupt stack is actually adjacent in memory to the normal stack and has no protections. You can inspect that stack to find where the function is: near the bottom of the IRQ stack is the address at which the IRQ interrupted the copy. After you’ve recovered the address of the copy, you can jump directly into it and…jump right past the bounds checking to the inner loop. Just fiddle with the CPU state a bit to set it up to be copying the BIOS instead of any other region and set the length. This removes any of the timing guesswork from the previous approach, but requires a bit more trickery due to the difficulty of distinguishing which address is which.

These are two related, completely black-box approaches to dumping the BIOS, exploiting only Nintendo’s questionable approach to memory handling. There are no software vulnerabilities involved in these approaches, and they require exactly zero prior knowledge of the layout or contents of the BIOS. Sample implementations of the stack modification approach and a hardcoded jump approach are included in my own BIOS dumping tool. And while having a prior dump of the BIOS made this process far easier, it was only a convenience, not a necessity. It shows that you don’t always need software vulnerabilities to exploit a system; hardware flaws run far deeper.