What Remains is a narrative adventure game for the 8-bit NES video game console. It was released in March 2019 as a free ROM, playable in an emulator. It was created by a small team, Iodine Dynamics, over the course of two years of on-and-off development. It’s currently in the hardware phase, as a limited batch of cartridges is being created entirely from recycled parts.

The game plays out over 6 stages, wherein the player walks around multiple scenes with 4-way scrolling maps, speaking to NPCs, collecting clues, learning about their world, playing mini-games, and solving simple puzzles. As the primary engineer on this project, I faced a lot of challenges in bringing the team’s vision to reality. Given the significant constraints of the NES hardware, making any game is difficult enough, let alone one with as much content as What Remains. Only by creating useful subsystems to hide and manage this complexity were we able to work as a team to complete the game.

Herein is a technical breakdown of some of the pieces that make up our game’s engine, in the hopes that others find it useful or at least interesting to read about.

The NES Hardware

Before diving into the code, some specs on what we’re working with. The NES is a game console released in 1983 in Japan and 1985 in America. It contains an 8-bit 6502 CPU [+] that runs at 1.79 MHz. Since the console runs at 60 frames per second, that leaves only about 30,000 CPU cycles per frame to calculate everything that needs to happen during the main gameplay loop.
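As a quick check of that cycle budget, the arithmetic can be worked out directly. This is only a back-of-the-envelope sketch: it uses the rounded 1.79 MHz figure from the text and assumes an even 60 frames per second, and the variable names are purely illustrative.

```python
# Rough per-frame CPU budget, using the rounded figures from the text.
CPU_HZ = 1_790_000        # 1.79 MHz, rounded
FRAMES_PER_SECOND = 60    # assumed to be an even 60

# Integer division: whole cycles available before the next frame starts.
cycles_per_frame = CPU_HZ // FRAMES_PER_SECOND
print(cycles_per_frame)
```

Considering that a single 6502 instruction takes between 2 and 7 cycles, that budget covers at most a few thousand instructions per frame.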

In addition, the console has only 2048 bytes of RAM (which can be expanded to 10240 bytes using work RAM, though we did not do so). It can also address only 32K of ROM at once, which can be expanded using bank switching (What Remains uses 512K of ROM). Bank switching is a complex topic[+] that is not really dealt with at all in modern programming. To quickly summarize, the address space available to the CPU is smaller than the data contained in ROM, meaning entire blocks of memory are inaccessible until they are manually swapped in. That function you were expecting to call? It’s not really there until its bank gets swapped in by invoking a bank switch command. If you don’t swap it in first, calling such a function will crash your program.
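To make the idea concrete, here is a toy model of bank switching in Python. It is a sketch only, not how our engine or any particular mapper works: the 16K bank size, the `BankedRom` class, and its method names are all assumptions made for illustration.

```python
BANK_SIZE = 16 * 1024   # hypothetical bank size; real mappers vary
BANK_COUNT = 32         # 32 banks * 16K = 512K of ROM

class BankedRom:
    """Toy model: a large ROM seen through one small switchable window."""

    def __init__(self, rom: bytes):
        assert len(rom) % BANK_SIZE == 0
        self.rom = rom
        self.current_bank = 0

    def switch_bank(self, bank: int) -> None:
        # The "bank switch command": choose which block is visible.
        if not 0 <= bank < len(self.rom) // BANK_SIZE:
            raise ValueError("no such bank")
        self.current_bank = bank

    def read(self, offset: int) -> int:
        # Only the currently selected bank can be reached through the window.
        if not 0 <= offset < BANK_SIZE:
            raise IndexError("offset outside the window")
        return self.rom[self.current_bank * BANK_SIZE + offset]

# Fill every bank with its own bank number so reads are easy to tell apart.
rom = b"".join(bytes([bank]) * BANK_SIZE for bank in range(BANK_COUNT))
cart = BankedRom(rom)

before = cart.read(0)    # sees bank 0
cart.switch_bank(5)
after = cart.read(0)     # same offset, different data
```

The same offset yields different bytes depending on which bank is selected, which is exactly why calling into code whose bank is not swapped in goes wrong.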

All considered, the hardest part of making an NES game really is dealing with all of this at once. Optimizing one aspect of the code, such as memory usage, can often impact something else, such as CPU performance. Code has to be efficient yet also maintainable. Typically, NES games are programmed mostly in assembly language.

Co2

However, this was not the case for us. Rather, a custom language was developed in tandem with the game. Co2 is a Lisp-like language, built on Racket Scheme, which compiles into 6502 assembly. This language was originally started by Dave Griffiths to build the What Remains demo, and I decided to stick with it for the full project.

Co2 makes it possible to write inline assembly when needed, but also has higher-level facilities that make certain tasks easier. It implements local variables that are efficient both in terms of RAM consumption and access speed[+]. It has a very simple macro system that helps in writing code that is both readable and efficient[+]. Most importantly, it makes representing data directly in the source a lot easier, thanks to Lisp’s homoiconicity.

Though creating custom tools is pretty common in game development, making an entire programming language is a bit rarer. Regardless, we did so. It’s unclear whether the difficulty of developing and maintaining Co2 was worth the trouble, but there were definitely some benefits that helped us out. This post won’t be going into too much detail on how Co2 works (that should be its own future article), but the language will be mentioned here and there, as its usage is pretty deeply entwined with how development progressed.

Here’s an example of some Co2 code, which draws the background for a newly loaded scene before fading it in:

    ; Render the nametable for the scene at the camera position
    (defsub (create-initial-world)
      (camera-assign-cursor)
      (set! camera-cursor (+ camera-cursor 60))
      (let ((preserve-camera-v))
        (set! preserve-camera-v camera-v)
        (set! camera-v 0)
        (loop i 0 60
              (set! delta-v #xff)
              (update-world-graphics)
              (when render-nt-span-has
                (set! render-nt-span-has #f)
                (apply-render-nt-span-buffer))
              (when render-attr-span-has
                (set! render-attr-span-has #f)
                (apply-render-attr-span-buffer)))
        (set! camera-v preserve-camera-v))
      (camera-assign-cursor))

Entity system

Any real-time game more complicated than Tetris has at its core an “entity system”. This is the functionality that allows for multiple independent actors which act simultaneously and are responsible for their own state. Though What Remains is by no means an action game, it still has screens full of independent actors with complex behavior: animating and drawing themselves, checking for collisions, and triggering dialog.

The implementation is pretty typical: a large array contains the list of entities in the scene, and each entry contains that entity’s relevant data along with a type tag. The update function in the main gameplay loop iterates over each entity and dispatches on the type, performing whatever behavior it needs.

    ; Called once per frame, to update each entity
    (defsub (update-entities)
      (when (not entity-npc-num)
        (return))
      (loop k 0 entity-npc-num
            (let ((type))
              (set! type (peek entity-npc-data (+ k entity-field-type)))
              (when (not (eq? type #xff))
                (update-single-entity k type)))))
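For readers less used to Lisp, the same iterate-and-dispatch pattern might be sketched in Python as follows. The `0xFF` "empty slot" tag mirrors the Co2 code above; the type tags, the behaviors, and the dictionary-based entity records are hypothetical stand-ins, not the game's actual data layout.

```python
EMPTY = 0xFF  # tag for an unused slot, as in the Co2 code above

# Hypothetical type tags, for illustration only.
TYPE_NPC = 0
TYPE_SIGN = 1

def update_npc(entity, log):
    # Advance the animation, wrapping at the entity's frame limit.
    entity["frame"] = (entity["frame"] + 1) % entity["anim_limit"]
    log.append("npc")

def update_sign(entity, log):
    # Signs have no per-frame behavior beyond being visited.
    log.append("sign")

# Dispatch table: type tag -> behavior.
DISPATCH = {TYPE_NPC: update_npc, TYPE_SIGN: update_sign}

def update_entities(entities, log):
    """Called once per frame: visit every slot and dispatch on its type tag."""
    for entity in entities:
        if entity["type"] == EMPTY:
            continue  # skip unused slots
        DISPATCH[entity["type"]](entity, log)

entities = [
    {"type": TYPE_NPC, "frame": 5, "anim_limit": 6},
    {"type": EMPTY},
    {"type": TYPE_SIGN},
]
log = []
update_entities(entities, log)
```

After one call, the NPC's frame counter has wrapped from 5 back to 0, the empty slot was skipped, and the sign's handler ran.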

Storing entity data, though, is more interesting. In total, the game has so many unique entities that the ROM space they use up could become a problem. Co2 really shines here, letting us represent each entity in a scene in a concise yet readable form, as a stream of key-value pairs. Aside from things like starting position, nearly every key is optional, letting entities declare only what they need.

    (bytes npc-diner-a 172 108
           prop-palette 1
           prop-hflip
           prop-picture picture-smoker-c
           prop-animation simple-cycle-animation
           prop-anim-limit 6
           prop-head hair-flip-head-tile 2
           prop-dont-turn-around
           prop-dialog-a (2 progress-stage-4 on-my-third my-dietician)
           prop-dialog-a (2 progress-stage-3 have-you-tried-the-pasta the-real-deal)
           prop-dialog-a (2 progress-diner-is-clean omg-this-cherry-pie its-like-a-party)
           prop-dialog-a (2 progress-stage-1 cant-taste-food puff-poof)
           prop-dialog-b (1 progress-stage-4 tea-party-is-not)
           prop-dialog-b (1 progress-stage-3 newspaper-owned-by-dnycorp)
           prop-dialog-b (1 progress-stage-2 they-paid-a-pr-guy)
           prop-dialog-b (1 progress-stage-1 it-seems-difficult)
           prop-customize (progress-stage-2 stop-smoking)
           0)

Here prop-palette specifies the color palette used for the entity, prop-anim-limit sets the number of animation frames, and prop-dont-turn-around prevents the NPC from turning around if the player tries to speak to them from the opposite side. There are also a couple of conditional flags set, which change how the entity acts as the player progresses through the game.

Now then, this kind of representation is very compact for storage in ROM, but it’s slow to access at runtime, far too slow for gameplay. So, when a player enters a new scene, all the entities for that scene are loaded into RAM, handling any conditions that might affect their initial state. But not every detail of every entity can be loaded, as that would use more RAM than is available. Instead, the engine loads exactly what is needed for each entity, plus a pointer to its full structure in ROM, which is then dereferenced for situations such as handling dialog. This specific set of trade-offs gives just the right level of performance.
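A minimal sketch of that trade-off, in Python rather than Co2: a loader walks the key-value stream once, copies only the frequently used fields into a small RAM record, and keeps a pointer back to ROM for the rest. The property codes, their argument counts, and the record layout are all invented for this example and do not match the real engine.

```python
# Hypothetical property codes and how many argument bytes follow each one.
PROP_END, PROP_PALETTE, PROP_HFLIP, PROP_DIALOG = 0, 1, 2, 3
ARG_COUNT = {PROP_PALETTE: 1, PROP_HFLIP: 0, PROP_DIALOG: 2}

def iter_props(rom, addr):
    """Yield (key, args) pairs from the stream at addr until the 0 terminator."""
    while rom[addr] != PROP_END:
        key = rom[addr]
        count = ARG_COUNT[key]
        yield key, list(rom[addr + 1 : addr + 1 + count])
        addr += 1 + count

def load_entity(rom, addr):
    """Copy only the hot fields into a RAM record; keep a pointer for the rest."""
    record = {"x": rom[addr], "y": rom[addr + 1],
              "palette": 0, "hflip": False, "rom_ptr": addr}
    for key, args in iter_props(rom, addr + 2):
        if key == PROP_PALETTE:
            record["palette"] = args[0]
        elif key == PROP_HFLIP:
            record["hflip"] = True
        # Dialog is deliberately not copied into RAM.
    return record

def find_dialog(rom, record):
    """Slow path: follow the ROM pointer and re-scan the stream when needed."""
    for key, args in iter_props(rom, record["rom_ptr"] + 2):
        if key == PROP_DIALOG:
            return args
    return None

# x=172, y=108, palette 1, hflip, dialog (7, 9), terminator.
rom = bytes([172, 108, PROP_PALETTE, 1, PROP_HFLIP, PROP_DIALOG, 7, 9, PROP_END])
npc = load_entity(rom, 0)
dialog = find_dialog(rom, npc)
```

The per-frame code only ever touches the small record, while rare events such as starting a conversation pay the cost of re-reading ROM.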

Portals

What Remains has a lot of different locations, some outdoor scenes with scrolling maps, and many interior scenes that stay static. Moving from one to another involves detecting when the player reaches an exit, loading the new scene, then placing the player at the correct spawn point. When development was early, these transitions were described uniquely by the two scenes that were connected, for example “first-city” and “diner”, with some data in an if statement for where the doors were located in each scene. Figuring out where to place the player after they changed scenes simply involved checking where they were going and where they were coming from, and placing them outside the appropriate exit.

However, this approach fell apart once we started filling out the “second-city” scene, which connects back to the first-city in two different places. Suddenly, the pair of (origin, destination) didn’t work anymore. After thinking about it a bit, it became clear that what really mattered was the connection itself, which internally the game code calls a “portal”. The engine was rewritten to account for this change, which led to a situation similar to entities. Portals could be stored as lists of key-value pairs, and loaded at the start of a scene. Entering a portal could use the same position information as exiting it. It was also easy to add conditions, similar to what entities had, that could modify portals at certain parts of the game, like when doors open or close.

    ; City A
    (bytes city-a-scene #x50 #x68 look-up
           portal-customize (progress-stage-5 remove-self)
           ; to Diner
           diner-scene #xc0 #xa0 look-down
           portal-width #x20
           0)

This also made it easy to add “teleportation points”, which were often employed after cinematic scenes when the player would be moved to a different scene depending upon what was happening in the story.

Here’s a teleportation for the start of stage 3:

    ; Jenny's home
    (bytes jenny-home-scene #x60 #xc0 look-up
           portal-teleport-only jenny-back-at-home-teleport
           0)

Note the value look-up: that is the direction in which the player would “enter” this portal. Leaving a portal leaves the player facing the opposite direction; in this case, Jenny (our main character) ends up at her home facing down.
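The portal idea can be sketched in a few lines of Python. This is an illustration under assumptions, not the engine's code: it treats a portal as a pair of ends, each holding a scene, a position, and the direction used to enter it, which is my simplified reading of the data above. The `traverse` function and the tuple layout are hypothetical.

```python
OPPOSITE = {"up": "down", "down": "up", "left": "right", "right": "left"}

# Each portal has two ends: (scene, x, y, direction the player walks to enter).
# Positions are taken from the City A / Diner example above.
CITY_TO_DINER = (
    ("city-a", 0x50, 0x68, "up"),
    ("diner", 0xC0, 0xA0, "down"),
)

def traverse(portal, from_scene):
    """Return (scene, x, y, facing) after entering `portal` from `from_scene`."""
    end_a, end_b = portal
    if from_scene == end_a[0]:
        dest = end_b
    elif from_scene == end_b[0]:
        dest = end_a
    else:
        raise ValueError("portal does not touch this scene")
    scene, x, y, enter_dir = dest
    # Leaving a portal faces the player opposite to its "enter" direction.
    return scene, x, y, OPPOSITE[enter_dir]

into_diner = traverse(CITY_TO_DINER, "city-a")
back_outside = traverse(CITY_TO_DINER, "diner")
```

Because the spawn point comes from the portal rather than from an (origin, destination) pair, two scenes can be connected in any number of places without ambiguity.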

Text box

Drawing the text box turned out to be one of the hardest pieces of code in the entire project. The graphical constraints of the NES end up making it pretty tricky. To begin with, the NES only has a single layer for graphic data, so the background map needs to be erased to make room for the text box, and it also has to be restored as the text box closes.

Next, the palette for every single scene needs to contain white and black in order to draw the text, which put additional constraints on our artist. The text box also needs to be aligned to a 16×16 grid, in order to avoid color clashes with the rest of the background[+]. Drawing the text box in an interior scene, as shown above, is much simpler than outside, where the camera moves, since it then needs to take into account wrapping around graphics buffers, both horizontally and vertically. Finally, the pause message display is slightly modified from a standard dialog box, since it displays different information, but it uses much of the same code.
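The grid alignment and the wrap-around can be illustrated with a little arithmetic. This is a simplified sketch, not the engine's actual calculation: it assumes a single 240-pixel-tall graphics buffer, and the function names and parameters are made up for the example.

```python
BUFFER_HEIGHT = 240  # assumed height in pixels of one NES nametable

def snap16(pixel):
    """Round down to the 16-pixel grid on which palette attributes are assigned."""
    return pixel & ~0x0F

def text_box_top(camera_y, screen_offset_y, buffer_height=BUFFER_HEIGHT):
    """Buffer row where the box starts: aligned first, then wrapped."""
    return snap16(camera_y + screen_offset_y) % buffer_height

aligned = snap16(37)               # 37 rounds down to 32
wrapped = text_box_top(200, 50)    # 250 snaps to 240, which wraps to 0
```

Since 240 is itself a multiple of 16, wrapping an aligned value never breaks the alignment, but a box that starts near the bottom of the buffer still has to be drawn in two pieces.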

After numerous buggy rewrites, I eventually settled on an approach that splits the work into two steps. First, all the calculations are performed to figure out where and how to draw the text box, including code for handling all of the various edge cases, thereby isolating all this complexity into one location.

Next, the text box is drawn one row at a time, in a stateful manner that utilizes the calculations from the first step in order to keep the code relatively simple.

    ; Called once per frame as the text box is being rendered
    (defsub (text-box-update)
      (when (or (eq? tb-text-mode 0) (eq? tb-text-mode #xff))
        (return #f))
      (cond
        [(in-range tb-text-mode 1 4)
         (if (not is-paused)
             ; Draw text box for dialog.
             (text-box-draw-opening (- tb-text-mode 1))
             ; Draw text box for pause.
             (text-box-draw-pausing (- tb-text-mode 1)))
         (inc tb-text-mode)]
        [(eq? tb-text-mode 4)
         ; Remove sprites in the way.
         (remove-sprites-in-the-way)
         (inc tb-text-mode)]
        [(eq? tb-text-mode 5)
         (if (not is-paused)
             ; Display dialog text.
             (when (not (crawl-text-update))
               (inc tb-text-mode)
               (inc tb-text-mode))
             ; Display paused text.
             (do (create-pause-message)
                 (inc tb-text-mode)))]
        [(eq? tb-text-mode 6)
         ; This state is only used when paused. Nothing happens, and the caller
         ; has to invoke `text-box-try-exiting-pause` to continue.
         #t]
        [(and (>= tb-text-mode 7) (< tb-text-mode 10))
         ; Erase text box.
         (if (is-scene-outside scene-id)
             (text-box-draw-closing (- tb-text-mode 7))
             (text-box-draw-restoring (- tb-text-mode 7)))
         (inc tb-text-mode)]
        [(eq? tb-text-mode 10)
         ; Reset state to return to game.
         (set! text-displaying #f)
         (set! tb-text-mode 0)])
      (return #t))

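Once you get used to the Lisp-isms, this code reads fairly clearly: tb-text-mode is a state counter that advances one step per frame, and each branch of the cond handles one phase of opening, filling, or closing the text box.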

Sprite Z-Layers

Finally, here's a little detail that doesn't affect the gameplay too much, but is a nice piece of flair that I'm personally proud of. The NES has only two graphical components: the "nametable" which is used for static, grid-aligned background, and "sprites" which are small 8x8 pixel objects that can be arbitrarily positioned. Things such as player characters and NPCs are usually created as sprites placed wherever they need to be on top of the nametable graphics.

However, the NES hardware also provides the ability to set a bit on sprites to place them entirely behind the nametable. This can give a cool 3-d effect without too much work.

The way this works is that the palette used for the current scene treats the color in position 0 specially: it is the global background color. The nametable is drawn on top of this, and sprites with this z-layer bit are drawn sandwiched between the other two.

Here's the palette for this scene:

So that dark grey all the way to the left is used as the global background color.

The layer effect works like this:

For most other games, the story ends there. However, What Remains goes one step further. Instead of placing Jenny entirely in front of or behind the nametable graphics, her character is split as needed between them. You see, the sprites are 8x8 units, and the full character graphic is composed of multiple sprites, between 3 and 6 depending on the animation frame. Each sprite can set a z-layer for itself, meaning some are in front of the nametable, and some are behind.

Here's an example of this effect in action:

The algorithm to achieve this is a bit tricky. First the collision data surrounding the player is inspected. Specifically, the tiles which the full character drawing may cover. In this diagram, red squares are solid tiles, while yellow tiles set the z-layer bit.

These are combined using various heuristics in order to create a "pivot point" and a bit mask of four bits. The four quadrants relative to the pivot correspond to the four bits: 0 means the player should be in front of the nametable, while 1 means behind.

As the individual sprites are placed to draw the player, the position is compared to the pivot to determine the z-layer for that specific sprite. Some end up in the front layer, some in the back.
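The final per-sprite step might look something like the following Python sketch. The heuristics that produce the pivot and mask are not shown, and the assignment of bits to quadrants here is an assumption chosen for illustration. The one hardware fact relied on is that bit 5 of an NES sprite's attribute byte selects whether it is drawn behind the background.

```python
PRIORITY_BEHIND = 0x20  # bit 5 of the NES sprite attribute byte

def quadrant_bit(sprite_x, sprite_y, pivot_x, pivot_y):
    """Bit index 0..3 (assumed order): 0 = upper-left, 1 = upper-right,
    2 = lower-left, 3 = lower-right, relative to the pivot."""
    right = 1 if sprite_x >= pivot_x else 0
    below = 2 if sprite_y >= pivot_y else 0
    return right | below

def is_behind(sprite_x, sprite_y, pivot, mask):
    """True if this sprite's quadrant has its bit set in the 4-bit mask."""
    bit = quadrant_bit(sprite_x, sprite_y, pivot[0], pivot[1])
    return (mask >> bit) & 1 == 1

def sprite_attributes(base_attr, behind):
    """Set or clear the priority bit while leaving the other bits untouched."""
    if behind:
        return base_attr | PRIORITY_BEHIND
    return base_attr & ~PRIORITY_BEHIND & 0xFF

# Example: everything below the pivot goes behind the nametable.
pivot = (8, 16)
mask = 0b1100
head_behind = is_behind(0, 0, pivot, mask)     # upper-left quadrant
feet_behind = is_behind(8, 16, pivot, mask)    # lower-right quadrant
```

With this mask the character's head stays in front while the feet are hidden, as if standing behind a low wall.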

Conclusion

So there's a sampling of what's going on under the hood of our new modern-retro game. There's certainly more to the codebase, but what's been shown off here is a significant portion of what makes everything tick.

The biggest lesson I got from this project was the benefits to be gained from data-driven engines. Multiple times, I replaced some custom logic with a table and a mini-interpreter, and it made the code easier to manage and understand.

That's all for now, hope you've enjoyed!

Footnotes

+ technically the NES has a Ricoh 2A03, a variant of the 6502

+ in fact, this project has convinced me that bank switching / ROM management is the dominant design constraint of every NES project above a certain size

+ this is thanks to a "compiled stack", a concept that's used in embedded programming, though I had a hard time finding much literature about it. In short: build the entire call graph of your project, sort it from leaf nodes to roots, and assign to each node memory equal to its own needs plus the max of its children's

+ macros were added pretty late in development, and truthfully the feature isn't taken advantage of that much

+ for more information on NES graphics, see my series from a while back. Color clashes are caused by attributes, covered in Part 1
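To make the "compiled stack" footnote more concrete, here is a small Python sketch of the allocation rule it describes. This is not the Co2 compiler's implementation: the call graph, the function names, and the byte counts are invented, and the sketch assumes there is no recursion.

```python
from functools import lru_cache

# Hypothetical call graph: function -> (bytes of locals, functions it calls).
CALL_GRAPH = {
    "main":    (2, ["update", "render"]),
    "update":  (3, ["collide"]),
    "render":  (4, []),
    "collide": (1, []),
}

def frame_base(name):
    """Start of this function's locals: just past everything it can call."""
    _, callees = CALL_GRAPH[name]
    return max((frame_top(callee) for callee in callees), default=0)

@lru_cache(maxsize=None)
def frame_top(name):
    """End of this function's locals: its own needs + the max of its children."""
    size, _ = CALL_GRAPH[name]
    return frame_base(name) + size

total_ram = frame_top("main")
naive_ram = sum(size for size, _ in CALL_GRAPH.values())
```

Here `update` and `render` can never be active at the same time, so their locals share the same bytes, and the whole program needs 6 bytes of locals instead of the 10 a naive one-slot-per-variable layout would use. Every address is fixed at compile time, so no stack pointer arithmetic is needed at runtime.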