Friday, September 21, 2018





Around five years ago I wrote a Gameboy Color emulator in Go. It was a very frustrating, but rewarding experience that I’ve been dining out on in job interviews ever since.

However, as time passed, it landed on the pile of mostly-done-but-not-finished projects and was left largely abandoned. One might generously say, on hiatus. Well, until very recently that is.

That 5 year gap

You see, a few weeks ago Go 1.11 came out, and with it came the promise of experimental support for compiling Go code to WebAssembly. There’s nothing one likes more than experimental APIs so this got me thinking, what could I do to test out this new WASM target?

If only I had a decently sized project written in Go that wasn’t some trivial TODO list manager 🤔

Hello, old friend

Going back to old code is like looking at old photos of yourself. So young, so naive, questionable style. Much to my surprise though, compiling the project using the new WASM target actually worked.

As in, within 5 minutes of commenting out code related to GLFW/GL calls, there was something running in the browser. Obviously, not rendering anything to the page, but there was stuff printing to the developer tools console at least to indicate the emulator was running.

This absolutely blew my mind: here was some old code, written for a non-browser environment in a language not supported by browsers, running in the browser. It was exciting enough for me to blast out a furious torrent of commits.

The end result being gomeboycolor WASM edition (warning: ~5 MB or so download). Go on, try it out, load some ROMs on it that you’ve illegally downloaded.

Alternatively try out the demo below, click “start” to run the emulator and “stop” to stop it. The preinstalled ROM is a test suite used to test the CPU:

Reinventing the wheel

Running emulators in the browser isn’t new and I’d imagine some people have had some fun porting other emulators using emscripten. In fact someone is doing a WASM Gameboy emulator in AssemblyScript, so I’m definitely not the first.

There are some caveats and performance issues of this WASM implementation that I will explain below if you’re still interested in my long ramblings, but it mostly works.

Except in Google Chrome, that is. Oh boy there’s trouble there, we’ll get onto that. Firefox and Safari seem to perform reasonably well though. I’ve not tried Edge.



Admit it, you're just here for the screenshots

Draw an owl

No one ever talks about the journey, just the destination. But this time I’ll indulge myself a little and document a few challenges I found along the way. Maybe we can all learn something. Learn something about porting Gameboy Color emulators to WASM at least.

By the way, as a side note, if you want to know what programming your own Gameboy emulator is like, @Tomek1024 paints a pretty accurate portrait, I’m impressed he managed to do it in less than two months. Embarrassingly it took me about six. Okay maybe seven. It was 2013.

Back to WebAssembly…

The first real issue was, while my previous self had endeavored to make the code very modular, with individual packages for each hardware component (e.g. CPU, GPU, memory management unit etc) there were some bits of the code base that made a lot of assumptions about its environment. These issues only presented themselves at runtime with the WASM virtual machine throwing up.

The problems were namely anything that was doing stuff around the os package, so like opening files, querying filesystems and user info. My emulator expected to read ROM files, and save battery state to disk. The browser WASM environment is in a sandbox and won’t let you play outside it.

So the first round of refactoring was to move the ROM-loading and save-saving layers to the outer edges and use interfaces like io.Reader further in. As a professional Java developer I should have known better, even 5 years ago, but still.

So that’s lesson 1. Limit the scope of your environment to the bits of code that need to know. Abstract elsewhere.
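To make that lesson concrete, here is a minimal sketch of what pushing the environment to the edges might look like. It is not the emulator's actual code: `SaveStore`, `memoryStore`, `LoadROM` and `romFromBytes` are hypothetical names invented for this illustration. The core only ever sees an `io.Reader` and an interface, so it has no reason to touch the os package.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// SaveStore is a hypothetical abstraction over wherever battery saves live:
// files on the desktop, browser storage on the web.
type SaveStore interface {
	Load(game string) ([]byte, error)
	Save(game string, data []byte) error
}

// memoryStore is an in-memory SaveStore that works inside any sandbox.
type memoryStore map[string][]byte

func (m memoryStore) Load(game string) ([]byte, error) {
	data, ok := m[game]
	if !ok {
		return nil, errors.New("no save data for " + game)
	}
	return data, nil
}

func (m memoryStore) Save(game string, data []byte) error {
	m[game] = append([]byte(nil), data...) // copy so callers can reuse their slice
	return nil
}

// LoadROM reads a ROM image from any io.Reader, so the emulator core never
// needs to know whether the bytes came from disk or from a browser upload.
func LoadROM(r io.Reader) ([]byte, error) {
	rom, err := io.ReadAll(r)
	if err != nil {
		return nil, fmt.Errorf("reading ROM: %w", err)
	}
	if len(rom) == 0 {
		return nil, errors.New("ROM is empty")
	}
	return rom, nil
}

// romFromBytes is what a browser frontend might call once the page hands it
// a byte array; a desktop frontend would pass an *os.File to LoadROM instead.
func romFromBytes(data []byte) ([]byte, error) {
	return LoadROM(bytes.NewReader(data))
}

func main() {
	rom, err := romFromBytes([]byte{0x00, 0xC3, 0x50, 0x01})
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}
	fmt.Println("ROM bytes loaded:", len(rom))

	var store SaveStore = memoryStore{}
	_ = store.Save("demo", []byte{0xAA})
	saved, _ := store.Load("demo")
	fmt.Println("save bytes restored:", len(saved))
}
```

Only the outermost layer decides which reader and which store to plug in, which is the part that has to change between a desktop build and a WASM build.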

Graphics

That was the easy bit. But no one likes playing a Gameboy with no screen, so this is where the hacks started to creep in.

Every so often the emulator emits a 2D array (frame) of pixels. The real hardware does this at a rate of 59.7 frames per second.

Drawing an array of pixels seemed like a prime candidate for the HTML5 canvas API, constructing an ImageData object and repainting on every frame update. When trying this though, using the Go syscall/js package, the performance was abysmal, eventually causing the browser tab to freeze. It appeared as though having the emulator and UI on the same thread was causing a lot of contention.

Sounded like a job for threading to me, but WebAssembly doesn’t support threads (yet) so I needed to figure out something else.

Have faith in the workers

Which led me towards Web Workers.

Web Workers is a simple means for web content to run scripts in background threads. The worker thread can perform tasks without interfering with the user interface.

Perfect!

Initialising the emulator inside a worker and then using the postMessage API to post the frames to the user interface on every tick seemed a more promising approach.

There were still performance issues though, as I was sending an array of bytes out of WASM land into JS land on every frame call. While the postMessage API supports sending an extra parameter to ‘transfer’ ownership of large data sets…

Transferable objects are transferred from one context to another with a zero-copy operation, which results in a vast performance improvement when sending large data sets.

…it seemed tricky to do in Go code because the transferable parameter needs to be an ArrayBuffer, which doesn’t seem possible with the API that syscall/js provides (*see edit).

Instead, to work around this, the first hack was born. Within my Go code, the array of pixels gets converted to a base64 string, which is sent to a global JavaScript function that issues the postMessage call. This yielded a much smoother experience in the user interface!



The hack in diagram form.

Go code:

    func (s *html5CanvasDisplay) DrawFrame(screenData *types.Screen) {
        // i walks the flattened RGBA buffer, four bytes per pixel
        i := 0
        for y := 0; y < 144; y++ {
            for x := 0; x < 160; x++ {
                var pixel types.RGB = screenData[y][x]
                s.imageData[i] = pixel.Red
                s.imageData[i+1] = pixel.Green
                s.imageData[i+2] = pixel.Blue
                s.imageData[i+3] = 255
                i += 4
            }
        }

        // hack
        encoded := base64.StdEncoding.EncodeToString(s.imageData)

        // call global function
        js.Global().Call("sendScreenUpdate", encoded)
    }

Note the image data is a flattened array that represents the pixel grid

JavaScript in the worker:

    function sendScreenUpdate(bs64) {
        // decode base 64 back to byte array
        var bytes = base64js.toByteArray(bs64);
        var buf = new Uint8ClampedArray(bytes).buffer;
        // uses transferable on post message
        postMessage(["screen-update", buf], [buf]);
    }

Decode the base64 and then send the data to the user interface

The update canvas function in the user interface:

    worker.onmessage = function (e) {
        if (e.data[0] == "screen-update") {
            updateCanvas(e.data[1]);
        }
    }

    function updateCanvas(screenData) {
        var decodedData = new Uint8ClampedArray(screenData);
        var imageData = new ImageData(decodedData, 160, 144);
        canvasContext.putImageData(imageData, canvas.width / 4, 4);
    }

Repaint the canvas context with the new frame

The curse of the workers



Moody

Unfortunately the web worker approach presented a whole new set of challenges. It was beginning to feel I was making a rod for my own back.

Gamers usually like to play games by interacting with them via key or button presses. As Web Workers are isolated, they don’t have access to all the good stuff you get in the browser like setting up handlers for keyboard events and so on. The only way you can communicate with them is by sending them letters via the postMessage call and hoping they read them.

So the next challenge was to start posting the keyboard updates back to the emulator.



The Circle

Go code:

    var messageCB js.Callback
    messageCB = js.NewCallback(func(args []js.Value) {
        input := args[0].Get("data")
        switch input.Index(0).String() {
        case "keyup":
            // tell emulator what key has been released
            i.KeyHandler.KeyUp(input.Index(1).Int())
        case "keydown":
            // tell emulator what key has been pressed
            i.KeyHandler.KeyDown(input.Index(1).Int())
        }
    })

    // receive messages from outside
    self := js.Global().Get("self")
    self.Call("addEventListener", "message", messageCB, false)

Handling the messages received in the worker

JavaScript code in the user interface:

    let keydownhandler = function (e) {
        worker.postMessage(["keydown", e.which])
    }

    let keyuphandler = function (e) {
        worker.postMessage(["keyup", e.which])
    }

    document.addEventListener("keydown", keydownhandler);
    document.addEventListener("keyup", keyuphandler);

Sending keyboard change events to the worker

This back and forth postMessage approach was working and soon ballooned into some sort of faux protocol that:

Allows users to load ROMs into the emulator by passing the byte array to the worker

Allows users to configure the emulator settings by passing configuration data to the worker

Allows the emulator to send battery save data to the UI thread, which in turn puts it in LocalStorage. On emulator start, the save data is loaded from LocalStorage and sent to the emulator

Allows the emulator to send a diagnostic frame rate counter back to the user interface

I’ll be honest, it was a royal pain in the backside to write and is probably very fragile and prone to breakages. There would be more arrows on this diagram to express the to-me, to-you handshaking but frankly I’ve had enough of diagrams.



Imagine more arrows
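One way to make a faux protocol like this a little less fragile is to funnel every incoming message through a single dispatcher. The following is a sketch only, with the syscall/js plumbing left out so it runs anywhere. The "keydown" and "keyup" kinds match the messages shown earlier; the `Message` struct, the `Emulator` interface, the `recorder` type and the "load-rom" kind are all hypothetical names made up for this example.

```go
package main

import "fmt"

// Message mirrors the [kind, payload] arrays posted between page and worker.
type Message struct {
	Kind    string // "keydown" and "keyup" appear in the post; "load-rom" is made up
	KeyCode int
	Data    []byte
}

// Emulator lists the hypothetical hooks the worker side would expose.
type Emulator interface {
	KeyDown(code int)
	KeyUp(code int)
	LoadROM(rom []byte)
}

// Dispatch routes one decoded message to the emulator and rejects unknown
// kinds, so a typo on the JavaScript side fails loudly instead of silently.
func Dispatch(e Emulator, m Message) error {
	switch m.Kind {
	case "keydown":
		e.KeyDown(m.KeyCode)
	case "keyup":
		e.KeyUp(m.KeyCode)
	case "load-rom":
		e.LoadROM(m.Data)
	default:
		return fmt.Errorf("unknown message kind %q", m.Kind)
	}
	return nil
}

// recorder is a stand-in emulator that logs what it was asked to do.
type recorder struct{ log []string }

func (r *recorder) KeyDown(code int)   { r.log = append(r.log, fmt.Sprintf("down %d", code)) }
func (r *recorder) KeyUp(code int)     { r.log = append(r.log, fmt.Sprintf("up %d", code)) }
func (r *recorder) LoadROM(rom []byte) { r.log = append(r.log, fmt.Sprintf("rom %d bytes", len(rom))) }

func main() {
	rec := &recorder{}
	msgs := []Message{
		{Kind: "load-rom", Data: make([]byte, 32768)},
		{Kind: "keydown", KeyCode: 13},
		{Kind: "keyup", KeyCode: 13},
		{Kind: "reticulate-splines"}, // deliberately unknown
	}
	for _, m := range msgs {
		if err := Dispatch(rec, m); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Println(rec.log)
}
```

Keeping the routing in plain Go also means it can be tested without a browser, leaving only a thin layer that converts js.Value arguments into `Message` values.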

Performance

That was the implementation side of things, there was a lot of deck chair rearranging elsewhere, but most of the effort was in getting stuff in and out of the emulator.

Performance was still a problem though, the framerate was choppy and the keyboard input mechanism was unreliable, with some presses being skipped, making the games feel spongy and not fun to play.

To try and isolate whether the problem was due to WebAssembly performance, I introduced a “headless” mode that stops sending the screen data on every draw frame call. This was to try and remove the whole web worker → UI dance from the equation and just see how well the emulator can run.
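A headless mode of this kind can be sketched as a display that throws every frame away and only counts it. This is an illustration under assumptions, not the emulator's real code: `RGB`, `Screen`, `Display` and `headlessDisplay` are local stand-ins defined here so the example is self-contained.

```go
package main

import "fmt"

// RGB and Screen are local stand-ins for the emulator's own pixel types.
type RGB struct{ Red, Green, Blue byte }
type Screen [144][160]RGB

// Display is the hypothetical seam between the emulator core and a frontend.
type Display interface {
	DrawFrame(screen *Screen)
}

// headlessDisplay discards every frame and only counts it, so no pixel data
// ever has to leave the emulator.
type headlessDisplay struct{ Frames int }

func (d *headlessDisplay) DrawFrame(screen *Screen) { d.Frames++ }

// FPS converts the frame count into frames per second over a measured run.
func (d *headlessDisplay) FPS(elapsedSeconds float64) float64 {
	if elapsedSeconds <= 0 {
		return 0
	}
	return float64(d.Frames) / elapsedSeconds
}

func main() {
	display := &headlessDisplay{}
	var out Display = display // the core would only ever hold this interface

	var frame Screen
	for i := 0; i < 1800; i++ { // pretend the core produced 1800 frames
		out.DrawFrame(&frame)
	}
	fmt.Printf("%.1f fps over a 60 second run\n", display.FPS(60))
}
```

Because the core only talks to the `Display` interface, swapping the canvas-backed display for this one needs no change to the emulation loop itself.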

The following tests were performed using these browsers on OSX 10.13.6:

Chrome 69.0.3497.100

Firefox 63.0b4

Safari 12.0 (13606.2.11)

Using the game Tetris DX, running for 60 seconds, these were the results:





As you can see, Safari runs a pretty smooth shop. Firefox was more erratic but the emulator kept on pausing every so often so that would account for the drops. Chrome, while a fairly straight line, didn’t even make it past 20fps.

So WASM performance, at least on Firefox and Safari seemed pretty reasonable.

Repeating the same test with headless mode turned off, there was a definite 10 frame or so performance penalty, probably owing to the base64 conversions and passing messages bit. Oddly Firefox didn’t seem to pause at all during this test.





The problem with the unreliable keypresses was still there though, sometimes the emulator just wasn’t responding at all to anything. My hunch at the time was, as there is no threading in WASM yet, there’s probably a lot of spinning plates going on around handling postMessage callbacks. I don’t know how the browser internals work on this one, but I’d imagine they have timers and stuff going on that poll for updates.

So the next logical step was to see about slowing the emulator down by locking the frame rate to a maximum fixed value. The reasoning being, a slower emulator might give the browser more headroom to do its thing. I chose a 30fps lock for this test.





A marked improvement in stability! Plus, my hunch was right, the keyboard was much much more reliable. Chrome continued to be in the doldrums though and still wasn’t acknowledging input. Redoing the test with a frame rate lock of 25fps just about made the keyboard work for Chrome, but it made for pretty choppy visuals.
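A frame rate lock like the one used in these tests can be sketched as a per-frame time budget: work out how long a frame is allowed to take, then sleep for whatever is left over. This is a minimal illustration, not the emulator's actual limiter; `FrameBudget` and `SleepNeeded` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"time"
)

// FrameBudget returns how long a single frame may take at the given cap.
// A cap of zero or less means "no limit".
func FrameBudget(maxFPS int) time.Duration {
	if maxFPS <= 0 {
		return 0
	}
	return time.Second / time.Duration(maxFPS)
}

// SleepNeeded returns how long to idle after a frame that took `spent`,
// or zero if the frame already used up its whole budget.
func SleepNeeded(spent time.Duration, maxFPS int) time.Duration {
	if rest := FrameBudget(maxFPS) - spent; rest > 0 {
		return rest
	}
	return 0
}

func main() {
	const maxFPS = 30

	begin := time.Now()
	for frame := 0; frame < 3; frame++ {
		start := time.Now()
		// the emulator would step one frame here
		time.Sleep(SleepNeeded(time.Since(start), maxFPS))
	}

	fmt.Println("budget per frame:", FrameBudget(maxFPS))
	fmt.Println("3 frames took:", time.Since(begin).Round(time.Millisecond))
}
```

At a 30fps cap each frame gets roughly 33ms and at 25fps it gets 40ms. The idle time left in each budget is the headroom the hunch above relies on.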

Conclusion

Good god, that was a long post. Sorry.

This little experiment made for a fun ride. I somewhat suspect the use of web workers in this manner is definitely not what that feature was designed for. It’s doubtful that many people would build video games in the browser in this way. A better approach might be to use WebGL, but my mental, physical and emotional strength is not there right now to open that can of worms.

The coolest thing about this, and maybe the promise of WASM in general, is I can send my emulator to my friends without having to worry about whether they have shared libraries on their system. I don’t have to spin up a bunch of infrastructure to build versions for different operating systems. It just works, and I’m sure it’s going to get better and better as WASM develops.

On the Chrome front? I don’t know why the performance just isn’t there for this use case. The biggest surprise to me was actually Safari, I’m a born and bred Firefox boy and Safari never really made it into my esteemed browser list. Good job Apple 👍

If I were to do this again I’d probably take a second look at wasmBoy to see how they’re doing it, it doesn’t look like they are using web workers, and are using HTML5 canvas to render the output so I’ve probably made a few missteps somewhere.

If you want to see the code, I have a few repos. A benefit of doing refactoring work was it allowed me to decouple the ‘frontend’ and ‘backend’ bits of the codebase. So anyone can write a frontend that handles the screen display and keyboard controls which hooks into the backend where the emulator logic is. The frontends I’ve written so far can be found here:

gomeboycolor-wasm - The WASM version described in this blog post

gomeboycolor-glfw - This is what my emulator was originally written for, and uses GLFW to render the screen to a window

gomeboycolor/_examples - For fun I wrote an example frontend that renders in the terminal. It’s playable to a degree!



Thank you for reading, happy WASM’ing.