Disclaimer: All opinions, thoughts, and work here are my own and do not represent the thoughts or work of my employer.

Edit (12/21/18): Added Firefox performance bug / issue

Introduction

When developing for the web, there have been plenty of times where I couldn’t bring my idea to fruition due to browser performance. Browsers do not run instructions directly like a compiled executable written in C. Browsers have to download, parse, interpret, and Just-In-Time (JIT) compile JavaScript (JS / ES6). I’ve built more than a handful of Cordova / Ionic, Electron, and Progressive Web Apps (PWA) to get the portability and flexibility of the web, but I knew it always came at the cost of performance. So the moment I heard the whispers of WebAssembly (Wasm), I knew I had to jump in on it.

About a year ago, I started a new personal project called WasmBoy. WasmBoy is a Gameboy / Gameboy Color Emulator, written for WebAssembly, to help me learn WebAssembly. Gameboy emulation has been playable in browsers on mediocre desktop devices for a while now, but hardly playable in browsers on mobile devices. Therefore, one of the goals I had with WasmBoy was to bring playable Gameboy Emulation to budget mobile phones and Chromebooks. More importantly, with WasmBoy, I wanted to answer my question: “Will WebAssembly allow web developers to write code for web browsers that is almost as fast as native code, and that works alongside the ES6 code we write today?” I think a growing number of other JavaScript developers have the same question.

WasmBoy, at a high level, is organized in two sections: the “lib” (JavaScript API Interface) and the “core” (GameBoy Emulation “Backend”). The core of WasmBoy is written in AssemblyScript, a language that compiles a strict subset of TypeScript to WebAssembly using Binaryen. AssemblyScript is amazing. It allows web developers to write more performant code in a new technology, using tools they are already comfortable with. WasmBoy is compiled to WebAssembly using the AssemblyScript compiler. However, if we take a step back, we can realize that we can mock out the AssemblyScript global functions that we call within our TypeScript code base. Therefore, we can use the TypeScript compiler on the same code base that we use the AssemblyScript compiler with, which gives us two different outputs in two different languages from almost exactly the same source code! Using this process, I was able to make multiple cores: an AssemblyScript core, a JavaScript core, and a JavaScript (Closure Compiled) core. These cores are compared in a WasmBoy benchmarking tool that we will get into in greater detail later.
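To make the mocking idea concrete, here is a minimal sketch of what it could look like. It assumes the core uses AssemblyScript’s `load` and `store` built-ins and integer types like `u8` and `i32`. The `mockMemory` buffer and the `incrementByte` function are hypothetical names for illustration, and the mocks only handle byte-sized values, so WasmBoy’s real mocks may well look different.

```typescript
// Under the TypeScript compiler, AssemblyScript's integer types
// can be aliased to plain numbers.
type u8 = number;
type i32 = number;

// Hypothetical stand-in for WebAssembly linear memory.
const mockMemory = new Uint8Array(0x10000);

// Mock of AssemblyScript's load built-in.
// Assumption: this sketch only supports byte-sized reads.
function load<T extends number>(offset: i32): T {
  return mockMemory[offset] as T;
}

// Mock of AssemblyScript's store built-in.
// Assumption: this sketch only supports byte-sized writes.
function store<T extends number>(offset: i32, value: T): void {
  mockMemory[offset] = value;
}

// Hypothetical core-style function. The same source could be handed to
// the AssemblyScript compiler, where load and store are real built-ins.
function incrementByte(offset: i32): u8 {
  const next: u8 = (load<u8>(offset) + 1) & 0xff;
  store<u8>(offset, next);
  return next;
}
```

With mocks like these in scope, the TypeScript compiler sees ordinary functions and number types, while the AssemblyScript compiler ignores the mocks and uses its own built-ins.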

Other WebAssembly Benchmarks

There are a handful of other benchmarks out there that test WebAssembly vs. JavaScript performance. Commonly, you will find Stack Overflow questions that do a micro benchmark with wild results. This is because what WebAssembly offers over JavaScript isn’t a peak performance boost, but consistent / predictable performance that can’t “fall off of the fast path” like JIT compiled JavaScript can.

Another common benchmark is a comparison of the two different compiler outputs of Emscripten. Emscripten takes LLVM bitcode generated from C/C++ and compiles it down to asm.js or WebAssembly. asm.js is kind of a precursor to WebAssembly: it is a highly optimizable subset of JavaScript, intended to improve JavaScript performance as a compile target rather than to be written by day-to-day developers.

Colin Eberhardt, who runs WebAssemblyWeekly on Twitter, has a great response / TL;DR to one of the micro-benchmark Stack Overflow questions, covering the problems with micro benchmarking and how Wasm should give about a 30% increase over asm.js in a real world case. Here is a link to the paper referred to in that Stack Overflow response for the claimed Wasm performance increase. Also, Colin has an A M A Z I N G talk on WebAssembly. The talk has a section that does a ton of comparisons of Wasm vs. Native vs. JS performance, and it illustrates this in much more detail than the response linked above.

In terms of other “Real world” WebAssembly Benchmarks, PSPDFKit has a great benchmarking tool and article on WebAssembly performance in a production application. I highly suggest giving that article a read as well if you are interested in this topic, as it provides another point of view, and they did a great job comparing the two. However, the PSPDFKit benchmark compares WebAssembly and asm.js, not WebAssembly and ES5/ES6. Therefore, the PSPDFKit benchmark is great if you are a developer with a large C/C++ application and want to know if moving from asm.js to WebAssembly is a good idea (which it is). It doesn’t really answer the question for JavaScript / Node developers of how WebAssembly will perform as a replacement for a computationally demanding piece of JavaScript code in their web application, especially if these developers would have to learn a new language or platform to find out.

Gameboy Emulators make great benchmarks, and even the Chrome team used a Gameboy Emulator to benchmark browsers at some point. Game emulation in general stresses almost every part of a language / platform, since it requires graphics, sound, and controller input, and presents several interesting challenges such as performance and flexibility. Emulation tends to be very computationally intensive, which makes it a great fit for WebAssembly. Also, WasmBoy is in the unique position to compare transpiled JavaScript code from a popular compiler (TypeScript) to WebAssembly. Therefore, I thought WasmBoy would be a great fit for this type of benchmark. We mentioned before that Wasm should be about 30% faster than asm.js, and that asm.js is itself a faster subset of JavaScript, so let’s assume that in this benchmark we should notice a performance increase of around 30% (1.3 times as fast).

WasmBoy Benchmarking Explained

As mentioned earlier, this benchmark will be utilizing the WasmBoy benchmarking tool (source code). The benchmark features three different cores as of today: AssemblyScript (WebAssembly built with the AssemblyScript compiler), JavaScript (ESNext output by the TypeScript compiler), and Closure Compiled JavaScript (the previous JavaScript core, run through Google’s Closure Compiler, which was built to optimize JavaScript to run faster). Each core is then imported by the benchmarking application using standard ES6 imports, and built into an IIFE using rollup.js.

The WasmBoy benchmarking tool works by loading each of the available WasmBoy core configurations, and then running a specified number of frames of an input ROM / Game. The time it took to run each frame of the ROM is recorded in microseconds, using the npm package microseconds. The tool does not use the popular benchmark.js, since benchmark.js focuses more on running the same exact code multiple times. When benchmarking frame by frame, the work varies: in one frame we could be doing a ton of sound processing, and in the next frame we could just be moving memory around. Once we have all the times that it took to run each individual frame, we can process the data into other statistical values, and visualize them on charts.
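The frame-by-frame timing loop described above could be sketched roughly like this. The `EmulatorCore` interface, its `executeFrame` method, and the `benchmarkFrames` function are hypothetical names, not WasmBoy’s actual API. The clock is also an assumption: the real tool uses the microseconds npm package, while this sketch falls back to `Date.now()` scaled to microseconds, which only has millisecond resolution.

```typescript
// Hypothetical minimal interface for a core that can run one frame.
interface EmulatorCore {
  executeFrame(): void;
}

// A clock returns the current time in microseconds.
type Clock = () => number;

// Assumption: a coarse fallback clock. The real tool uses the
// microseconds npm package for finer resolution.
const defaultClock: Clock = () => Date.now() * 1000;

// Run frameCount frames and record how long each individual frame took.
function benchmarkFrames(
  core: EmulatorCore,
  frameCount: number,
  now: Clock = defaultClock
): number[] {
  const frameTimes: number[] = [];
  for (let i = 0; i < frameCount; i++) {
    const start = now();
    core.executeFrame();
    frameTimes.push(now() - start);
  }
  return frameTimes;
}
```

Keeping one time per frame, instead of a single total, is what makes it possible to later drop warm-up frames and chart how each core behaves over the course of the ROM.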

WasmBoy Benchmarking Setup

The benchmarking tool has some open source ROMs that can be run directly from the tool, or any GameBoy / GameBoy Color ROM can be uploaded to be tested. As mentioned before, every frame of a ROM is different, and so is every ROM! In our tests, we run the first 2500 frames of each ROM. However, we drop the first 10 percent of frames, as they can greatly skew our data: in this benchmark, JavaScript takes a bit of time before “hot” code starts getting JIT compiled and performance levels out at a stable speed. For this test, we are running Tobu Tobu Girl and Back To Color. Tobu Tobu Girl is a standard GameBoy game, and thus does a normal game intro. It shows its title screen graphics, with sound effects here and there, for about the first 1000 frames. Then, it switches into a fully animated title screen with a full featured song. Back To Color is a GameBoy Color demo. Demos are usually built to do cool effects and push the limits of the system. Back To Color starts with a rapidly changing bass line and color text that scrolls in. In about the last 1000 frames it shows an awesome cityscape with a continually complicated song. These are important to keep in mind, as sound is the most demanding part of WasmBoy, followed by graphics (where color is more complicated), and running standard CPU opcodes is the least demanding. Because of this, Back To Color should run slower than Tobu Tobu Girl. You will also notice that other ROMs can give greatly different results.
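As a rough illustration of the warm-up handling, dropping the first 10 percent of frame times and summarizing the rest might look like the sketch below. The function names `dropWarmupFrames` and `summarizeFrames` are hypothetical, and the exact statistics the real tool computes may differ.

```typescript
// Drop the leading fraction of frames, where JavaScript may not yet be
// JIT compiled. With 2500 frames and 0.1, this skips the first 250.
function dropWarmupFrames(frameTimes: number[], warmupFraction = 0.1): number[] {
  const framesToSkip = Math.floor(frameTimes.length * warmupFraction);
  return frameTimes.slice(framesToSkip);
}

// Reduce the remaining frame times to a total and an average.
function summarizeFrames(frameTimes: number[]): { sum: number; mean: number } {
  const sum = frameTimes.reduce((total, time) => total + time, 0);
  const mean = frameTimes.length > 0 ? sum / frameTimes.length : 0;
  return { sum, mean };
}
```

For a 2500 frame run, this leaves 2250 frame times per core to compare.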

I then ran it on a variety of devices, and took screenshots (and merged them together into one large full page screenshot). The devices I tested on were:

I tested the benchmark on all of the major browsers that each device supported. The browsers were Chrome 70, Firefox 63.0.2, and Safari 12.1. I didn’t test on Edge because Microsoft recently announced Edge will be replaced with a Chromium based browser. Feel free to use the link to the tool mentioned at the beginning of this section to test on your own devices and their respective browsers.

Results

To keep the article shorter, we will only highlight and embed some of the results in this article. However, the images and results for all other configurations can be found in the WasmBoy repo. To interpret our results we will be using the “Sum” row in the tables to represent the performance of each core. The “Sum” represents the total time it took to run all of the frames, added together. Also, we will be interpreting our results using a clear “X times as fast” format, as explained by this article on explaining performance improvements.
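For reference, the “X times as fast” figure is just a ratio of two “Sum” values. Here is a minimal sketch, where `timesAsFast` is a hypothetical helper and the numbers in the usage note are made up purely for illustration, not actual results.

```typescript
// How many times as fast the candidate core is compared to the baseline.
// Both arguments are total run times ("Sum" values) in the same unit.
function timesAsFast(baselineSum: number, candidateSum: number): number {
  if (candidateSum <= 0) {
    throw new Error("candidateSum must be greater than zero");
  }
  return baselineSum / candidateSum;
}
```

For example, if a JavaScript core hypothetically took 1300 time units in total and the Wasm core took 1000, `timesAsFast(1300, 1000)` would be 1.3, meaning Wasm was 1.3 times as fast.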

Desktop

For desktop, let’s take a look at the results of Back To Color on the 2015 MBP in Chrome, Firefox, and Safari. I chose this configuration because Back To Color is the more demanding of the two ROMs tested, and the 2015 MBP is what I use to develop the emulator and has support for all three major browsers.