High Performance JavaScript via Data Oriented Design

06 Sep 2015

This is a post I’ve been meaning to write for a while. It covers a technique with a decent amount of value that I rarely see in JavaScript code, even where performance is a concern, so I wanted to talk about it (even though nobody reads this). For the rest of this article, I’m going to assume you have a strong grasp of JavaScript, and I won’t bother explaining trivial things about the language.

The technique is known in the C++ world as data orientation. You can find an excellent talk on the subject here. When applied widely, it benefits both performance and readability, especially when compared to other methods of code optimization.

The basic idea is to think of your program as a system that transforms data, and everything that happens as an aggregate operation.

That said, it’s not universally applicable, at least not in JavaScript, which gives you limited control over memory layout. It’s most beneficial in games, simulation, or number-crunching code. If your bottleneck is string or DOM manipulation, as it is for a large number of websites, then this probably won’t help you. If you’re on Node.js, you probably don’t need this advice either.

In JavaScript, this means replacing a list of objects with one or more typed arrays, plus a manager for those arrays. Let’s work through an example. I chose this one because it is close to real-world code, even if it might not be the simplest possible example.

    function Spring(target, zeta, omega) {
        this.target = target;
        this.value = target;
        this.vel = 0.0;
        // zeta and omega are taken from the constructor arguments
        this.zeta = zeta;
        this.omega = omega;
    }

    Spring.prototype.update = function (dt) {
        var f = 1.0 + 2.0 * dt * this.zeta * this.omega;
        var oSqr = this.omega * this.omega;
        var dtSqr = dt * dt;
        var di = 1.0 / (f + dtSqr * oSqr);
        var dp = f * this.value + dt * this.vel + dtSqr * oSqr * this.target;
        var dv = this.vel + dt * oSqr * (this.target - this.value);
        this.value = dp * di;
        this.vel = dv * di;
    };

The assumption here is that Spring is a single object representing a springable number, and it’s probably stored in a JavaScript array full of other things with update(dt) methods. This is common, readable, and fine if you only have a few springs, say, fewer than a couple hundred.

However, it fails to scale up to large numbers. You likely have more important things for your game to do than update springs, and you don’t want that to dominate the runtime. So, we apply data orientation.

    function SpringSystem(maxSprings, zeta, omega) {
        if (maxSprings == null) {
            maxSprings = 1024;
        }
        this.zeta = zeta;
        this.omega = omega;
        this.maxSprings = maxSprings;
        // allocate one buffer, divide it into thirds for each of the arrays.
        var buf = new ArrayBuffer(maxSprings * 3 * Float32Array.BYTES_PER_ELEMENT);
        this.targets = new Float32Array(buf, (0 * maxSprings) * Float32Array.BYTES_PER_ELEMENT, maxSprings);
        this.values = new Float32Array(buf, (1 * maxSprings) * Float32Array.BYTES_PER_ELEMENT, maxSprings);
        this.velocities = new Float32Array(buf, (2 * maxSprings) * Float32Array.BYTES_PER_ELEMENT, maxSprings);
    }

    SpringSystem.prototype.update = function (dt) {
        var f = 1.0 + 2.0 * dt * this.zeta * this.omega;
        var oSqr = this.omega * this.omega;
        var dtSqr = dt * dt;
        var di = 1.0 / (f + dtSqr * oSqr);
        var vs = this.velocities;
        var ps = this.values;
        var ts = this.targets;
        var springs = this.maxSprings;
        for (var i = 0; i < springs; ++i) {
            var p = ps[i];
            var v = vs[i];
            var t = ts[i];
            var dp = f * p + dt * v + dtSqr * oSqr * t;
            var dv = v + dt * oSqr * (t - p);
            ps[i] = dp * di;
            vs[i] = dv * di;
        }
    };

I can crank the number of springs in this up to 80k before this function takes 1ms to run on my laptop, which is a testament to how good JavaScript VMs are when you give them tight loops and predictable types, and don’t slow your code down with cache misses.
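If you want to try a measurement like this yourself, a rough timing harness might look like the sketch below. It is only an illustration: `makeSprings`, `updateSprings`, and `timeUpdates` are hypothetical names (a condensed stand-in for the system above, so the snippet runs on its own), the zeta and omega values are arbitrary, and the numbers you get will depend heavily on your machine and JavaScript engine.

```javascript
// Condensed stand-in for the spring system above (hypothetical names),
// so this snippet is self-contained.
function makeSprings(count) {
  var bytes = Float32Array.BYTES_PER_ELEMENT;
  var buf = new ArrayBuffer(count * 3 * bytes);
  return {
    count: count,
    targets: new Float32Array(buf, 0 * count * bytes, count),
    values: new Float32Array(buf, 1 * count * bytes, count),
    velocities: new Float32Array(buf, 2 * count * bytes, count)
  };
}

function updateSprings(s, dt, zeta, omega) {
  var f = 1.0 + 2.0 * dt * zeta * omega;
  var oSqr = omega * omega;
  var dtSqr = dt * dt;
  var di = 1.0 / (f + dtSqr * oSqr);
  var ps = s.values, vs = s.velocities, ts = s.targets;
  for (var i = 0; i < s.count; ++i) {
    var p = ps[i], v = vs[i], t = ts[i];
    ps[i] = (f * p + dt * v + dtSqr * oSqr * t) * di;
    vs[i] = (v + dt * oSqr * (t - p)) * di;
  }
}

// Use performance.now() where available, otherwise fall back to Date.now().
var now = (typeof performance !== 'undefined' && performance.now)
  ? function () { return performance.now(); }
  : Date.now;

// Hypothetical helper: average milliseconds per update over `frames` calls.
function timeUpdates(count, frames) {
  var s = makeSprings(count);
  for (var i = 0; i < count; ++i) s.targets[i] = Math.random();
  // Warm up first so the JIT has a chance to optimize the loop.
  for (var w = 0; w < 200; ++w) updateSprings(s, 1 / 60, 0.5, 8.0);
  var start = now();
  for (var n = 0; n < frames; ++n) updateSprings(s, 1 / 60, 0.5, 8.0);
  return (now() - start) / frames;
}

console.log('80k springs: ' + timeUpdates(80000, 100).toFixed(3) + ' ms per update');
```

Averaging over many frames after a warm-up smooths out timer resolution and JIT compilation noise, which otherwise dominate a single sub-millisecond measurement.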

I’ve replicated a small example of it on CodePen here; however, the vast majority of the code is simply dealing with WebGL. There is also a Canvas2D version here. (That version is slower because it doesn’t use WebGL, but it might be a better demonstration of the topic, since rendering is not what we’re talking about here.)

Now, you might be wondering about the struct-of-arrays (SOA) style used above, that is, three arrays, each holding a single field for every spring. The alternative is to create one Float32Array of 3x the length, step over it by threes, and access each spring’s fields at offsets of 0, 1, and 2. This is known as array-of-structs (AOS) style. This isn’t bad, and I’ve certainly done it, but it’s harder to apply SIMD to. That said, the construction/initialization of the SOA arrays is somewhat complex, so you would want to write helper code that wraps that complexity.
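To make the two layouts concrete, here is a minimal sketch of both ideas. The field order, the constant names (`STRIDE`, `TARGET`, `VALUE`, `VEL`), and the `allocFields` helper are all assumptions made up for illustration, not part of the demo code.

```javascript
// AOS / interleaved layout: [target0, value0, vel0, target1, value1, vel1, ...]
// The field order here is an arbitrary choice for this sketch.
var STRIDE = 3; // floats per spring
var TARGET = 0, VALUE = 1, VEL = 2;

function createInterleaved(maxSprings) {
  return new Float32Array(maxSprings * STRIDE);
}

function updateInterleaved(data, dt, zeta, omega) {
  var f = 1.0 + 2.0 * dt * zeta * omega;
  var oSqr = omega * omega;
  var dtSqr = dt * dt;
  var di = 1.0 / (f + dtSqr * oSqr);
  // Step over the array by threes; each spring's fields sit at base + offset.
  for (var base = 0; base < data.length; base += STRIDE) {
    var t = data[base + TARGET];
    var p = data[base + VALUE];
    var v = data[base + VEL];
    data[base + VALUE] = (f * p + dt * v + dtSqr * oSqr * t) * di;
    data[base + VEL] = (v + dt * oSqr * (t - p)) * di;
  }
}

// SOA helper (hypothetical): carve one ArrayBuffer into equally sized
// Float32Array views, one per named field, hiding the byte-offset arithmetic.
function allocFields(count, names) {
  var bytes = Float32Array.BYTES_PER_ELEMENT;
  var buf = new ArrayBuffer(count * names.length * bytes);
  var fields = {};
  for (var i = 0; i < names.length; ++i) {
    fields[names[i]] = new Float32Array(buf, i * count * bytes, count);
  }
  return fields;
}

// Usage: two springs interleaved, and a 1024-spring SOA allocation.
var aos = createInterleaved(2);
aos[1 * STRIDE + TARGET] = 1.0; // move the second spring's target
updateInterleaved(aos, 1 / 60, 0.5, 8.0);

var soa = allocFields(1024, ['targets', 'values', 'velocities']);
soa.targets[0] = 1.0;
```

With a helper like `allocFields`, the constructor for a system shrinks to a single call, and adding a fourth field is a one-word change rather than another line of offset math.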

SOA isn’t always ideal (JavaScript doesn’t have SIMD yet; if it did, you could do this loop 4 at a time, which would probably make up for its downsides). However, in my experience separating these out is beneficial for readability and maintainability. As far as cache locality is concerned, the difference isn’t enormously significant.

The big downside to the example I’ve shown here is adding an unbounded number of elements at runtime. You don’t need this as often as you might think, and can usually get away with clamping it at a max. That said, you do want it sometimes. You usually either want to grow the array (quite costly, but allows for an arbitrary number of elements), or you want to use it as a ring buffer and write over the elements at the start (common for particle systems).
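As a rough sketch of those two options (not a full treatment), here is what each might look like for a single Float32Array. `FloatStore` and `FloatRing` are hypothetical names invented for this example, and the doubling growth policy is just one common choice.

```javascript
// Option 1 (hypothetical FloatStore): grow by copying into a larger array.
function FloatStore(capacity) {
  this.data = new Float32Array(capacity);
  this.length = 0; // number of elements actually in use
}

FloatStore.prototype.push = function (x) {
  if (this.length === this.data.length) {
    // Typed arrays cannot be resized in place, so allocate a bigger one.
    var bigger = new Float32Array(Math.max(1, this.data.length * 2));
    bigger.set(this.data); // the costly part: copies every existing element
    this.data = bigger;
  }
  this.data[this.length++] = x;
};

// Option 2 (hypothetical FloatRing): fixed capacity, overwrite the oldest slot.
function FloatRing(capacity) {
  this.data = new Float32Array(capacity);
  this.next = 0;  // index the next write lands on
  this.count = 0; // number of live elements, capped at capacity
}

FloatRing.prototype.push = function (x) {
  this.data[this.next] = x;
  this.next = (this.next + 1) % this.data.length;
  if (this.count < this.data.length) this.count++;
};

// Usage
var store = new FloatStore(2);
store.push(1); store.push(2); store.push(3); // third push triggers a grow

var ring = new FloatRing(3);
ring.push(1); ring.push(2); ring.push(3);
ring.push(4); // overwrites the oldest element (the 1)
```

In a system with several parallel arrays, such as the targets, values, and velocities above, every array would need to grow or wrap together so that index i keeps referring to the same spring. Note also that growing replaces the underlying array, so any cached references to the old one go stale.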

If I end up having time, I’ll try to show this in a future post.