If you’re a programmer, there’s a good chance you noticed the node.js left-pad fiasco of a few weeks back that temporarily broke most of the npm ecosystem.

This blog doesn’t have an opinion on any of that. However, in the ensuing kerfuffle, several people observed that left-pad appears to be quadratic, and that is definitely this blog’s concern.

If we look at the code, we see that the main loop looks like so:

while (++i < len) { str = ch + str; }

In most languages’ string implementations, this would definitely be quadratic: + copies the entire string (of size O(len)) on each iteration, and the loop runs O(len) times, for O(len²) work in total.
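To make that arithmetic concrete, here is a minimal sketch of a cost model, assuming every + allocates a fresh string and copies both operands. The helper name naiveCopyCost is hypothetical, and it models the work rather than measuring any real engine.

```javascript
// Hypothetical cost model: assume every `ch + str` copies both operands
// into a freshly allocated string (no ropes, no in-place tricks).
function naiveCopyCost(startLen, targetLen, chLen = 1) {
  let copied = 0;
  let curLen = startLen;
  let i = -1;
  const pad = targetLen - startLen; // number of loop iterations, as in left-pad
  while (++i < pad) {
    curLen += chLen;  // length of the string after prepending ch
    copied += curLen; // characters copied to build that string
  }
  return copied;
}

// Doubling the target length roughly quadruples the modeled work.
console.log(naiveCopyCost(0, 1000)); // 500500
console.log(naiveCopyCost(0, 2000)); // 2001000
```

Under this model the total is 1 + 2 + ... + len, which is where the quadratic term comes from.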

However, I happen to know that modern JS implementations have a whole pile of complex String optimizations, including rope-like data structures. And this is an evidence-driven blog, so I decided to take a deeper look. And what I found is fascinating and bizarre and I honestly can’t explain it yet.

The benchmarks

But let’s start with something easy. I ran a benchmark in Rhino, which I judged to be the JavaScript implementation I had easy access to that was least likely to have super-advanced string optimizations. We can clearly see the telltale quadratic curve:

But now let’s try something a bit more sophisticated, which also happens to be my primary browser: Chrome. Running a left-pad benchmark in Chrome yields the following result:

There’s a bit of anomalous behavior, especially at small sizes, but it makes a compelling case for being linear out through the 1MB limit I ran the test over!

(I’m running Chrome Version 50.0.2661.57 beta (64-bit); your results may well vary with Chrome version!)
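If you want to try this yourself, a timing harness along these lines should be enough. This is a rough sketch, not the harness used for the plots here: the leftpad function is reconstructed around the loop quoted earlier (the surrounding lines are an assumption), timeLeftpad is a hypothetical helper, and the modern syntax may need adjusting for older engines such as Rhino.

```javascript
// left-pad, reconstructed around the loop quoted above (assumed, not verbatim).
function leftpad(str, len, ch) {
  str = String(str);
  let i = -1;
  if (!ch && ch !== 0) ch = ' ';
  len = len - str.length;
  while (++i < len) {
    str = ch + str;
  }
  return str;
}

// Hypothetical harness: time leftpad at each size and keep the best of
// `reps` runs to damp noise from GC pauses and JIT warm-up.
function timeLeftpad(sizes, reps = 5) {
  const results = [];
  for (const n of sizes) {
    let best = Infinity;
    for (let r = 0; r < reps; r++) {
      const t0 = Date.now();
      const s = leftpad('hi', n, 'x');
      const elapsed = Date.now() - t0;
      // Sanity-check the length only; reading characters might force the
      // engine to flatten its internal representation and skew later runs.
      if (s.length !== Math.max(n, 2)) throw new Error('unexpected length');
      best = Math.min(best, elapsed);
    }
    results.push({ n: n, ms: best });
  }
  return results;
}

console.log(timeLeftpad([100000, 200000, 400000]));
```

Date.now() only has millisecond resolution, so small sizes will mostly read as 0; that is one more reason to be wary of the low end of any curve produced this way.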

And in fact, the Chrome developer tools will let us see the rope structure that Chrome has used to make these concatenations efficient! If we leftpad('hi', 100000, 'x') in a Chrome console and then take a heap snapshot, we can see that the string is a “concatenated string” made up of a huge number of chunks linked together:

(The downside of this optimization, as you can also see from that screenshot, is that our 100KB string now consumes 4MB of RAM…)
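For readers who haven’t met ropes before, here is a toy version of the idea. It is a sketch of the general technique only, with hypothetical names (Leaf, Concat, flatten, ropeLeftpad); it is not how Chrome actually lays out its concatenated strings.

```javascript
// A toy rope: a leaf holds a flat string, a concat node holds two children.
class Leaf {
  constructor(text) {
    this.text = text;
    this.length = text.length;
  }
}

class Concat {
  constructor(left, right) {
    this.left = left;
    this.right = right;
    this.length = left.length + right.length; // O(1): no characters copied
  }
}

// Flatten with an explicit stack, because a left-pad style rope is one long
// degenerate chain and a recursive walk could overflow the call stack.
function flatten(node) {
  const parts = [];
  const stack = [node];
  while (stack.length > 0) {
    const cur = stack.pop();
    if (cur instanceof Leaf) {
      parts.push(cur.text);
    } else {
      stack.push(cur.right); // pushed first so the left side is emitted first
      stack.push(cur.left);
    }
  }
  return parts.join('');
}

// Mirror of the left-pad loop: each iteration allocates two small nodes
// instead of copying the whole string built so far.
function ropeLeftpad(str, len, ch) {
  let rope = new Leaf(str);
  let i = -1;
  const pad = len - str.length;
  while (++i < pad) {
    rope = new Concat(new Leaf(ch), rope);
  }
  return rope;
}

console.log(flatten(ropeLeftpad('hi', 5, 'x'))); // xxxhi
```

Each concatenation is constant-time, so the loop as a whole is linear, but every padding character now drags along its own node objects. That is the same trade of time for memory that the heap snapshot shows, even though the real engine’s node layout will differ.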

Moving on, we can also try Safari, another modern browser with an incredibly sophisticated JavaScript engine. In Safari, we see this odd behavior:

It’s a bit hard to be sure, but the data appear to fall along two separate linear segments, with a cutover somewhere around 350k. This pattern was reproducible across multiple experiments, but I don’t have an explanation.
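One way to make “two linear segments” less of an eyeball judgment is to fit a line on each side of every candidate breakpoint and keep the split with the smallest total squared error. The sketch below does that with ordinary least squares; fitLine and bestBreakpoint are hypothetical helpers, the sample points are synthetic, and this is not the analysis behind the plots in this post.

```javascript
// Ordinary least-squares fit of y = slope * x + intercept.
// `points` is an array of [x, y] pairs with at least two distinct x values.
function fitLine(points) {
  const n = points.length;
  let sx = 0, sy = 0, sxx = 0, sxy = 0;
  for (const [x, y] of points) {
    sx += x;
    sy += y;
    sxx += x * x;
    sxy += x * y;
  }
  const slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
  const intercept = (sy - slope * sx) / n;
  let sse = 0; // sum of squared residuals
  for (const [x, y] of points) {
    const err = y - (slope * x + intercept);
    sse += err * err;
  }
  return { slope, intercept, sse };
}

// Try every split of the (x-sorted) points into two runs of at least
// `minSeg` points each, and return the split with the lowest combined error.
function bestBreakpoint(points, minSeg = 2) {
  let best = null;
  for (let k = minSeg; k <= points.length - minSeg; k++) {
    const left = fitLine(points.slice(0, k));
    const right = fitLine(points.slice(k));
    const sse = left.sse + right.sse;
    if (best === null || sse < best.sse) {
      best = { index: k, x: points[k][0], sse, left, right };
    }
  }
  return best;
}

// Synthetic data: slope 1 up to x = 5, then slope 3 afterwards.
const sample = [[0, 0], [1, 1], [2, 2], [3, 3], [4, 4],
                [5, 5], [6, 8], [7, 11], [8, 14], [9, 17]];
const split = bestBreakpoint(sample);
console.log(split.x, split.left.slope, split.right.slope);
```

Comparing the two-segment error against the error of a single line (or a quadratic) over the same points would give a rough sense of whether the second regime is real or just noise.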

I also decided to try node, since that is the actual runtime that npm primarily targets, after all. node runs the same V8 engine as Chrome, so we’d expect similar behavior, but version skew, configuration, or who knows what could cause divergence. And in fact, we see:

Oddly, it also seems to exhibit two separate linear regimes!

At this point, I’m out of energy for what was meant to be a short post, but if anyone can follow up and explain what’s happening, I’d be terribly curious.

In Conclusion