Mozilla’s JavaScript optimizer, TraceMonkey, is pretty powerful. It carefully observes loops and converts them to super-fast assembly. We call this “tracing”.

That’s great and all, but there’s a problem: sometimes tracing doesn’t work. Loops can throw curveballs that cause tracing to stop. Especially with recursion, or lots of nesting, it can be very difficult to build good traces on complex code.

Other JavaScript engines, such as Nitro (present in WebKit/Safari), take a simpler approach. Instead of compiling loops to assembly, they compile entire methods (functions) to assembly. The generated code is much more generic than tracing, so while it is not as fast, it can handle any curveball.

What we’ve found is that when tracing works, we’re faster than the generic approach. But when tracing fails, we have to fall back to our old-school interpreter. At that point your JavaScript runs about as fast as it would in 2007-2008 (i.e. before Firefox 3.5, Safari 4, Chrome, etc).

That’s not acceptable, and we need to fix that. Trace compilation is still an active area of research (one which we’ll continue to work on) – but in the interim, we need to make sure our “slow path” is at least as good as the competition.

The question we’ve been asked, and we’ve been asking of ourselves, is: Why couldn’t we trace and keep going SUPER AWESOME FAST, and when tracing fails, fall back to STILL REALLY FAST?

Enter JaegerMonkey.

Our new project, JaegerMonkey (or JägerMonkey), has exactly this in mind. We’re taking the tried-and-true approach of other vendors, and bolting trace compilation on top. Once the two are interacting seamlessly, you’ll have a much more consistent – and fast – JavaScript performance experience.

Dave Mandelin, Luke Wagner, Julian Seward and I have been sprinting the past few weeks to get something basic working. To emit actual machine code, we’re using some very pretty classes (“macro assembler”) from Nitro. That’s been a real treat; it’s well-abstracted and C++ish, and allowed us to get to work on the actual compiler very quickly.

Our compiler is simple so far. Before interpreting a method, we translate each bytecode into some pretty generic assembly. For example, an “ADD” opcode will emit assembly that can handle both fast cases (adding two numbers) and slow cases (adding, say, an object and a string).

Contrast this to tracing, where the types are known, and pinned, statically – it does not need to handle any extra cases that might come up. In the whole-method compiler, the generated code must handle all unexpected variations in control or type flow.

After the function is compiled we execute it right away – the interpreter is skipped entirely.

Early Progress.

We’ve barely started and the results are already really promising. Running SunSpider on my machine, the whole-method JIT is 30% faster than the interpreter on x86, and 45% faster on x64. This is with barely any optimization work! When we integrate tracing next week, we’ll already start to see the benefits of both working together.

For a more in-depth study, Dave Mandelin has blogged about our early performance gains, what’s done, up-and-coming, etc.

As we move forward, the two compilers will be tightly integrated. The method compiler will be able to identify loops and invoke the trace compiler. The trace compiler, if it decides a method is too complex to inline, may decide to invoke the method compiler.

The future of SpiderMonkey is bright and shiny, and we’ll be talking more about the project as it reaches major milestones.

In the meantime, if you are interested in learning more, I invite you to look at JaegerMonkey on the Mozilla wiki, and our makeshift source code repository. We also hang out in #jsapi on irc.mozilla.org.