The basic data-flow of compilation looks like this:

parse: turn the source text into something we can use as input for our compiler, usually an AST (abstract syntax tree) optimisation (optional): transform the input program into an equivalent one that's more efficient in time or space code generation: output a program in our target language, e.g machine code

My JS to C compiler will implement this as follows

parse: use esprima - this project is about compilation not parsing & 3. code generation + optimisation: a compiler written in TypeScript that outputs C

C as a target language

C is lower-level than JavaScript, but it still gives us a lot! We can use the following features in C to implement similar features in JavaScript:

control structures: if , for , while etc will map well between languages function calls: we can compile JS functions to C functions

However, there is lots we're going to have to write from scratch. For instance C lacks data-structures beyond statically sized arrays. We'll fill this in with what's called a 'runtime library' - a library we'll call from our compiled program. Lots of the behaviour we'll need to implement JavaScript cannot be implemented without runtime support:

garbage collection exception handling data-structures: dynamic arrays and objects prototype-based object system

A sketch

Lets manually walk the following JS code through our compiler tool-chain:

function fact(n) { return n < 3 ? n : n * fact(n - 1); } console.log(fact(5));

First we run esprima to parse the source and give us a syntax-tree:

{ "type": "Program", "body": [ { "type": "FunctionDeclaration", "id": { "type": "Identifier", "name": "fact" }, "params": [ { "type": "Identifier", "name": "n" } ], "body": { "type": "BlockStatement",

This will be input to js-to-c.ts , which would output a target C program. We have a ternary expression in our input program, here's a snippet from the function in the compiler that handles it. You can see we're implementing JS's ternary expression with C's if :

function compileConditionalExpression(node: ConditionalExpression, state: CompileTimeState) { // ... return `${testSrc}; JsValue* ${resultTarget.id}; if(isTruthy(${testResultTarget.id})) { ${consequentSrc} } else { ${alternateSrc} }` }

Once our compiler has run we're left with a valid C program (hopefully). The below is the compiled fact function. Some interesting things to look for: the recursive call, and how the result value from the ternary is threaded through the multiplication and return:

static JsValue *fact_1(Env *env) { JsValue *return_3; JsValue *left_6 = envGet(env, interned_2 /* n */); JsValue *right_7 = jsValueCreateNumber(3); JsValue *conditionalPredicate_5 = LTOperator(left_6, right_7); JsValue *conditionalValue_4; if (isTruthy(conditionalPredicate_5)) { return_3 = envGet(env, interned_2 /* n */); } else { JsValue *left_8 = envGet(env, interned_2 /* n */); JsValue *callee_10 = envGet(env, interned_11 /* fact */); JsValue *left_12 = envGet(env, interned_2 /* n */); JsValue *right_13 = jsValueCreateNumber(1); JsValue *call10Arg_0 = subtractOperator(left_12, right_13); JsValue *args_10[] = {call10Arg_0}; JsValue *right_9 = functionRunWithArguments(callee_10, args_10, 1, NULL); return_3 = multiplyOperator(left_8, right_9); } return return_3; return getUndefined(); }

You can also see calls out to our runtime library - e.g envGet() . This is part of the runtime as we can't rely on the C call-stack here. In JS, functions are closures. Therefore they need the environment they close over to live well beyond the lifetime of the call-stack that defined them - think of function values stored as global properties or as event-listeners.

To use the runtime library our output program use C's #include macro to include our runtime library's header files1, and get access to the definitions of the functions our library has defined elsewhere:

#include "../../runtime/environments.h" #include "../../runtime/exceptions.h" #include "../../runtime/functions.h"

In my next post, we'll look at our how our js-to-c.ts compiler takes a syntax tree and outputs C source code.

Next: Compiling JS control-flow to C

Previous: Compiling Interpreted Languages

Enjoy this? Subscribe to my RSS feed.