Build a Tree-Shaking Utility in JavaScript

How to build your own “dead code” remover in JS

Tree-shaking is one of the many optimization techniques available to us. It entails removing code that is never used. In this post, we will use a small piece of JavaScript to show how to build your own tree-shaking utility.

To skip the intro and head straight to the code, see the “Let’s build one” section below.



Tree-shaking

Think of literally shaking a tree: we do it to shake off dead leaves and ripe fruit, so that young leaves and budding fruit get the tree's nutrients without having to share them.

Removing dead leaves and ripe fruits:

It leaves the tree clean and refreshed.

It creates space for new leaves and fruits to sprout.

It removes unwanted competition for nutrients.

It saves the tree from bending under the weight of the ripe fruits.

Bringing it to programming, tree-shaking involves removing dead code. Dead code is code that is declared but not used.

Here are a few examples:

FunctionDeclaration

We can declare a function but never use it, like this:

function add(a,b) {
  return a + b
}

function mul(a, b) {
  return a * b
}

var firstOp = 9
var secondOp = 10

log(add(firstOp, secondOp))

We declared two functions, mul and add, but only add was used, i.e. called. So what is the mul function doing there if it is not used anywhere?

It only bloats our app: there is more code to download and parse, yet it never runs.

ClassDeclaration

Classes can be declared but never used:

class Point {
  constructor(x, y) {
    this.x = x
    this.y = y
  }

  distance() {
    // ...
  }
}

class Arithmetic {
  square(num) {
    return num * num;
  }
  // ...
}

const arithmetic = new Arithmetic();

for (var i = 2; i <= 100; i += 2) {
  log(arithmetic.square(i))
}

Here we have two class declarations, Arithmetic and Point. The former performs arithmetic operations and the latter represents a location in a 2-D plane.

The class Point was not used anywhere in the code; only Arithmetic was used. Now, what is the point of having the Point class in the code without using it?

Anyway, you get the point: whether it's functions, classes or variables, it makes no sense to leave them declared but not used.

The advantages of Tree-Shaking

Interpreter execution time

Executing a statement takes the interpreter time. Suppose, purely for illustration, that it takes ~3ms to declare a class, ~1ms to declare a variable and ~2ms to declare a function. Whatever the real numbers are, time spent declaring something that will never be used is wasted. That time should have been spent on something worthwhile.

Download time

Unused code makes our app heavier to download to the browser, which is not a good user experience, especially for those with a slow network connection.

We need to remove this useless code to save precious time. This is what tree-shaking does. A tree-shaking utility analyzes your code before it runs, detects code that is never used and removes it prior to execution.

Let’s build one

How do we go about tree-shaking a piece of code?

Our tree-shaker will analyze our JS code and output new code that has been tree-shaken. For example, using our first example:

// func-ex.js
function add(a,b) {
  return a + b
}

function mul(a, b) {
  return a * b
}

var firstOp = 9
var secondOp = 10

log(add(firstOp, secondOp))

When we feed func-ex.js to our tree-shaking utility:

$ tree-shake func-ex.js

It will output the tree-shaken code:

// func-ex.js
function add(a,b) {
  return a + b
}

var firstOp = 9
var secondOp = 10

log(add(firstOp, secondOp))

The mul function is no longer there because it is useless. Now, we have shaken the func-ex.js file and the mul function has fallen off ;-).

Let's dive into technical details on how to build our tree-shaker. First, we have to represent the code in an Abstract Syntax Tree. Then, we will have a JSEmitter that will emit the JS code from the AST. But, before the JSEmitter emits any code, it will run through the AST twice.

Yes, that's my design. If you have a smarter idea on how to do it, feel free to comment.

The first run will be to gather the declarations:

Functions

Variables

Classes

and to note which of those declarations were called or used.

The second run iterates through the called/used names, collects their declarations and merges them with the rest of the code. Any declaration that was never used is left out, which shakes off the dead code.
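As a rough sketch of the two runs described above, the snippet below works on a hand-written, ESTree-shaped stand-in for the first example instead of a parser's output. The names declared, used, kept and dead are made up for illustration, and only the node fields the sketch reads are filled in.

```javascript
// Hand-written ESTree-shaped nodes (an assumption, not parser output) for:
//   function add(a, b) {...}   function mul(a, b) {...}
//   var firstOp = 9   var secondOp = 10   add(firstOp, secondOp)
const body = [
  { type: 'FunctionDeclaration', id: { type: 'Identifier', name: 'add' } },
  { type: 'FunctionDeclaration', id: { type: 'Identifier', name: 'mul' } },
  { type: 'VariableDeclaration', kind: 'var', declarations: [
    { type: 'VariableDeclarator', id: { type: 'Identifier', name: 'firstOp' } }
  ] },
  { type: 'VariableDeclaration', kind: 'var', declarations: [
    { type: 'VariableDeclarator', id: { type: 'Identifier', name: 'secondOp' } }
  ] },
  { type: 'ExpressionStatement', expression: {
    type: 'CallExpression',
    callee: { type: 'Identifier', name: 'add' },
    arguments: [
      { type: 'Identifier', name: 'firstOp' },
      { type: 'Identifier', name: 'secondOp' }
    ]
  } }
]

// Run 1: record every declared name and every name that is used.
const declared = new Set()
const used = new Set()
for (const node of body) {
  if (node.type === 'FunctionDeclaration') {
    declared.add(node.id.name)
  } else if (node.type === 'VariableDeclaration') {
    for (const decl of node.declarations) declared.add(decl.id.name)
  } else if (node.type === 'ExpressionStatement' &&
             node.expression.type === 'CallExpression') {
    used.add(node.expression.callee.name)
    for (const arg of node.expression.arguments) {
      if (arg.type === 'Identifier') used.add(arg.name)
    }
  }
}

// Run 2: split the declarations into the ones to keep and the dead ones.
const kept = [...declared].filter(name => used.has(name))
const dead = [...declared].filter(name => !used.has(name))

console.log(kept.join(', ')) // add, firstOp, secondOp
console.log(dead.join(', ')) // mul
```

Using a Set for both collections means a name that is used several times is still kept only once.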

First, we will start with a JS emitter. It emits JS code when given an ESTree AST node:

// jsemitter.js
class JSEmitter {
  // Emit each declarator as its own statement, e.g. "var a=1;"
  visitVariableDeclaration(node) {
    let str = ''
    for (const decl of node.declarations) {
      str += this.visitVariableDeclarator(decl, node.kind)
    }
    return str
  }

  visitVariableDeclarator(node, kind) {
    let str = ''
    str += kind ? kind + ' ' : ''
    str += this.visitNode(node.id)
    // `var x` has no initializer, so node.init is null
    if (node.init) {
      str += '='
      str += this.visitNode(node.init)
    }
    return str + ';' + '\n'
  }

  visitIdentifier(node) {
    return node.name
  }

  visitLiteral(node) {
    return node.raw
  }

  // Expressions emit no trailing newline so that they can nest.
  // Note: parentheses from the source are not preserved.
  visitBinaryExpression(node) {
    let str = ''
    str += this.visitNode(node.left)
    str += node.operator
    str += this.visitNode(node.right)
    return str
  }

  visitFunctionDeclaration(node) {
    let str = 'function '
    str += this.visitNode(node.id)
    str += '('
    // join() also handles functions with no parameters
    str += node.params.map(param => this.visitNode(param)).join(',')
    str += '){'
    str += this.visitNode(node.body)
    str += '}'
    return str + '\n'
  }

  visitBlockStatement(node) {
    // trim the newline left by the last statement in the block
    return this.visitNodes(node.body).trim()
  }

  visitCallExpression(node) {
    let str = ''
    str += this.visitNode(node.callee)
    str += '('
    // join() also handles calls with no arguments
    str += node.arguments.map(arg => this.visitNode(arg)).join(',')
    str += ')'
    return str
  }

  visitReturnStatement(node) {
    let str = 'return'
    // a bare `return` has no argument
    if (node.argument) {
      str += ' ' + this.visitNode(node.argument)
    }
    return str + '\n'
  }

  // The statement, not the expression, owns the terminating ";"
  visitExpressionStatement(node) {
    return this.visitNode(node.expression) + ';' + '\n'
  }

  visitNodes(nodes) {
    let str = ''
    for (const node of nodes) {
      str += this.visitNode(node)
    }
    return str
  }

  // Dispatch on the node type; unsupported types emit nothing
  visitNode(node) {
    let str = ''
    switch (node.type) {
      case 'VariableDeclaration':
        str += this.visitVariableDeclaration(node)
        break;
      case 'VariableDeclarator':
        str += this.visitVariableDeclarator(node)
        break;
      case 'Literal':
        str += this.visitLiteral(node)
        break;
      case 'Identifier':
        str += this.visitIdentifier(node)
        break;
      case 'BinaryExpression':
        str += this.visitBinaryExpression(node)
        break;
      case 'FunctionDeclaration':
        str += this.visitFunctionDeclaration(node)
        break;
      case 'BlockStatement':
        str += this.visitBlockStatement(node)
        break;
      case "CallExpression":
        str += this.visitCallExpression(node)
        break;
      case "ReturnStatement":
        str += this.visitReturnStatement(node)
        break;
      case "ExpressionStatement":
        str += this.visitExpressionStatement(node)
        break;
    }
    return str
  }

  run(body) {
    let str = ''
    str += this.visitNodes(body)
    return str
  }
}

module.exports = JSEmitter

It holds a visit* method for each ESTree node type that our examples need. These visit* methods produce JS code when called with an AST node.
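To make the node-to-string idea concrete, here is a minimal standalone sketch that emits code for a hand-written, ESTree-shaped node representing var firstOp = 9. The emitters table and the emit function are hypothetical names used only for this illustration; they are a table-driven variation on the same dispatch idea, not part of JSEmitter.

```javascript
// ESTree-shaped node for `var firstOp = 9`, written by hand for illustration.
const node = {
  type: 'VariableDeclaration',
  kind: 'var',
  declarations: [{
    type: 'VariableDeclarator',
    id: { type: 'Identifier', name: 'firstOp' },
    init: { type: 'Literal', value: 9, raw: '9' }
  }]
}

// Hypothetical table of emit functions, one per supported node type.
const emitters = {
  VariableDeclaration: n => n.kind + ' ' + n.declarations.map(d => emit(d)).join(','),
  VariableDeclarator: n => emit(n.id) + '=' + emit(n.init),
  Identifier: n => n.name,
  Literal: n => n.raw
}

// Look up the emitter for a node and fail loudly on unsupported syntax.
function emit(n) {
  const fn = emitters[n.type]
  if (!fn) throw new Error('Unsupported node type: ' + n.type)
  return fn(n)
}

const emitted = emit(node) + ';'
console.log(emitted) // var firstOp=9;
```

One design difference is worth noting: the switch in visitNode returns an empty string for a node type it does not know, so unsupported syntax silently disappears from the output, whereas this sketch throws instead.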

Next we will build the shaker.js:

// shaker.js
const acorn = require("acorn")
const fs = require('fs')
const JSEmitter = require('./jsemitter')

// pull in the cmd line args
const file = process.argv[2]
const buffer = fs.readFileSync(file).toString()
// acorn 8+ expects an explicit ecmaVersion
const body = acorn.parse(buffer, { ecmaVersion: 2020 }).body
const jsEmitter = new JSEmitter()

let decls = new Map()   // declaration name -> emitted code
let calledDecls = []    // names of the declarations that were used
let code = []           // emitted code for everything else

// Record the identifiers used in an expression, including nested calls
// such as log(add(firstOp, secondOp)).
function noteUses(expr) {
  if (expr.type == "Identifier") {
    calledDecls.push(expr.name)
  } else if (expr.type == "CallExpression") {
    noteUses(expr.callee)
    expr.arguments.forEach(noteUses)
  }
}

body.forEach(function(node) {
  if (node.type == "FunctionDeclaration") {
    decls.set(jsEmitter.visitNode(node.id), jsEmitter.run([node]))
    return
  }
  if (node.type == "VariableDeclaration") {
    const kind = node.kind
    for (const decl of node.declarations) {
      decls.set(jsEmitter.visitNode(decl.id), jsEmitter.visitVariableDeclarator(decl, kind))
    }
    return
  }
  if (node.type == "ExpressionStatement") {
    noteUses(node.expression)
  }
  code.push(jsEmitter.run([node]))
})

// Emit each used declaration once, then the rest of the code.
// concat(code) merges the arrays; concat([code]) would nest one inside the other.
code = [...new Set(calledDecls)]
  .map(c => decls.get(c))
  .concat(code)
  .join('')

fs.writeFileSync('test/test.shaked.js', code)

First, we take the name of the file we want to shake from process.argv and read it from the filesystem. We parse the file contents using the acorn#parse API.

We have decls, calledDecls, and code.

decls holds all function and variable declarations. calledDecls holds the names of the declarations that were used. code holds the emitted code of every other AST node that is not a declaration.

We loop through the body, which is the AST generated by the acorn#parse method. Inside the forEach callback, we capture function and variable declarations and put them in the decls Map.

Then we check where declarations are used by looking for CallExpression and Identifier nodes. A CallExpression denotes a function call, so its callee refers to a function declaration. An Identifier passed as an argument refers to a variable declaration.

Lastly, we generate the JS code for the node and push it to the code array.

After the forEach loop, we look up the code of each used declaration in the decls Map, use the concat method to merge it with the code array, then use join to combine everything into a single string.

The resulting code is written to test/test.shaked.js.
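The merge step can be tried in isolation with plain strings. The sketch below assumes made-up stand-ins for the three collections (here the leftover code array is called rest) and shows one way to look up, de-duplicate and join the pieces.

```javascript
// Made-up stand-ins for the state the shaker holds after its loop.
const decls = new Map([
  ['add', 'function add(a,b){return a+b}\n'],
  ['mul', 'function mul(a,b){return a*b}\n'],
  ['firstOp', 'var firstOp=9;\n']
])
const calledDecls = ['add', 'firstOp', 'firstOp'] // firstOp was used twice
const rest = ['add(firstOp,firstOp);\n']

const output = [...new Set(calledDecls)]   // emit each used declaration once
  .filter(name => decls.has(name))         // skip names that are not declared in this file
  .map(name => decls.get(name))
  .concat(rest)                            // flat merge of the two arrays
  .join('')

console.log(output)
// function add(a,b){return a+b}
// var firstOp=9;
// add(firstOp,firstOp);
```

Passing the array itself to concat matters here: wrapping it as concat([rest]) would add one nested array, and join would then stringify that inner array with commas between its elements.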

Let’s shake some code

Create a folder named test, touch test/test.js and add the following code to it:

function add(a, b) {
  return a + b
}

function mul(a, b) {
  return a * b
}

var firstOp = 9
var secondOp = 10

add(firstOp, secondOp)

mul is dead code and should be removed. We run shaker.js, passing test/test.js to it:

node shaker.js test/test.js

test.shaked.js is generated in the test folder. It contains the shaken code of test.js:

function add(a,b){return a+b}

var firstOp=9;

var secondOp=10;

add(firstOp,secondOp);

See :)! mul is absent! Our code was effectively shaken. Also, as you may have noticed, most of the unneeded whitespace was removed, making our code more compact than the original. This reduces the payload delivered to the browser.

Conclusion

Tree-shaking is a powerful concept, and popular web build tools such as rollup and webpack implement it. Yes, our example here is simple, but it shows at a basic level what those far more complex tools do.

You can go from this simple implementation to a more stable and efficient one. It was made just to demo the concept of tree-shaking.
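One possible step in that direction: the shaker only looks at top-level statements, so a function that is called only from inside another function would look unused. The sketch below follows references transitively on a hand-built, ESTree-shaped AST. The helpers id, call, fn, ret, stmt, lit and namesIn are hypothetical names for this illustration, and matching is purely by name, so scoping and shadowing are ignored.

```javascript
// Tiny helpers that build ESTree-shaped nodes by hand (no parser needed here).
const id = name => ({ type: 'Identifier', name })
const lit = value => ({ type: 'Literal', value, raw: String(value) })
const call = (callee, ...args) =>
  ({ type: 'CallExpression', callee: id(callee), arguments: args })
const ret = argument => ({ type: 'ReturnStatement', argument })
const stmt = expression => ({ type: 'ExpressionStatement', expression })
const fn = (name, params, ...body) => ({
  type: 'FunctionDeclaration',
  id: id(name),
  params: params.map(id),
  body: { type: 'BlockStatement', body }
})

// function square(x) { return mul(x, x) }
// function mul(a, b) { return a * b }
// function unused() { return mul(1, 2) }
// square(3)
const body = [
  fn('square', ['x'], ret(call('mul', id('x'), id('x')))),
  fn('mul', ['a', 'b'], ret({
    type: 'BinaryExpression', operator: '*', left: id('a'), right: id('b')
  })),
  fn('unused', [], ret(call('mul', lit(1), lit(2)))),
  stmt(call('square', lit(3)))
]

// Collect every identifier name found anywhere under a node.
function namesIn(node, out = new Set()) {
  if (Array.isArray(node)) {
    node.forEach(child => namesIn(child, out))
  } else if (node && typeof node === 'object') {
    if (node.type === 'Identifier') out.add(node.name)
    Object.values(node).forEach(child => namesIn(child, out))
  }
  return out
}

// Map each function to the names its body mentions; other statements are roots.
const refs = new Map()
const roots = new Set()
for (const node of body) {
  if (node.type === 'FunctionDeclaration') {
    refs.set(node.id.name, namesIn(node.body))
  } else {
    namesIn(node, roots)
  }
}

// Follow references from the roots until no new function is reached.
const live = new Set()
const queue = [...roots]
while (queue.length > 0) {
  const name = queue.pop()
  if (!refs.has(name) || live.has(name)) continue
  live.add(name)
  queue.push(...refs.get(name))
}

const dead = [...refs.keys()].filter(name => !live.has(name))
console.log([...live].join(', ')) // square, mul
console.log(dead.join(', '))      // unused
```

Here mul stays alive because square, which is called at the top level, calls it, while unused is reported as dead even though it calls mul itself.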

In our next post, we will look at tree-shaking imported modules.

Thanks!!
