How various devtools make use of Abstract Syntax Trees

In our previous blog post, we introduced Abstract Syntax Trees. We hinted that ASTs are a data structure at the core of many tools we use every day as developers. If you’ve been wondering what these tools are and how they use ASTs, you’re in the right place: keep reading!

Running your program

Whenever you want to run your program, you typically have it compiled or interpreted, depending on the programming language of choice.

In the vast majority of cases, the first thing a compiler (or an interpreter) does is transform your source code into an AST. This transformation is called parsing. Why is this the first and crucial step, you ask? The simple reason is that working with lines of text is far more complex than working with a tree structure: the tree makes the nesting of each construct explicit, so later stages don’t have to rediscover it from raw characters.
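To make the point concrete, here is a minimal sketch of a toy interpreter. It assumes the parser has already done its job: the tree for `1 + 2 * 3` is written by hand, using Babel-style node names, and `evaluate` is a hypothetical helper, not part of any library.

```javascript
// Hand-written AST for the expression `1 + 2 * 3`.
// A real parser would build this object from the source text.
const ast = {
  type: "BinaryExpression",
  operator: "+",
  left: { type: "NumericLiteral", value: 1 },
  right: {
    type: "BinaryExpression",
    operator: "*",
    left: { type: "NumericLiteral", value: 2 },
    right: { type: "NumericLiteral", value: 3 },
  },
};

// Toy interpreter: evaluate a node by recursively evaluating its children.
function evaluate(node) {
  switch (node.type) {
    case "NumericLiteral":
      return node.value;
    case "BinaryExpression": {
      const left = evaluate(node.left);
      const right = evaluate(node.right);
      if (node.operator === "+") return left + right;
      if (node.operator === "*") return left * right;
      throw new Error(`Unsupported operator: ${node.operator}`);
    }
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

console.log(evaluate(ast)); // 7
```

Operator precedence never shows up in `evaluate`: it is already encoded in the shape of the tree, which is exactly what makes the tree easier to work with than text.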

This use case is a good motivation for having ASTs, but we can do so much more. Let’s see what’s next.

Transpiling your code

In some cases, you may want to transform your source code into another kind of source code. This process is called source-to-source compiling, or transpiling. Transpilation is very common in the JavaScript universe: any language that compiles to JavaScript transpiles its source code into some version of JavaScript. Notable examples are TypeScript, Scala.js, BuckleScript and many more. There are also tools that transpile across JavaScript versions (for instance from ES2017 to ES5). Famous examples are the Google Closure Compiler and Babel.

Transpilation is usually done by parsing the original source code, transforming the resulting AST and converting the transformed AST back into source code (a process usually called pretty printing).

Babel workflow: transpiling code from any version of JavaScript down to ES5
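The transform and print steps can be sketched in a few lines. The example below is an illustration under simplifying assumptions, not how Babel is implemented: the AST for `2 ** 3` (the ES2016 exponentiation operator) is written by hand instead of being parsed, and `transform` and `print` are hypothetical helpers that only handle the node types used here.

```javascript
// Hand-written AST for `2 ** 3`; a real transpiler would get it from a parser.
const input = {
  type: "BinaryExpression",
  operator: "**",
  left: { type: "NumericLiteral", value: 2 },
  right: { type: "NumericLiteral", value: 3 },
};

// Transform step: rewrite `a ** b` into the ES5-compatible `Math.pow(a, b)`.
function transform(node) {
  if (node.type !== "BinaryExpression") return node;
  const left = transform(node.left);
  const right = transform(node.right);
  if (node.operator !== "**") return { ...node, left, right };
  return {
    type: "CallExpression",
    callee: {
      type: "MemberExpression",
      object: { type: "Identifier", name: "Math" },
      property: { type: "Identifier", name: "pow" },
    },
    arguments: [left, right],
  };
}

// Print step: turn the tree back into source text.
// Simplification: no parentheses are added for nested binary expressions.
function print(node) {
  switch (node.type) {
    case "NumericLiteral":
      return String(node.value);
    case "Identifier":
      return node.name;
    case "MemberExpression":
      return `${print(node.object)}.${print(node.property)}`;
    case "CallExpression":
      return `${print(node.callee)}(${node.arguments.map(print).join(", ")})`;
    case "BinaryExpression":
      return `${print(node.left)} ${node.operator} ${print(node.right)}`;
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

console.log(print(input)); // 2 ** 3
console.log(print(transform(input))); // Math.pow(2, 3)
```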

The interesting thing about tools like Babel is the ability to provide them with custom plugins to perform our own transformations. Let’s write our own Babel plugin that does a very trivial transformation: add 1 to each numeric literal.
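A minimal sketch of such a plugin could look like the following. The function name `addOneToNumbers` is made up for this example; the shape (a function returning an object with a `visitor`) follows Babel's plugin format.

```javascript
// A Babel plugin is a function that returns an object with a `visitor`.
function addOneToNumbers() {
  return {
    visitor: {
      // Called once for every NumericLiteral node found in the program.
      NumericLiteral(path) {
        // Mutate the node in place: 1 becomes 2, 41 becomes 42, and so on.
        path.node.value += 1;
      },
    },
  };
}

// In a plugin file you would export it, e.g. `module.exports = addOneToNumbers;`
```

With this plugin enabled, a line such as `const x = 1 + 2;` should come out as `const x = 2 + 3;`.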

You can run the example in AST Explorer (https://astexplorer.net).

That’s it: even if you’re not familiar with the API, the code above should be pretty clear. We’re targeting all nodes of type NumericLiteral and altering their value.

TIP: the object describing the kinds of nodes we want to visit and transform is usually called a visitor. This is an extremely common API in any library that provides ways of traversing trees.

If you’re using Babel, learning how to write simple manipulations could allow you to simplify your team’s work and potentially avoid some common mistakes. For a list of custom Babel plugins, take a look at https://github.com/babel/awesome-babel

Linting your code

Linters are a great way to enforce a standard coding style across a team of developers and to avoid potential mistakes. An example of a popular linter in JavaScript is ESLint.

How do linters work? Unsurprisingly, they take advantage of ASTs too. The structure of your program is statically analyzed and, instead of producing a new tree, the linter will emit warnings/errors according to its configuration.

Linter workflow: analyze the AST and produce warnings/errors

As an example, here’s a simple rule that prevents the usage of console.log. You will recognize the familiar “visitor” API:
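A minimal sketch of such a rule, following ESLint's `meta` plus `create` rule format, might look like this. The name `noConsoleLog` and the message text are made up for the example.

```javascript
// Hypothetical rule object in ESLint's rule format.
const noConsoleLog = {
  meta: {
    type: "suggestion",
    docs: { description: "disallow the use of console.log" },
    schema: [],
  },
  create(context) {
    // The returned object is the visitor: one handler per node type.
    return {
      // Visit every member access, e.g. `console.log` in `console.log("hi")`.
      MemberExpression(node) {
        const isConsole =
          node.object.type === "Identifier" && node.object.name === "console";
        const isLog =
          !node.computed &&
          node.property.type === "Identifier" &&
          node.property.name === "log";
        if (isConsole && isLog) {
          // Instead of changing the tree, report a problem at this node.
          context.report({ node, message: "Unexpected use of console.log." });
        }
      },
    };
  },
};
```

Unlike the Babel plugin, the handler never modifies the node: it only inspects it and calls `context.report`.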

You can run the example in AST Explorer (https://astexplorer.net).

Formatting your code

Formatters are tools that normalize the way your code is formatted, freeing developers from this boring task. Again, formatters work directly with ASTs. How? The unformatted source code is parsed into an AST and then printed back according to some configured rules (for instance, how many characters are allowed per line, or whether to use semicolons). The focus here is obviously on the pretty printing step, the right arrow in the workflow diagram.

The most popular formatting tool for JavaScript has an apt name: Prettier.

Formatter workflow: parse source code and print it back to a normalized version
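The "print it back according to configured rules" step can be sketched as a printer that takes style options. This is a toy illustration, not Prettier's algorithm: the AST is written by hand, `format` is a hypothetical helper, and the option names `semi` and `singleQuote` are borrowed from Prettier's options purely for familiarity.

```javascript
// Hand-written AST for the statement `greet('world')`.
// A real formatter would obtain it by parsing the unformatted source.
const statement = {
  type: "ExpressionStatement",
  expression: {
    type: "CallExpression",
    callee: { type: "Identifier", name: "greet" },
    arguments: [{ type: "StringLiteral", value: "world" }],
  },
};

// Print a node according to the given style options.
// Simplification: quotes inside string values are not escaped.
function format(node, options) {
  switch (node.type) {
    case "ExpressionStatement":
      return format(node.expression, options) + (options.semi ? ";" : "");
    case "CallExpression": {
      const args = node.arguments.map((arg) => format(arg, options));
      return `${format(node.callee, options)}(${args.join(", ")})`;
    }
    case "Identifier":
      return node.name;
    case "StringLiteral": {
      const quote = options.singleQuote ? "'" : '"';
      return quote + node.value + quote;
    }
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

console.log(format(statement, { semi: true, singleQuote: false })); // greet("world");
console.log(format(statement, { semi: false, singleQuote: true })); // greet('world')
```

The same tree yields different text depending only on the options, which is why the original formatting of the input does not matter to a formatter.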

Codemods

Codemods (automatic code refactors) are another very useful tool in the developer’s tool belt: they allow smooth migrations across versions of languages and libraries. Examples are the codemods distributed by Facebook along with breaking changes in the React API.

The most popular tools for codemods in JavaScript are recast and jscodeshift (which is a wrapper around recast).

As you can see in the diagram, the migrations happen at an AST level: the original source code is parsed into a tree, transformed, and printed back.

Notice how the workflow may look similar to the transpiler’s. There’s one big difference though: the focus here is on producing the smallest diff possible, in order to make the migration unobtrusive. That’s why tools like recast do their best to preserve the original source code whenever possible.

Codemod workflow: transform source code into source code
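One way to picture the "smallest diff" goal is to use the position information that parsers attach to nodes, and to rewrite only the characters a changed node covers. The sketch below is a deliberate simplification and not how recast is implemented: the node is built by hand (a real parser would supply its `start` and `end` offsets), and `renameInPlace` is a hypothetical helper.

```javascript
// Original source, with odd spacing and a comment we do not want to disturb.
const source = "const  user = fetchUser( id );   // TODO: cache";

// Stand-in for a parser: a real one would give us this identifier node
// together with its character offsets. Here we compute them with indexOf.
const start = source.indexOf("fetchUser");
const calleeNode = {
  type: "Identifier",
  name: "fetchUser",
  start,
  end: start + "fetchUser".length,
};

// Apply the change by replacing only the characters the node covers,
// leaving every other character of the file exactly as it was.
function renameInPlace(text, node, newName) {
  return text.slice(0, node.start) + newName + text.slice(node.end);
}

console.log(renameInPlace(source, calleeNode, "loadUser"));
// const  user = loadUser( id );   // TODO: cache
```

A full re-print of the tree, as a transpiler or formatter would do, would also normalize the double space and the padding inside the parentheses, producing a much noisier diff.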

Conclusion

This concludes our tour of Abstract Syntax Trees and their applications. We saw how ASTs are at the core of most of the tools we use daily as developers. Knowing how ASTs work is a fundamental skill that carries across tools and languages, and I strongly encourage you to dig deeper and experiment with these tools.

Happy devtooling :)

—

If you want to work in a place where we care about the quality of our development workflow, take a look at https://buildo.io/careers