Reading Time: 3 minutes

Earlier we discussed in our blog how to configure the ANTLR plugin for the intellij for getting started with our language.

In this post we will discuss the basics of the ANTLR and exactly how can we get started with our main goal. What is the lexer, parser and what are their roles and many other things. So lets get started,

scala, php then this is the thing that you are looking for. Antlr stands for ANother Tool for Language Recognition. The tool is able to generate compiler or interpreter for any computer language. If you need to parse languages like Java then this is the thing that you are looking for.

Here is the list of some projects that uses ANTLR.

This post begins with a small demonstration of ANTLR usefulness. Then, we explain what ANTLR is and how does it work. Finally, we show how to compile a simple ‘Hello word!’ language into an abstract syntax tree. The post explains also how to add error handling and how to test the language.

Overview

ANTLR is code generator. It takes grammer file(.g4 extension ) as input and generates two classes: lexer and parser, and visitor (if required). Lexer runs first and splits input into pieces called tokens. The stream of tokens is passed to parser which do all necessary work. It is the parser who builds abstract syntax tree, interprets the code or translate it into some other form. The code can be generated in Java, Python and many other languages as we have seen in the tutorial before Most importantly, grammar file describes how to split input into tokens and how to build tree from tokens. In other words, grammar file contains lexer rules and parser rules. Each lexer rule describes one token:

TokenName: regular expression;

Parser rules are more complicated. The most basic version is similar as in lexer rule:

ParserRuleName: regular expression;

They may contain modifiers that specify special transformations on input, root and childs in result abstract syntax tree or actions to be performed whenever rule is used. Almost all work is usually done inside parser rules.

Hello Word

We will create simplest possible language parser – hello word parser. It builds a small abstract syntax tree from a single expression: ‘Hello word!’. We will use it to show how to create a grammar file and generate ANTLR classes from it. Then, we will show how to use generated files and create an unit test.