Parsing

My parser is built on top of the output of the lexer introduced in the previous blog post. To recap quickly, lexing JavaScript produces an array of tokens that looks like:

KFunction,

Identifier("name"),

LBrace,

Identifier("console"),

Dot,

Identifier("log"),

LRound,

String("hello"),

RRound,

Semicolon,

RBrace,
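To make the recap concrete, here is a minimal sketch of how such a token array could be modelled and walked with a cursor. It is written in Python purely for illustration; `Token`, `TokenStream`, `peek` and `accept` are hypothetical names and not the lexer's actual API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    kind: str                    # e.g. "KFunction", "Identifier", "String"
    value: Optional[str] = None  # payload for Identifier/String tokens


class TokenStream:
    """Cursor over a token list (hypothetical helper for illustration)."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        """Return the next token without consuming it, or None at the end."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def accept(self, kind):
        """Consume and return the next token if it has this kind, else None."""
        tok = self.peek()
        if tok is not None and tok.kind == kind:
            self.pos += 1
            return tok
        return None


# The first few tokens from the listing above.
tokens = [
    Token("KFunction"),
    Token("Identifier", "name"),
    Token("LBrace"),
]
```

A parser built on this would call `accept` whenever the grammar expects a particular token and fall back to another rule when it returns `None`.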

The parser was built following the ECMAScript specification. ECMAScript is defined by a context-free grammar, with a few exceptions. An example rule for parsing white space looks like:

Each line is one alternative for what counts as white space; in this case a tab, a space, or a non-breaking space. These are called terminals and can be visualized as the leaves of a tree. When the parser reaches a terminal, it has to consume one token from the lexer, and that token has to match the terminal; if it does not, the parser tries another rule. The grammar also uses non-terminals. A non-terminal is a placeholder for another rule that will be substituted in its place; these are the inner nodes of the tree. For example, in the grammar:

S::
    A a

A::
    b
    c

We define S and A as non-terminals and a, b, c as terminals. If we start in S, we first rewrite the non-terminal A to either the terminal b or the terminal c, and then continue by consuming a. The grammar therefore accepts exactly two inputs: b a and c a.
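The S/A grammar is small enough to sketch as a hand-written recursive descent parser. The version below is an illustration in Python, not the parser from this post: tokens are plain one-character strings, each non-terminal becomes a function, and `parse_a`, `parse_s` and `parse` are hypothetical names. Terminals end up as leaves of the returned tuple tree and non-terminals as its inner nodes.

```python
def parse_a(tokens, pos):
    """A:: b | c. Try each alternative in turn.

    Returns (node, next_pos) on success, or None if no alternative matches.
    """
    for terminal in ("b", "c"):
        if pos < len(tokens) and tokens[pos] == terminal:
            return ("A", terminal), pos + 1  # consume one token
    return None


def parse_s(tokens, pos=0):
    """S:: A a. First expand the non-terminal A, then consume the terminal a."""
    result = parse_a(tokens, pos)
    if result is None:
        return None
    a_node, pos = result
    if pos < len(tokens) and tokens[pos] == "a":
        return ("S", a_node, "a"), pos + 1
    return None


def parse(tokens):
    """Return the parse tree if the whole input matches S, otherwise None."""
    result = parse_s(tokens)
    if result is None:
        return None
    tree, pos = result
    return tree if pos == len(tokens) else None
```

Calling `parse(["b", "a"])` yields the tree `("S", ("A", "b"), "a")`, while inputs such as `["a", "a"]` or `["b"]` are rejected with `None`.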