Last week I challenged myself to write a Tiny Basic interpreter in C, like one normally does. It was during this period that I wrote the Brainfuck interpreter (which you can find here) as something to take a break with, again, just a very normal thing to do. I don’t know what it is with interpreters and compilers that I always try to build them without ever completing any. I get very excited to build a new programming language or play around with an existing esoteric language that I start a project and start writing everything from scratch. This enthusiasm spike conveniently has the same period as that of vacations. This has lead to me having a couple of different lexers and partially completed parsers and code generators. This time, it wasn’t actually a challenge implementing the interpreter but actually completing it and in that regard I must say, I have made the most progress compared to my last 1623 attempts.

This implementation is far from perfect. I am not very confident about the parser, leads to a lot of seg faults, speaking of which, there is no error reporting. I guess that’s what you get for trying to write everything on your own, something I would never do again.

I reused a lexer I wrote for a language which I designed to be a subset of C. Similar syntax but without pointers (Yeah, not something one would use to build the next big OS). Tiny Basic has fewer keywords and a much simpler grammar. I had to only remove certain things from the old lexer to make it work with Tiny Basic. The lexer generates an array of Tokens

typedef struct Tokens { int type ; char * value ; } Token ;

where the value is the actual token and the type is something defined in the tokens.h file.

I found the grammar on Wikipedia.

line ::= number statement CR | statement CR statement ::= PRINT expr-list IF expression relop expression THEN statement GOTO expression INPUT var-list LET var = expression GOSUB expression RETURN CLEAR LIST RUN END expr-list ::= (string|expression) (, (string|expression) )* var-list ::= var (, var)* expression ::= (+|-|ε) term ((+|-) term)* term ::= factor ((*|/) factor)* factor ::= var | number | (expression) var ::= A | B | C ... | Y | Z number ::= digit digit* digit ::= 0 | 1 | 2 | 3 | ... | 8 | 9 relop ::= < (>|=|ε) | > (<|=|ε) | = string ::= " (a|b|c ... |x|y|z|A|B|C ... |X|Y|Z|digit)* "

Even though I initially started implementing the parser based on this grammar, later on I took some liberty in a few things like non-char variable names, omitting RUN, CLEAR and making END optional.

The parser is where all the magic takes place. It is a recursive descent parser (I have to mention that I have never done a compilers course so I didn’t look into the type of grammar before writing the parser) followed by evaluation. Before any parsing or evaluation takes place, there is a pass where all the labels are stored in an array where the index points to the line number (starts from 0). Not fancy, but gets the job done.

The symbol table is basically an array of structs.

typedef struct Symbols { char * variable ; int type ; int value ; } Symbol ;

Variables can store only ints for now. The symbol table is searched using linear search with the variable identifier as the key. Maps would have been really useful but I didn’t want to increase the number of lines any further. My eventual plan is to port it to work with NodeMCU/Arduino and when I get to it I might implement a map and make the code more efficient. For now, it just works.

Any assignment to a variable should be preceded by LET, even if the variable is already declared/initialized. Labels are optional and no syntax for loops yet, one has to make do with GOTO or GOSUB paired up with an IF condition for exiting the loop.

Stars in TB

I remember this being one of my first programs ever on a variant of Basic called RobotBasic. (PS - I am late nineties kid, Basic was not a big deal when I got to play with computers)

PRINT "How many stars do you want?" LET stars = 0 INPUT stars 10 IF stars == 0 THEN END PRINT "*" LET stars = stars - 1 GOTO 10

Fibonacci in TB

REM This code prints n fibonacci numbers LET a = 0 LET b = 1 LET n = 10 REM This is how loops work for now 100 PRINT a IF n == 0 THEN END LET t = b LET b = a + b LET a = t LET n = n - 1 GOTO 100

Infamous Hello World

10 PRINT "Hello World!" GOTO 10

You can find the code for the interpreter here.