I've always been fascinated by compilers. As a Java geek, I'm also quite interested in the JVM. To learn a little more about both, and as a way to contribute to the open source world, I decided to implement a compiler for BASIC. The result, jvmBasic, consumes BASIC code and emits .class files.

The first step was to build a lexer and parser for BASIC. I decided to define an ANTLR4 grammar and use it to generate both. BASIC is a fairly simple language, so the grammar was not difficult to define. However, there are numerous BASIC dialects, so I had to pick a simple one. jvmBASIC syntax looks much like Integer BASIC, but could easily be extended to parse GW-BASIC, or maybe VB. The resulting grammar is here.

Once ANTLR has generated a parser and lexer, it's possible to generate a parse tree for any BASIC input and then walk the tree emitting bytecode. I used ASM to emit the bytecode. An example BAS input file looks like this:

100 PRINT "Hello world"

The parse tree generated for this input, as shown in the jvmBASIC debug output, looks like this:

- [1 line]
- [3 linenumber]
- [120 NUMBER] 100
- [4 amprstmt]
- [5 statement]
- [7 printstmt1]
- [4 'PRINT'] PRINT
- [8 printlist]
- [66 expression]
- [60 func]
- [118 STRINGLITERAL] "Hello world"
- [122 CR]
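To make the "walk the tree, emit bytecode" step more concrete, here is a minimal sketch in Java. It does not use ANTLR or ASM: SketchNode and SketchEmitter are hypothetical names invented for this illustration, the nesting of the tree is my assumption (the debug output above does not show depth), and the "bytecode" is just a list of JVM mnemonics as strings. The real jvmBasic code generator may well emit different instructions, for example calls into its runtime.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified parse-tree node: a rule or token name,
// optional token text, and child nodes.
final class SketchNode {
    final String name;
    final String text;
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String name, String text, SketchNode... kids) {
        this.name = name;
        this.text = text;
        for (SketchNode kid : kids) {
            children.add(kid);
        }
    }
}

// Walks the tree depth-first and records one textual instruction per step.
final class SketchEmitter {
    final List<String> instructions = new ArrayList<>();

    void walk(SketchNode node) {
        switch (node.name) {
            case "printstmt1":
                // Push the target stream, then the operands, then call println.
                instructions.add("getstatic java/lang/System.out");
                for (SketchNode child : node.children) {
                    walk(child);
                }
                instructions.add("invokevirtual java/io/PrintStream.println");
                break;
            case "STRINGLITERAL":
                // A string constant becomes a single constant-load instruction.
                instructions.add("ldc " + node.text);
                break;
            default:
                // Structural nodes emit nothing themselves; just descend.
                for (SketchNode child : node.children) {
                    walk(child);
                }
        }
    }
}
```

Walking a tree built for the "Hello world" line yields three instructions: load System.out, load the string constant, and invoke println.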

Because BASIC has no concept of functions, methods or classes, I chose to enclose the generated code in a single method of a single class. The class name is the name of the BASIC input file, and the single method is:

public static void main(String[] args)

The class has two fields:

public InputStream inputStream;
public OutputStream outputStream;

The default values of inputStream and outputStream are System.in and System.out respectively. However, in the case of jvmbasicwww, I replace them with HTTP input and output streams.
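The value of keeping the streams as public fields is that a host can swap them before running the program. The sketch below illustrates the idea with an in-memory buffer standing in for an HTTP response stream. SketchProgram and StreamSwapDemo are hypothetical names, not generated code, and I have typed the output field as PrintStream to keep the example short.

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.PrintStream;

// Hypothetical stand-in for a compiled BASIC program with swappable streams.
final class SketchProgram {
    public InputStream inputStream = System.in;    // default: console input
    public PrintStream outputStream = System.out;  // default: console output

    // Equivalent of: 100 PRINT "Hello world"
    public void program() {
        outputStream.println("Hello world");
    }
}

final class StreamSwapDemo {
    // Runs the program with its output redirected into a buffer and
    // returns everything the program printed.
    static String runCaptured() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        SketchProgram program = new SketchProgram();
        program.outputStream = new PrintStream(buffer, true);
        program.program();
        return buffer.toString();
    }
}
```

A web front end could do the same thing with the servlet response's output stream in place of the buffer.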

BASIC doesn't have new, delete, malloc, or free, or really any analogue of those. Additionally, functions such as MID$ or VAL have specific semantics and behaviour. In order to emulate BASIC as closely as possible, I implemented a runtime library, jvmbasicrt. Inside jvmbasicrt are implementations of each BASIC function, as well as a class called ExecutionContext. ExecutionContext includes the "guts" of a BASIC runtime:

A stack. Similar to many programming languages, BASIC needs a stack.

All variables. This is simply a hashtable of Values, keyed on the variable name.

Additionally there is Value which implements a variable with BASIC semantics.
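As a rough illustration of how a stack, a variable table and a value type fit together, here is a minimal sketch. SketchValue and SketchContext are hypothetical names and their methods are my own invention; the real ExecutionContext and Value classes in jvmbasicrt will differ.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical BASIC value: holds either a number or a string.
final class SketchValue {
    private final Double number; // non-null for numeric values
    private final String text;   // non-null for string values

    private SketchValue(Double number, String text) {
        this.number = number;
        this.text = text;
    }

    static SketchValue of(double n) { return new SketchValue(n, null); }
    static SketchValue of(String s) { return new SketchValue(null, s); }

    boolean isNumber() { return number != null; }

    double asNumber() {
        if (number == null) throw new IllegalStateException("Type mismatch");
        return number;
    }

    // "+" adds two numbers or concatenates two strings; mixing is an error.
    SketchValue plus(SketchValue other) {
        if (isNumber() && other.isNumber()) return of(number + other.number);
        if (!isNumber() && !other.isNumber()) return of(text + other.text);
        throw new IllegalStateException("Type mismatch");
    }
}

// Hypothetical execution context: an operand stack plus a variable table.
final class SketchContext {
    private final Deque<SketchValue> stack = new ArrayDeque<>();
    private final Map<String, SketchValue> variables = new HashMap<>();

    void push(SketchValue value) { stack.push(value); }

    SketchValue pop() { return stack.pop(); }

    // LET name = <top of stack>
    void store(String name) { variables.put(name, pop()); }

    // Push the current value of a variable onto the stack.
    void load(String name) {
        SketchValue value = variables.get(name);
        if (value == null) throw new IllegalStateException("Undefined variable " + name);
        push(value);
    }

    // Replace the top two stack entries with their sum.
    void add() {
        SketchValue right = pop();
        SketchValue left = pop();
        push(left.plus(right));
    }
}
```

With this shape, a statement such as LET y = 1+2 becomes: push 1, push 2, add, store "y". Generated code then only needs to call a handful of runtime methods rather than inline the semantics of every operator.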

There is a Maven mojo which wraps jvmbasicc. The mojo, jvmbasicmojo, compiles all BASIC files in "/src/main/bas" and produces a .class file for each one. This mojo can be used to incorporate BASIC files into any normal Maven project and then package them into a .jar file.

An additional example BASIC file is:

10 REM this is a comment
20 PRINT "13"
30 PRINT "hi"
40 PRINT 10
50 PRINT 15.55
60 LET x = 12
70 PRINT "hihi"
80 PRINT x
90 LET y = 1+2
100 LET z = 3*6
110 LET d= y+z
120 PRINT d

The Maven pom file that uses jvmbasicmojo is here.

The javap output for the generated .class file is:

public class EXAMPLE1 {
  public com.khubla.jvmbasic.jvmbasicrt.ExecutionContext executionContext;
  public java.io.InputStream inputStream;
  public java.io.PrintStream outputStream;
  public EXAMPLE1();
  public static void main(java.lang.String[]);
  public void program() throws java.lang.Exception;
}

There isn't a big demand, that I'm aware of, for bytecode compilers for BASIC. Two potential applications that come to mind are: