Implementing LISP-like language in Delphi

This is a translation of the post I wrote a couple of years ago in Russian.

I was working on my homework for Programming Languages course at Coursera (absolutely fantastic course, by the way, I even managed to get a statement of accomplishment) and decided to try to build an interpreter of a simple programming language in Delphi. This language has a LISP-like semantics.

Each language element is an expression, that can be evaluated. It other words, this language doesn’t have procedures or statements, functions only.

The most primitive expression is a number. It evaluates to itself.

(Number 5) -> (Number 5)

Other expresions are more useful. For instance, Add function works this way:

(Add (Number 2) (Number 3)) -> (Number 5)

or

(Add (Number 2) (Add (Number 1) (Number 3))) -> (Number 6)

Obviously, there can be any degree of function nesting and before a function is evaluated its parameters values have to be evaluated first. We are dealing with a tree and its evaluation has to be recursive.

In Delphi this expression tree can be represented as a tree of objects that implement the following interface.

IExpression = interface function Evaluate: IExpression; end;

So let’s implement classes for expressions above (I will not dig deeper in the details, you can download the full source code).

constructor TNumber.Create(AValue: Integer); begin inherited Create; FValue := AValue; end; function TNumber.Evaluate: IExpression; begin Result := Self; end;

Number expression stores its value and evaluates to itself.

Add expression is a bit more complicated. This class takes two expressions, evaluates them and then calculates the sum.

constructor TAdd.Create(AValue1, AValue2: IExpression); begin inherited Create; FExprs.Add(AValue1); FExprs.Add(AValue2); end; function TAdd.Evaluate: IExpression; var Expr1, Expr2: IExpression; Val1, Val2: IHasValue; begin Expr1 := FExprs[0].Evaluate; Expr2 := FExprs[1].Evaluate; if Supports(Expr1, IHasValue, Val1) and Supports(Expr2, IHasValue, Val2) then Result := TNumber.Create(Val1.Value + Val2.Value) else raise EExprException.Create('Invalid expression applied to TAdd'); end;

IHasValue interface is used to check if an expression has a value and to get the value.

IHasValue = interface ['{567A6313-3ABE-4620-9560-64F93BC4979A}'] function GetValue: Variant; property Value: Variant read GetValue; end;

I use Variant because I may decide to support other types (other than numbers) in the future.

Also I have implemented factory-functions. They do nothing special, just make my life a bit more convenient :)

function Number(AValue: Integer): IExpression; begin Result := TNumber.Create(AValue); end; function Add(Expr1, Expr2: IExpression): IExpression; begin Result := TAdd.Create(Expr1, Expr2); end;

And for debugging purposes I added AsString property to IExpression interface. It returns object’s representation as a string.

For instance,

function TNumber.GetAsString: string; begin Result := Format('(%s %d)', [Self.ClassName, FValue]); end;

TNumber, TAdd and other classes in this post have this method.

OK, finally that should be enough to evaluate one of the examples above.

var Test: IExpression; begin Test := Add(Number(2), Add(Number(1), Number(3))); Edit1.Text := Test.AsString; Edit2.Text := Test.Evaluate.AsString; end;

After executing this code Edit1.Text value is

(TAdd (TNumber 2) (TAdd (TNumber 1) (TNumber 3)))

and Edit2.Text value is

(TNumber 6)

Bingo! :)

This is cool, but absolutely useless.

What’s the difference between a real programming language and this little demo? Real programming languages support variables and scopes. Abstract syntax tree is not enough. Syntax tree has to be executed in some environment.

Let’s think about variables and scopes. What is a variable? It has a name and a value. In other words, it is a name and some associated expression (in this language everything is expression).

That’s why I need IEnvironment interface.

IEnvironment = interface function GetValue(const AName: string): IExpression; function SetValue(const AName: string; AExpr: IExpression): IEnvironment; end;

Everything is clear with GetValue method, but I will pay a little bit more attention to SetValue. When we declare a variable, it has to be visible to the current tree node and to its childs, but not for parent node or above. Therefore, in order not to spoil the environment of caller, when declare a variable we essentially create a new copy of the environment and put it into its own independent life within the current scope.

function TEnvironment.SetValue(const AName: string; AExpr: IExpression): IEnvironment; var EnvPair: TPair<string, IExpression>; NewEnv: TEnvironment; begin NewEnv := TEnvironment.Create; for EnvPair in FEnv do NewEnv.FEnv.Add(EnvPair.Key, EnvPair.Value); NewEnv.FEnv.AddOrSetValue(AName, AExpr); Result := NewEnv; end;

It would be better to not make a full copy of environment, but I want to make it as simple as possible.

Because now an environment must be taken into account when an expression is evaluated, IExpression interface has to slightly change.

IExpression = interface function GetAsString: string; function Evaluate: IExpression; overload; function Evaluate(Env: IEnvironment): IExpression; overload; property AsString: string read GetAsString; end;

Evaluate without a parameter just runs the evaluation within the empty environment.

function TExpression.Evaluate: IExpression; begin Result := Evaluate(TEnvironment.Create); end;

Now we are ready to implement variables.

constructor TVariable.Create(AName: string); begin inherited Create; FName := AName; end; function TVariable.Evaluate(Env: IEnvironment): IExpression; begin Result := Env.GetValue(FName); end;

This class stores variable name and return an associated expression from the current environment.

Well, you may notice that now this language has variables, but doesn’t have a syntax to declare them. Let’s implement LISP-like function let.

(let [varname varvalue] body)

This function binds a name (varname) and expression (varvalue) and then evaluates the body in a just created new environment. If it’s not clear, just take a look at code below.

constructor TLet.Create(const AVarName: string; AVarValue, ABody: IExpression); begin inherited Create; FVarName := AVarName; FExprs.Add(AVarValue); FExprs.Add(ABody); end; function TLet.Evaluate(Env: IEnvironment): IExpression; var VarValue: IExpression; begin VarValue := FExprs[0].Evaluate(Env); Result := FExprs[1].Evaluate(Env.SetValue(FVarName, VarValue)); end;

Here the value of a variable is calculated within an external environment and the body expression is calculated within a just created new environment.

It is a good time for a small test.

var Test: IExpression; begin Test := Let('N', Number(5), Add(Variable('N'), Add(Number(1), Number(3)))); Edit1.Text := Test.AsString; Edit2.Text := Test.Evaluate.AsString; end;

Edit1.Text value is

(TLet [N (TNumber 5)] (TAdd (TVariable N) (TAdd (TNumber 1) (TNumber 3))))

Edit2.Text value is

(TNumber 9)

Nice :)

But that’s not all. Useful language must support functions. Let’s think about them. For simplicity, suppose that the function will have only one parameter. Actually, it does not impose any restrictions on the language, but that’s not important right now.

First, the functions need to be declared and, secondly, to be called. These are two different actions. Let it be TDefineFunc and TCallFunc.

Obviously, the function is an expression. That is, those variables that we have – it is a function without parameters. The difference is in the fact that the value of the variable is evaluated immediately, but the evaluation of the function at the moment of its declaration makes little sense. Another important caveat is that the function must be evaluated in the environment in which it is declared (plus the parameter value, of course), but not within the environment in which it is called. This is so-called lexical scope – the approach adopted in most programming languages.

This leads us to a simple thought. The result of TDefineFunc evaluation should be an object that contains a function’s expression body, its environment and, of course, the name of the parameter. And then evaluating TCallFunc we assign the parameter value and evaluate the body. Let’s call the object returned by TDefineFunc – TClosure.

constructor TClosure.Create(AFunc: IExpression; AEnv: IEnvironment; const AParamName: string); begin inherited Create; FEnv := AEnv; FParamName := AParamName; FExprs.Add(AFunc); end; function TClosure.EvaluateClosure(AParamValue: IExpression): IExpression; begin Result := FExprs[0].Evaluate(FEnv.SetValue(FParamName, AParamValue)); end;

As I said above, it is aware about the environment, the parameter name, and also has a link to the body of the function. The Evaluate method in this case is different from other classes because it uses the previously saved environment with an addition of a parameter value.

Thus, TDefineFunc looks simple enough.

constructor TDefineFunc.Create(const AParamName: string; ABody: IExpression); begin inherited Create; FParamName := AParamName; FExprs.Add(ABody); end; function TDefineFunc.Evaluate(Env: IEnvironment): IExpression; begin Result := TClosure.Create(FExprs[0], Env, FParamName); end;

And TCallFunc is a bit more complicated.

constructor TCallFunc.Create(const AFuncName: string; AParamValue: IExpression); begin inherited Create; FFuncName := AFuncName; FExprs.Add(AParamValue); end; function TCallFunc.Evaluate(Env: IEnvironment): IExpression; var FuncExpr, ParamVal: IExpression; Closure: IClosure; begin ParamVal := FExprs[0].Evaluate(Env); FuncExpr := Env.GetValue(FFuncName); if Supports(FuncExpr, IClosure, Closure) then Result := Closure.EvaluateClosure(ParamVal) else raise EExprException.Create('Invalid expression applied to TCallFunc'); end;

TCallFunc takes a function name and a parameter, looks for an appropriate TClosure bound to a name in the environment, and then evaluates the value, passing the argument.

One more test.

var Test: IExpression; begin Test := Let('Add2', DefineFunc('N', Add(Variable('N'), Number(2))), CallFunc('Add2', Number(3))); Edit1.Text := Test.AsString; Edit2.Text := Test.Evaluate.AsString; end;

Here we declare a function with parameter N that returns N+2, bind it to Add2 name, and then call it with parameter equal 3. The result should be equal 5, I guess.

Edit1:

(TLet [Add2 (TDefineFunc N (TAdd (TVariable N) (TNumber 2)))] (Add2 (TNumber 3)))

Edit2:

(TNumber 5)

Very nice :)

This language can do a lot :) But I must confess that my initial goal was to write at least a calculation of factorial. In current language implementation it’s not possible. Why? Because it does not support recursion. At the time the function is declared there is no information about itself in the current environment, it appears in the environment only after that. That is, having only a copy of the environment before the function is declared, the function cannot call itself.

That’s why I slightly changed the evaluation of Let. Looks like a hack, you can do better.

function TLet.Evaluate(Env: IEnvironment): IExpression; var VarValue: IExpression; Closure: IClosure; begin VarValue := FExprs[0].Evaluate(Env); if Supports(VarValue, IClosure, Closure) then Closure.Env := Closure.Env.SetValue(FVarName, VarValue); Result := FExprs[1].Evaluate(Env.SetValue(FVarName, VarValue)); end;

We add a link to its name and its body to TClosure’s saved environment.

We are ready for the final test. I’ve added a few more functions (TSub, TMul, TEquals and TIfThenElse). They are pretty straightforward.

function TSub.Evaluate(Env: IEnvironment): IExpression; var Expr1, Expr2: IExpression; Val1, Val2: IHasValue; begin Expr1 := FExprs[0].Evaluate(Env); Expr2 := FExprs[1].Evaluate(Env); if Supports(Expr1, IHasValue, Val1) and Supports(Expr2, IHasValue, Val2) then Result := TNumber.Create(Val1.Value - Val2.Value) else raise EExprException.Create('Invalid expression applied to TSub'); end; function TMul.Evaluate(Env: IEnvironment): IExpression; var Expr1, Expr2: IExpression; Val1, Val2: IHasValue; begin Expr1 := FExprs[0].Evaluate(Env); Expr2 := FExprs[1].Evaluate(Env); if Supports(Expr1, IHasValue, Val1) and Supports(Expr2, IHasValue, Val2) then Result := TNumber.Create(Val1.Value * Val2.Value) else raise EExprException.Create('Invalid expression applied to TMul'); end; // Returns 1, if expressions are equal or 0 otherwise function TEquals.Evaluate(Env: IEnvironment): IExpression; var Expr1, Expr2: IExpression; Val1, Val2: IHasValue; begin Expr1 := FExprs[0].Evaluate(Env); Expr2 := FExprs[1].Evaluate(Env); if Supports(Expr1, IHasValue, Val1) and Supports(Expr2, IHasValue, Val2) then begin if Val1.Value = Val2.Value then Result := TNumber.Create(1) else Result := TNumber.Create(0); end else raise EExprException.Create('Invalid expression applied to TEquals'); end; // IfThenElse e1 e2 e3 // Returns e2, if e1 > 0, or e3 otherwise function TIfThenElse.Evaluate(Env: IEnvironment): IExpression; var Expr1: IExpression; Val1: IHasValue; begin Expr1 := FExprs[0].Evaluate(Env); if Supports(Expr1, IHasValue, Val1) then begin if Val1.Value > 0 then Result := FExprs[1].Evaluate(Env) else Result := FExprs[2].Evaluate(Env); end else raise EExprException.Create('Invalid expression applied to TIfThenElse'); end;

And the test itself. Let’s calculate factorial of 10.

var Test: IExpression; begin Test := Let('Factorial', DefineFunc('N', IfThenElse(Eq(Variable('N'), Number(0)), Number(1), Mul(Variable('N'), CallFunc('Factorial', Sub(Variable('N'), Number(1)))))), CallFunc('Factorial', Number(10))); Edit1.Text := Test.AsString; Edit2.Text := Test.Evaluate.AsString; end;

The Edit2.Text value is “(TNumber 3628800)”. Very very nice :)

The next step is to allow the user to write something like:

Let(Factorial, DefineFunc(N, IfThenElse(Eq(N, 0), 1, Mul(N, Factorial(Sub(N, 1))))), Factorial(10))

to parse the text automatically, build a syntax tree and evaluate. Parsing of this sort of syntax is quite simple. You can go ahead and make the syntax more friendly. Personally, I would prefer to read a function like so:

let fun Factorial N = if Eq(N, 0) then 1 else Mul(N, Factorial(Sub(N, 1)) do Factorial(10) end

It looks different, but semantically it’s still the same. Anyways this is a topic for another post.

And finally you can download the full source code.