Update (Oct. 5, 2010): updated the code to use the current version of Standard Library (with universe polymorphism); index the environment by typing context.

It has been a typical exercise for beginners to write an evaluator for some variation of λ calculus. In a recent meeting I talked to my assistants about using dependent type to guarantee that, in an evaluator for λ calculus using de Bruin indices, that variable look-up always succeeds. The basic idea is to represent the environment using the Vec datatype, and label each λ terms with the number of enclosing λ’s, which must match the length of the environment.

I thought it could also be an easy exercsie for them to write up such an evaluator. Yet again, things turned out to be just a little bit more harder than I expected, thus I am recording the code here for reference. I do not remember where I have seen similar code before, but I believe the following is more or less standard.

Evaluating Simply-Typed λ Calculus

Take simply-typed λ calculus with natural numbers as the only base type. It is common in Haskell to use such a tagged value type:

data Val = Num Int | Fun (Val -> Val)



However, Val is not a legistimate type in Agda, since recursion occurs in negative position in the case of Fun . Indeed, Val -> Val is a “larger” type than Val , and we would need a more complicated semantics to accommodate the former into the latter.

Oleg Kiselyov suggested defunctionalising Val . For example, represent the value as either

data Val = Num Int | Closure [Val] Term



or

data Val n = Num n Int | Closure [Val m] (Val (n+1))



I may try these alternatives later. At the time when I wrote the code, the right thing to do seemed to be switching to a tagless representation for values and integrate type checking into the term type.

The type Term Γ a represents a term that, in the typing context Γ , evaluates to a value having type a . Since we are about to use the de Bruin notation, the typing context is simply a vector of types:

Cxt : ℕ → Set1

Cxt n = Vec Set n



Note that it is a vector storing types (in Set1 ) rather than values (in Set ). Before Agda introduced universe polymorphism, one would have to use Vec₁ , a Set1 variation of Vec .

The type Term Γ a can then be defined by:

data Term {n : ℕ} (Γ : Cxt n) : Set → Set1 where

lit : ℕ → Term Γ ℕ

var : (x : Fin n) → Term Γ (lookup x Γ)

_·_ : ∀ {a b} → Term Γ (a → b) → Term Γ a → Term Γ b

add : Term Γ ℕ → Term Γ ℕ → Term Γ ℕ

ƛ : ∀ {a b} → Term (a ∷ Γ) b → Term Γ (a → b)



The type Fin n denotes natural numbers between 0 and n-1 . Notice that in a (de Bruin) variable var x , the value of x is bounded by n , the length of the typing context. The lookup function is thus guaranteed to success. The typing context is extended with one more element within each ƛ . The type Vec , its look-up function, and the type Fin for bounded natural numbers, are all defined in the standard library:

data Vec {a} (A : Set a) : ℕ → Set a where

[] : Vec A zero

_∷_ : ∀ {n} (x : A) (xs : Vec A n) → Vec A (suc n) data Fin : ℕ → Set where

zero : {n : ℕ} → Fin (suc n)

suc : {n : ℕ} (i : Fin n) → Fin (suc n) lookup : ∀ {a n} {A : Set a} → Fin n → Vec A n → A

lookup zero (x ∷ xs) = x

lookup (suc i) (x ∷ xs) = lookup i xs



Now the evaluator. We need an environment allowing us to store heterogeneous values. One possibility is to index the environment by typing contexts: given a typing context Γ of length n , Env Γ is essentially a vector of n values whose types match those in Γ :

data Env : {n : ℕ} → Cxt n → Set1 where

[] : Env []

_∷_ : ∀ {a n} {Γ : Cxt n} → a → Env Γ → Env (a ∷ Γ)



Agda allows us to reuse the constructors [] and _∷_ . For an example, 0 ∷ 1 ∷ (λ v → suc v) ∷ [] may have type Env (ℕ ∷ ℕ ∷ (ℕ → ℕ) ∷ []) .

We also need a function that looks up an environment:

envLookup : ∀ {n}{Γ : Cxt n} → (i : Fin n) → Env Γ → lookup i Γ

envLookup zero (v ∷ ρ) = v

envLookup (suc n) (v ∷ ρ) = envLookup n ρ



Notice the resulting type lookup i Γ — the type of the i th element in an environment is the i th type in the typing context.

With the setting up, the evaluator looks rather standard:

eval : ∀ {a n} {Γ : Cxt n} → (ρ : Env Γ) → Term Γ a → a

eval ρ (lit m) = m

eval ρ (var x) = envLookup x ρ

eval ρ (e₁ · e₂) = (eval ρ e₁) (eval ρ e₂)

eval ρ (add e₁ e₂) = (eval ρ e₁) + (eval ρ e₂)

eval ρ (ƛ e) = λ v → eval (v ∷ ρ) e



An Environment as a Vector of Type-Value Pairs

In my earlier attempt I tried to reuse the Vec type for the environment as well. After all, an environment can be seen as a list of (type, value) pairs:

Env : ℕ → Set1

Env n = Vec (Σ Set (λ a → a)) n



where Σ is a pair where the type of its second component may depend on the first. The two projection functions of a pair are respectively proj₁ and proj₂ . In Σ Set (λ a → a) , the type of the second component is exactly the first, that is, we may have environments like ρ = (ℕ , 0) ∷ (ℕ , 1) ∷ (ℕ → ℕ , (λ v → suc v)) ∷ [] . To extract the i th value, we simply apply proj₂ (lookup i ρ) , which by the definition of Σ has type proj₁ (lookup i ρ) . Applying map proj₁ to ρ gives us the “typing context” Env (ℕ ∷ ℕ ∷ (ℕ → ℕ) ∷ []) . The eval function now has type:

eval : ∀ {a n} → (ρ : Env n) → Term (map proj₁ ρ) a → a

In dependently typed programming, however, it is not always easy to reuse general purpose datatypes. In this case, the price for reusing Vec and lookup is an additional proof obligation in the var x case of eval :

eval : ∀ {a n} → (ρ : Env n) → Term (map proj₁ ρ) a → a

eval ρ (lit m) = m

eval ρ (var x) rewrite lookup-proj x ρ = proj₂ (lookup x ρ)

eval ρ (e₁ · e₂) = eval ρ e₁ (eval ρ e₂)

eval ρ (add e₁ e₂) = (eval ρ e₁) + (eval ρ e₂)

eval ρ (ƛ e) = λ v → eval ((_ , v) ∷ ρ) e



The reason is that the right-hand side, proj₂ (lookup x ρ) , has type proj₁ (lookup x ρ) , while the type checker, having just matched the var constructor, expects something having type lookup x (map proj₁ ρ) . We thus have to convince Agda that they are actually the same type:

lookup-proj : ∀ {n} (x : Fin n) (ρ : Env n) →

lookup x (map proj₁ ρ) ≡ proj₁ (lookup x ρ)

lookup-proj zero (_ ∷ _) = refl

lookup-proj (suc i) (_ ∷ xs) = lookup-proj i xs



Higher-Order Abstract Syntax

My original purpose was to demonstrate the use of vectors to guarantee static properties. If we decide not to use de Bruin notation and switch to higher-order abstract syntax instead, we could have a much simpler evaluator:

data Term : Set → Set1 where

lit : ℕ → Term ℕ

var : ∀ {a} → a → Term a

_·_ : ∀ {a b} → Term (a → b) → Term a → Term b

add : Term ℕ → Term ℕ → Term ℕ

ƛ : ∀ {a b} → (a → Term b) → Term (a → b) eval : ∀ {a} → Term a → a

eval (lit n) = n

eval (var v) = v

eval (e₁ · e₂) = eval e₁ (eval e₂)

eval (add e₁ e₂) = eval e₁ + eval e₂

eval (ƛ e) = λ x → eval (e x)



It is a bit odd to me that var and lit are actually the same thing. Nothing forbids you from writing λ(\x -> lit x) or λ(\x -> var 3) .

Further Reading

Oleg Kiselyov has written a lot on typed tagless interpreters. Besides the initial representation above, there is a dual final representation. The purpose is to experiment with staged interpreters. Their paper Finally Tagless, Partial Evaluated: Tagless Staged Interpreters for Simpler Typed Languages is a good read which I would recommend to people.

Code