When representing syntax we have at least two options for variables:

variable names, which are convenient to read and write but a nightmare to implement, or

De Bruijn indices, which avoid most of the issues of variable names, but really aren’t human-friendly.

Case in point:

“de Bruijn syntax is a Cylon detector” – @pigworker at BCTCS pic.twitter.com/17d9BJAoZe — Edwin Brady (@edwinbrady) April 27, 2017

Can we come up with an approach that combines the advantages of both?

I think so, with a strong enough type system and sprinkling of proof search.

This blog post is literate Idris.

> module Lambda > import Data.List

To keep it simple, we will focus only on untyped -calculus. Our idea is to define an inductive type family Term xs indexed by a list of free variables xs :

> data Term : List String -> Type where > App : Term xs -> Term xs -> Term xs > Lam : (x : String) -> Term (x :: xs) -> Term xs > Var : (x : String) -> {auto p : Elem x xs} -> Term xs

The constructors are application, lambda, and variables. For lambdas, we introduce a new variable which can then be free in the body of the lambda. For variables, we can use the variable name x as long as x is in the list of free variables xs . The {auto p ...} syntax tells Idris to perform a proof search, but you can always supply a proof manually.

The Elem x xs type is defined in Data.List as follows:

data Elem : a -> List a -> Type where Here : Elem x (x :: xs) There : Elem x xs -> Elem x (y :: xs)

A value of type Elem x xs tells us not only that x is in xs , but it also gives us the position of x in xs . Thus the Elem x xs argument represents the De Bruijn index of the variable. This serves two purposes:

It ensures that the variable name x is in the list of free variables xs . In the case where a name is repeated in xs , it also tells us which variable is meant. For example, Var "x" {p=Here} and Var "x" {p=There Here} are both valid terms in Term ["x","x"] , and they refer to two different variables, even though the variables have the same name.

So it’s nice to have an Elem x xs around. But we definitely don’t want to write values of Elem , so we ask the compiler to do it for us. We can be the humans, and let the compiler be the machine.

For example, here are the SKI combinators, with no indices in sight:

> termI : Term xs > termI = Lam "x" (Var "x") > > termK : Term xs > termK = Lam "x" (Lam "y" (Var "x")) > > termS : Term xs > termS = Lam "x" (Lam "y" (Lam "z" > (App (App (Var "x") (Var "z")) > (App (Var "y") (Var "z")))))

It looks nice. That’s the idea.

The rest of the post will be implementation details.

Shifting. Relaxing the type of a term by adding more free variables is akin to shifting in De Bruijn notation. These definitions rely on the library function

dropElem : (xs : List t) -> Elem x xs -> List t

which drops one element of a list. Shifting is defined as follows:

> undropElem : (q : Elem y ys) -> (p : Elem x (dropElem ys q)) -> Elem x ys > undropElem Here p = There p > undropElem (There q') Here = Here > undropElem (There q') (There p') = There (undropElem q' p') > > shift : (q : Elem y ys) -> Term (dropElem ys q) -> Term ys > shift q (App a b) = App (shift q a) (shift q b) > shift q (Lam x a) = Lam x (shift (There q) a) > shift q (Var x {p}) = Var x {p=undropElem q p} > > shift1 : Term xs -> Term (x :: xs) > shift1 = shift Here

Substitution. To define substitution, we first need a version of dropElem that also drops all the elements before the given element. Thus:

> dropManyElem : (ys : List t) -> (q : Elem y ys) -> List t > dropManyElem [] Here impossible > dropManyElem [] (There _) impossible > dropManyElem (x :: xs) Here = xs > dropManyElem (x :: xs) (There q') = dropManyElem xs q'

Replacing a free variable with a term is always tricky in De Bruijn notation (and it’s even harder without them). Fortunately our reliance on Elem forces us to do the right thing; the types prevent us from having index-manipulation bugs. The tricky case is Var, which we handle separately.

> substVar : (q : Elem y ys) -> Term (dropManyElem ys q) -> (x: String) -> (p : Elem x ys) -> Term (dropElem ys q) > substVar Here t x Here = t > substVar Here t x (There p') = Var x {p=p'} > substVar (There q') t x Here = Var x {p=Here} > substVar (There q') t x (There p') = shift1 (substVar q' t x p') > > subst : (q : Elem y ys) -> Term (dropManyElem ys q) -> Term ys -> Term (dropElem ys q) > subst q t (App a b) = App (subst q t a) (subst q t b) > subst q t (Lam x a) = Lam x (subst (There q) t a) > subst q t (Var x {p}) = substVar q t x p > > subst1 : (x : String) -> Term xs -> Term (x :: xs) -> Term xs > subst1 _ = subst Here

Renaming Variables. Renaming variables is a bit different from substitution, because we aren’t dropping a free variable, just replacing it. We need a helper function to replace elements from a list.

> replaceElem : (ys : List t) -> (q : Elem y ys) -> (x : t) -> List t > replaceElem (y :: xs) Here x = x :: xs > replaceElem (y :: xs) (There later) x = y :: replaceElem xs later x > > renameVar : (q : Elem y xs) -> (x : String) -> (z : String) -> (p:Elem z xs) -> Term (replaceElem xs q x) > renameVar Here x z Here = Var x > renameVar Here x z (There p') = Var z {p=There p'} > renameVar (There q') x z Here = Var z {p=Here} > renameVar (There q') x z (There p') = shift1 (renameVar q' x z p') > > rename : (q : Elem y xs) -> (x : String) -> Term xs -> Term (replaceElem xs q x) > rename q x (App a b) = App (rename q x a) (rename q x b) > rename q x (Lam z a) = Lam z (rename (There q) x a) > rename q x (Var z {p}) = renameVar q x z p > > rename1 : (x : String) -> Term (y :: xs) -> Term (x :: xs) > rename1 = rename Here

Alpha Equivalence. The easiest way to compute alpha equivalence is to ignore names and compare indices instead. We need a helper function to convert Elem proofs into indices.

> elemToIndex : Elem x xs -> Nat > elemToIndex Here = 0 > elemToIndex (There h) = S (elemToIndex h) > > alphaEq : Term xs -> Term ys -> Bool > alphaEq (App a1 b1) (App a2 b2) = alphaEq a1 a2 && alphaEq b1 b2 > alphaEq (Lam x1 a1) (Lam x2 a2) = alphaEq a1 a2 > alphaEq (Var x1 {p=p1}) (Var x2 {p=p2}) = elemToIndex p1 == elemToIndex p2 > alphaEq _ _ = False

Showing A Term. Turning a term into a string is not as easy as it may seem. We could let Idris derive a Show instance for us, but in the process we lose the ability to distinguish between Lam "x" (Lam "x" (Var "x" {p=Here})) and Lam "x" (Lam "x" (Var "x" {p=There Here})) . To get around this, we roll our own show function that generates fresh variable names as need be. Of course we first need a function to generate fresh names if necessary. Here is one possible implementation:

> freshen : String -> List String -> String > freshen x xs = if elem x xs then aux [0 .. length xs] else x > where aux : List Nat -> String > aux [] = x -- should never happen > aux (i :: is) = > if elem (x ++ show i) xs > then aux is > else x ++ show i

Then we define showRenamed which shows the structure of a term while (possibly) renaming free variables. Then show is just

> showRenamed : List String -> Term xs -> String > showRenamed ys (App a b) = concat ["(App", showRenamed ys a, " ", showRenamed ys b, ")"] > showRenamed ys (Var x {p}) = concat ["(Var ", show x', ")"] > where x' : String > x' = fromMaybe x (index' (elemToIndex p) ys) > > showRenamed {xs} ys (Lam x a) = concat ["(Lam ", show x', " ", showRenamed ys' a, ")"] > where x' : String > x' = freshen x (ys ++ xs) > > ys' : List String > ys' = x' :: ys > > Show (Term xs) where > show = showRenamed []

In Conclusion. We developed a representation of terms in the untyped -calculus that combines variable names with De Bruijn indices. These terms are friendly to read and write, and are relatively easy to manipulate while avoiding capture and subtle index bugs.

Exercise. Develop a similar representation for the simply-typed -calculus (STLC), with typed free variables. The Term type family should have type Ctx -> Typ -> Type where Ctx is a list of free variables with types, and Typ is the type of types in STLC.