Update 28/09/2014: Agda snippets were updated to work with the latest standard library and Agda version.

Note: The Agda snippets in this post are interactive! Click any identifier to jump to its definition, be it in the standard library or earlier in this post.

As someone who formalises many programming language semantics and type systems, I am perennially interested in elegant representations of terms and judgements in proof assistants. This area has been thoroughly investigated, and no universal solution has been found (although I get the feeling that Beluga is a step in the right direction). In this post, I will explore some representational tricks that can be used to elegantly encode calculi with substructural type systems, such as linear types, and in the process share with you a couple of the formalisations I’ve been working with.

To start with, we’ll look at representing the simply typed lambda calculus of Church [1940]:

Representing these terms directly, using names, is a cumbersome burden in a proof assistant. Care must be taken to avoid capture in substitution, and, more importantly, α-equivalent terms are not definitionally equal.

Instead, a common technique is to represent variable names as natural numbers, which denote the number of binders in scope between the name and its corresponding binder. For example, the following image represents the lambda term:

These indices, named after their inventor de Bruijn [1972], fit neatly into a dependently typed formalisation, with a little twist:

Instead of representing variable names as plain natural numbers, we parameterise terms by the number of available variables, n, and then only allow available variables to be referenced inside the term, using the type of finite sets to represent variable names. This way, we tame some of the error-prone nature of de Bruijn indices by assigning different types to terms with differing numbers of available variables. Closed terms are represented as values of type Term 0. I first encountered this trick in McBride [2003], but I know not from whence it originated.
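Concretely, the well-scoped term type can be rendered like this (a sketch; the constructor names are illustrative):

```agda
open import Data.Nat using (ℕ; suc)
open import Data.Fin using (Fin)

data Term (n : ℕ) : Set where
  var : Fin n → Term n            -- one of the n variables in scope
  lam : Term (suc n) → Term n     -- the body gains one extra variable
  app : Term n → Term n → Term n
```

Ill-scoped terms are simply unrepresentable in this datatype.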

Representing the above typing rules in Agda is also nice, using length-indexed vectors of types to represent the type environment:
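For instance, with a minimal type language, the judgement can be rendered along these lines (a sketch, assuming the `Term` type described above; note that `lookup` takes the vector first in recent standard library versions):

```agda
open import Data.Vec using (Vec; _∷_; lookup)

data Type : Set where
  ι    : Type                 -- a base type
  _⇒_  : Type → Type → Type   -- function types

data _⊢_∶_ : ∀ {n} → Vec Type n → Term n → Type → Set where
  tvar : ∀ {n} {Γ : Vec Type n} (i : Fin n) →
         Γ ⊢ var i ∶ lookup Γ i
  tlam : ∀ {n} {Γ : Vec Type n} {τ₁ τ₂ t} →
         (τ₁ ∷ Γ) ⊢ t ∶ τ₂ →
         Γ ⊢ lam t ∶ (τ₁ ⇒ τ₂)
  tapp : ∀ {n} {Γ : Vec Type n} {τ₁ τ₂ t u} →
         Γ ⊢ t ∶ (τ₁ ⇒ τ₂) → Γ ⊢ u ∶ τ₁ →
         Γ ⊢ app t u ∶ τ₂
```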

This niceness all goes out the window, however, as soon as you start messing with the structure of contexts. In this post, I’ll be looking at the linear lambda calculus. For an approachable introduction to linear types, I highly recommend Wadler [1990], but I will give a brief introduction here.

In the typing rules above, we were treating our environment as though it was a set of type assignments. An alternative presentation uses a list of assignments:
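In sequent-style notation, the rules with a list context read:

```latex
\frac{}{x : A \vdash x : A}\;\textsc{Var}
\qquad
\frac{\Gamma, x : A \vdash t : B}{\Gamma \vdash \lambda x.\, t : A \to B}\;\textsc{Lam}
\qquad
\frac{\Gamma_1 \vdash t : A \to B \qquad \Gamma_2 \vdash u : A}
     {\Gamma_1 , \Gamma_2 \vdash t\ u : B}\;\textsc{App}
```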

Note that the application rule now splits the context, and the variable rule now disallows any additional judgements to be in the context. To recover our original type system, we must add some structural rules that make our context mimic a set:

The first rule, exchange, simply tells us that the order of judgements in the list doesn’t matter. The second rule, contraction, lets us use an assumption in multiple places in the expression — in other words, it lets us use a variable more than once. Without this rule, we would have an affine type system. Note that affine type systems naturally disallow aliasing. The final rule, weakening, lets us throw away an assumption without using it. Without this rule and the contraction rule, our type system is linear — all variables must be used exactly once.
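In standard notation, the three structural rules are:

```latex
\frac{\Gamma, x : A, y : B, \Delta \vdash t : C}
     {\Gamma, y : B, x : A, \Delta \vdash t : C}\;\textsc{Exchange}
\qquad
\frac{\Gamma, x : A, y : A \vdash t : C}
     {\Gamma, z : A \vdash t[z/x, z/y] : C}\;\textsc{Contraction}
\qquad
\frac{\Gamma \vdash t : C}
     {\Gamma, x : A \vdash t : C}\;\textsc{Weakening}
```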

Linear type systems have a number of interesting properties, discussed in Wadler [1990], but they pose a problem for our term representation above: the number of available variables is no longer syntax-directed! Instead, it is determined by a set of unruly structural rules. It seems as though the advantages of our de Bruijn representation are now unattainable; however, some niceness is retained by sticking with it — if we separate the notion of a variable that is in scope but unavailable (i.e. used elsewhere in the expression) from a variable that is simply out of scope, we can keep the term structure as-is and use a partial environment as our context:

The empty environment is more accurately a family of empty environments — environments where all in scope variables are unavailable:
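One way to realise a partial environment is to mark each in-scope variable as either available (`just τ`) or unavailable (`nothing`). The names `Env` and `ε` below are my own choices:

```agda
open import Data.Nat using (ℕ; zero; suc)
open import Data.Maybe using (Maybe; just; nothing)
open import Data.Vec using (Vec; []; _∷_)

Env : ℕ → Set
Env n = Vec (Maybe Type) n   -- nothing: in scope, but unavailable

ε : ∀ n → Env n              -- all n variables unavailable
ε zero    = []
ε (suc n) = nothing ∷ ε n
```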

And the singleton environment, used in the Var rule, is defined similarly:
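A sketch, assuming the `Env` and `ε` above (the mixfix name `[_↦_]` is illustrative):

```agda
-- exactly one available variable, at position i
[_↦_] : ∀ {n} → Fin n → Type → Env n
[_↦_] {suc n} zero    τ = just τ ∷ ε n
[_↦_] {suc n} (suc i) τ = nothing ∷ [ i ↦ τ ]
```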

Then, the rules can be straightforwardly defined:
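A plausible rendering of the linear rules, assuming the definitions sketched above and the context-splitting relation `_≐_⊕_` described next:

```agda
data _⊢_∶_ : ∀ {n} → Env n → Term n → Type → Set where
  tvar : ∀ {n} {i : Fin n} {τ} →
         [ i ↦ τ ] ⊢ var i ∶ τ             -- the context is exactly a singleton
  tlam : ∀ {n} {Γ : Env n} {τ₁ τ₂ t} →
         (just τ₁ ∷ Γ) ⊢ t ∶ τ₂ →           -- the new variable starts out available
         Γ ⊢ lam t ∶ (τ₁ ⇒ τ₂)
  tapp : ∀ {n} {Γ Γ₁ Γ₂ : Env n} {τ₁ τ₂ t u} →
         Γ ≐ Γ₁ ⊕ Γ₂ →                      -- split the available assignments
         Γ₁ ⊢ t ∶ (τ₁ ⇒ τ₂) → Γ₂ ⊢ u ∶ τ₁ →
         Γ ⊢ app t u ∶ τ₂
```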

Note that the application rule now explicitly splits the context, according to the following relation:
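One way to define the split (names hypothetical): each available assignment goes to exactly one side, while unavailable positions remain unavailable on both:

```agda
data _≐_⊕_ : ∀ {n} → Env n → Env n → Env n → Set where
  nil   : [] ≐ [] ⊕ []
  left  : ∀ {n τ} {Γ Γ₁ Γ₂ : Env n} → Γ ≐ Γ₁ ⊕ Γ₂ →
          (just τ ∷ Γ) ≐ (just τ ∷ Γ₁) ⊕ (nothing ∷ Γ₂)
  right : ∀ {n τ} {Γ Γ₁ Γ₂ : Env n} → Γ ≐ Γ₁ ⊕ Γ₂ →
          (just τ ∷ Γ) ≐ (nothing ∷ Γ₁) ⊕ (just τ ∷ Γ₂)
  skip  : ∀ {n} {Γ Γ₁ Γ₂ : Env n} → Γ ≐ Γ₁ ⊕ Γ₂ →
          (nothing ∷ Γ) ≐ (nothing ∷ Γ₁) ⊕ (nothing ∷ Γ₂)
```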

This way, instead of expressing a split environment as a composition of two other environments, we represent context splitting as an explicit proof object, where the prover chooses which judgement goes to which expression.

We don’t need to encode the one remaining structural rule (exchange), as the order of the context has already been canonicalised by the de Bruijn indices.

Now, suppose we wanted to express the normal simply-typed lambda calculus in this style. How would we express the additional structural rules? The contraction rule can be embedded within the context-splitting relation:
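For example, by adding a constructor to the split relation that sends an available assignment to both sides (a sketch):

```agda
  -- contraction, folded into the split: use the assignment on both sides
  both : ∀ {n τ} {Γ Γ₁ Γ₂ : Env n} → Γ ≐ Γ₁ ⊕ Γ₂ →
         (just τ ∷ Γ) ≐ (just τ ∷ Γ₁) ⊕ (just τ ∷ Γ₂)
```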

This works because the only useful place to apply contraction is immediately before a context split. One might be tempted to try the same thing for weakening:
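That is, a constructor that discards an assignment during a split, something like this (a sketch of the tempting, but inadequate, rule):

```agda
  -- weakening folded into the split: discard the assignment on both sides
  drop : ∀ {n τ} {Γ Γ₁ Γ₂ : Env n} → Γ ≐ Γ₁ ⊕ Γ₂ →
         (just τ ∷ Γ) ≐ (nothing ∷ Γ₁) ⊕ (nothing ∷ Γ₂)
```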

This version, however, makes some typing obligations unprovable. Instead, we should confine weakening to variable occurrences, just as we confine contraction to context splitting — another slight alteration must be made to the typing rules. The split relation remains the same:

The typing rules differ, however, in that they now allow weakening, by relaxing the rule for variables:
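Concretely, the variable rule can require only that the variable be available somewhere in the context, using the standard library’s `_[_]=_` relation on vectors (a sketch):

```agda
  -- relaxed variable rule: i must be available in Γ; any other
  -- available assignments in Γ are silently discarded (weakening)
  tvar : ∀ {n} {Γ : Env n} {i : Fin n} {τ} →
         Γ [ i ]= just τ →
         Γ ⊢ var i ∶ τ
```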

Now, getting back to the split relation we defined earlier: it has a number of interesting properties. For starters, if one subcontext is ε, then the relation is simply equality:
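The statement might look like this (names hypothetical; the proof is a straightforward induction on the split derivation):

```agda
open import Relation.Binary.PropositionalEquality using (_≡_)

split-ε : ∀ {n} {Γ Γ₁ : Env n} →
          Γ ≐ Γ₁ ⊕ ε n → Γ ≡ Γ₁
```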

The relation is also symmetric in the two subcontexts:
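As a statement (a sketch; proved by induction on the derivation):

```agda
split-sym : ∀ {n} {Γ Γ₁ Γ₂ : Env n} →
            Γ ≐ Γ₁ ⊕ Γ₂ → Γ ≐ Γ₂ ⊕ Γ₁
```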

The relation has a number of properties that look like transitivity. One is the following:
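One plausible formulation, producing the intermediate context as an existential (a sketch):

```agda
open import Data.Product using (∃; _×_)

split-rotate : ∀ {n} {Γ Γ₁ Γ₂ Γ₁₁ Γ₁₂ : Env n} →
               Γ ≐ Γ₁ ⊕ Γ₂ → Γ₁ ≐ Γ₁₁ ⊕ Γ₁₂ →
               ∃ λ Δ → (Γ ≐ Γ₁₁ ⊕ Δ) × (Δ ≐ Γ₁₂ ⊕ Γ₂)
```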

This theorem is captured by the diagram below:

That is, if a context is split, and then split again, then you could split the context from the other direction and still end up with the same results.

An alternative, rightward-skewed version of the lemma exists as well, as a direct consequence of split-sym:

Another transitivity-ish property works simultaneously over both subcontexts:
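A plausible statement, with both regrouped subcontexts existentially quantified (a sketch):

```agda
open import Data.Product using (∃₂; _×_)

split-par : ∀ {n} {Γ Γ₁ Γ₂ Γ₁₁ Γ₁₂ Γ₂₁ Γ₂₂ : Env n} →
            Γ ≐ Γ₁ ⊕ Γ₂ → Γ₁ ≐ Γ₁₁ ⊕ Γ₁₂ → Γ₂ ≐ Γ₂₁ ⊕ Γ₂₂ →
            ∃₂ λ Δ₁ Δ₂ → (Γ ≐ Δ₁ ⊕ Δ₂) × (Δ₁ ≐ Γ₁₁ ⊕ Γ₂₁) × (Δ₂ ≐ Γ₁₂ ⊕ Γ₂₂)
```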

This theorem can be visualised by the following diagram, which requires an additional dimension to adequately convey the intuition:

In other words, if a context is split and then both subcontexts are split again, the original context can instead be split into a different pair of subcontexts which, when split again, yield the same four subsubcontexts.

I find this relation rather curious. If anyone has seen a relation with this structure before, or knows an algebraic abstraction that applies to it, I’d be very interested to hear your thoughts.

My preliminary work with formalisations in this style suggests that using this relation to specify a type system for a language that includes linear types is quite comfortable, in both Isabelle and Agda, although in Agda the proofs are quite a bit more tedious. If anyone knows of other approaches to this problem, though, I would definitely be interested to see them.

References