Recently my research has been centered around the development of a self-certifying compiler for a functional language with linear types called Cogent (see O’Connor et al. [2016]). The compiler works by emitting, along with generated low-level code, a proof in Isabelle/HOL (see Nipkow et al. [2002]) that the generated code is a refinement of the original program, expressed via a simple functional semantics in HOL.

As dependent types unify for us the language of code and proof, my current endeavour has been to explore how such a compiler would look if it were implemented and verified in a dependently typed programming language instead. In this post, I implement and verify a toy compiler for a language of arithmetic expressions and variables to an idealised assembly language for a virtual stack machine, and explain some of the useful features that dependent types give us for writing verified compilers.

The Agda snippets in this post are interactive! Click on a symbol to see its definition.

Wellformedness

One of the immediate advantages that dependent types give us is that we can encode the notion of term wellformedness in the type given to terms, rather than as a separate proposition that must be assumed by every theorem.

Even in our language of arithmetic expressions and variables, which does not have much of a static semantics, we can still ensure that each variable used in the program is bound somewhere. We will use indices instead of variable names in the style of de Bruijn [1972], and index terms by the number of available variables, a trick I first noticed in McBride [2003]. The Fin type, used to represent variables, only contains natural numbers up to its index, which makes it impossible to use variables that are not available.
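As a concrete sketch, the terms might be encoded as follows. The constructor names here are my own invention, not necessarily those of the original development:

```agda
open import Data.Nat using (ℕ; suc)
open import Data.Fin using (Fin)

-- Terms indexed by the number of variables in scope.
-- A Var can only refer to one of the n available variables,
-- because Fin n contains exactly the numbers 0 .. n-1.
data Term (n : ℕ) : Set where
  Lit   : ℕ → Term n                      -- numeric literal
  Var   : Fin n → Term n                  -- de Bruijn index, in scope by construction
  Plus  : Term n → Term n → Term n        -- addition
  Times : Term n → Term n → Term n        -- multiplication
  Let   : Term n → Term (suc n) → Term n  -- bind the first term's value in the second
```

Note how Let is the only constructor that changes the index: its body is checked with one extra variable in scope.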

This allows us to express in the type of our big-step semantics relation that the environment E (here we use the length-indexed Vec type from the Agda standard library) should have a value for every available variable in the term. In an Isabelle specification of the same semantics, we would have to add such length constraints as explicit assumptions, either in the semantics themselves or in theorems about them. In Agda, the dynamic semantics are extremely clean, unencumbered by irritating details of the encoding:
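A sketch of such a relation, assuming the Term encoding described above (the rule names are hypothetical, and lookup E i follows the newer standard-library argument order):

```agda
open import Data.Nat using (ℕ; _+_; _*_)
open import Data.Vec using (Vec; _∷_; lookup)

-- Big-step evaluation: in environment E, term t evaluates to value v.
-- The length index of E guarantees a value for every variable in scope.
data _⊢_⇓_ : ∀ {n} → Vec ℕ n → Term n → ℕ → Set where
  ⇓Lit   : ∀ {n m} {E : Vec ℕ n} → E ⊢ Lit m ⇓ m
  ⇓Var   : ∀ {n i} {E : Vec ℕ n} → E ⊢ Var i ⇓ lookup E i
  ⇓Plus  : ∀ {n t₁ t₂ v₁ v₂} {E : Vec ℕ n}
         → E ⊢ t₁ ⇓ v₁ → E ⊢ t₂ ⇓ v₂ → E ⊢ Plus t₁ t₂ ⇓ v₁ + v₂
  ⇓Times : ∀ {n t₁ t₂ v₁ v₂} {E : Vec ℕ n}
         → E ⊢ t₁ ⇓ v₁ → E ⊢ t₂ ⇓ v₂ → E ⊢ Times t₁ t₂ ⇓ v₁ * v₂
  ⇓Let   : ∀ {n t₁ t₂ v₁ v₂} {E : Vec ℕ n}
         → E ⊢ t₁ ⇓ v₁ → (v₁ ∷ E) ⊢ t₂ ⇓ v₂ → E ⊢ Let t₁ t₂ ⇓ v₂
```

The ⇓Var rule can simply look the variable up, with no possibility of failure, and ⇓Let evaluates the body in the extended environment v₁ ∷ E.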

By using appropriate type indices, it is possible to extend this technique to work even for languages with elaborate static semantics. For example, linear type systems (see Walker [2005]) can be encoded by indexing terms by type contexts (in a style similar to Oleg). Therefore, the boundary between being wellformed and being well-typed is entirely arbitrary. It’s possible to use relatively simple terms and encode static semantics as a separate judgement, or to put the entire static semantics inside the term structure, or to use a mixture of both. In this simple example, our static semantics only ensure variables are in scope, so it makes sense to encode the entire static semantics in the terms themselves.

Similar tricks can be employed when encoding our target language, the stack machine. This machine consists of two stacks of numbers, the working stack and the storage stack, and a program to evaluate. A program is a list of instructions.

There are six instructions in total, each of which manipulates these two stacks in various ways. When encoding these instructions in Agda, we index the Inst type by the size of both stacks before and after execution of the instruction:
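One possible indexing scheme is sketched below; the constructor names and the exact arrangement of the four indices are my own choices:

```agda
open import Data.Nat using (ℕ; suc)
open import Data.Fin using (Fin)

-- Inst w s w′ s′ : an instruction that runs with a working stack of size w
-- and a storage stack of size s, leaving stacks of sizes w′ and s′.
data Inst : ℕ → ℕ → ℕ → ℕ → Set where
  Push  : ∀ {w s} → ℕ → Inst w s (suc w) s          -- push a literal
  Add   : ∀ {w s} → Inst (suc (suc w)) s (suc w) s  -- needs two operands
  Mul   : ∀ {w s} → Inst (suc (suc w)) s (suc w) s
  Store : ∀ {w s} → Inst (suc w) s w (suc s)        -- working → storage
  Load  : ∀ {w s} → Fin s → Inst w s (suc w) s      -- copy from storage
  Pop   : ∀ {w s} → Inst w (suc s) w s              -- discard from storage
```

Add and Mul can only be typed against a working stack with at least two elements, so underflow is unrepresentable.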

Then, we can define a simple type for programs, essentially a list of instructions where the stack sizes of consecutive instructions must match. This makes it impossible to construct a program with an underflow error:
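Such a program type might be sketched as a cons-list whose indices chain together (again, the naming is my own):

```agda
-- A program is a typed list of instructions: the output stack sizes of
-- each instruction must match the input stack sizes of the next.
data Program : ℕ → ℕ → ℕ → ℕ → Set where
  []  : ∀ {w s} → Program w s w s
  _∷_ : ∀ {w s w′ s′ w″ s″}
      → Inst w s w′ s′ → Program w′ s′ w″ s″ → Program w s w″ s″
```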

We also define a simple sequential composition operator, equivalent to list append ( ++ ):
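Under the indexed Program type just described, the composition operator is structurally identical to list append; only its type carries the extra stack-size bookkeeping (the name _++ᵖ_ is my own):

```agda
-- Sequential composition: append two programs whose stack sizes line up.
_++ᵖ_ : ∀ {w s w′ s′ w″ s″}
      → Program w s w′ s′ → Program w′ s′ w″ s″ → Program w s w″ s″
[]      ++ᵖ q = q
(i ∷ p) ++ᵖ q = i ∷ (p ++ᵖ q)
```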

The semantics of each instruction are given by the following relation, which takes the two stacks and an instruction as input, returning the two updated stacks as output. Note that the size of each stack is prescribed by the type of the instruction, just as the size of the environment was prescribed by the type of the term in the source language, which eliminates the need to add tedious wellformedness assumptions to theorems or rules.

The semantics of each instruction are as follows:

push n (where n ∈ ℕ), pushes n to the working stack.

add, pops two numbers from the working stack and pushes their sum back to it.

mul, pops two numbers from the working stack and pushes their product back to it.

store, pops a number from the working stack and pushes it to the storage stack.

load i (where i is less than the size of the storage stack), pushes the number at position i from the top of the storage stack onto the working stack.

pop, removes the top number from the storage stack.
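With the stacks represented as length-indexed vectors, these rules can be sketched as a relation like the following (all names are hypothetical, and lookup follows the newer standard-library argument order):

```agda
open import Data.Nat using (ℕ; _+_; _*_)
open import Data.Fin using (Fin)
open import Data.Vec using (Vec; _∷_; lookup)

-- Step W S i W′ S′ : instruction i transforms working stack W and
-- storage stack S into W′ and S′.
data Step : ∀ {w s w′ s′}
          → Vec ℕ w → Vec ℕ s → Inst w s w′ s′ → Vec ℕ w′ → Vec ℕ s′ → Set where
  ↦Push  : ∀ {w s n} {W : Vec ℕ w} {S : Vec ℕ s}
         → Step W S (Push n) (n ∷ W) S
  ↦Add   : ∀ {w s x y} {W : Vec ℕ w} {S : Vec ℕ s}
         → Step (x ∷ y ∷ W) S Add (y + x ∷ W) S
  ↦Mul   : ∀ {w s x y} {W : Vec ℕ w} {S : Vec ℕ s}
         → Step (x ∷ y ∷ W) S Mul (y * x ∷ W) S
  ↦Store : ∀ {w s x} {W : Vec ℕ w} {S : Vec ℕ s}
         → Step (x ∷ W) S Store W (x ∷ S)
  ↦Load  : ∀ {w s} {W : Vec ℕ w} {S : Vec ℕ s} {i : Fin s}
         → Step W S (Load i) (lookup S i ∷ W) S
  ↦Pop   : ∀ {w s x} {W : Vec ℕ w} {S : Vec ℕ s}
         → Step W (x ∷ S) Pop W S
```

The vector patterns in each rule are well-typed precisely because the instruction's type already fixes the stack sizes.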

As programs are lists of instructions, the evaluation of programs is naturally specified as a list of evaluations of instructions:
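A sketch of this list-of-steps relation, assuming the single-instruction Step relation described above (names hypothetical):

```agda
-- Evaluation of a whole program: a chain of instruction steps,
-- shaped exactly like the program it evaluates.
data Steps : ∀ {w s w′ s′}
           → Vec ℕ w → Vec ℕ s → Program w s w′ s′ → Vec ℕ w′ → Vec ℕ s′ → Set where
  done : ∀ {w s} {W : Vec ℕ w} {S : Vec ℕ s}
       → Steps W S [] W S
  step : ∀ {w s w′ s′ w″ s″}
           {W : Vec ℕ w} {S : Vec ℕ s} {W′ : Vec ℕ w′} {S′ : Vec ℕ s′}
           {W″ : Vec ℕ w″} {S″ : Vec ℕ s″}
           {i : Inst w s w′ s′} {p : Program w′ s′ w″ s″}
       → Step W S i W′ S′ → Steps W′ S′ p W″ S″ → Steps W S (i ∷ p) W″ S″
```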

The semantics of sequential composition are predictably given by appending these lists:
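Assuming the Steps relation and program append described above, the lemma and its proof mirror ordinary list append (the name steps-++ is my own):

```agda
-- Running p ++ᵖ q is running p and then q: append the two step chains.
steps-++ : ∀ {w s w′ s′ w″ s″}
             {W : Vec ℕ w} {S : Vec ℕ s} {W′ : Vec ℕ w′} {S′ : Vec ℕ s′}
             {W″ : Vec ℕ w″} {S″ : Vec ℕ s″}
             {p : Program w s w′ s′} {q : Program w′ s′ w″ s″}
         → Steps W S p W′ S′ → Steps W′ S′ q W″ S″ → Steps W S (p ++ᵖ q) W″ S″
steps-++ done        ys = ys
steps-++ (step x xs) ys = step x (steps-++ xs ys)
```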

Writing by proving

Having formally defined our source and target languages, we can now prove our compiler correct — even though we haven’t written a compiler yet!

One of the other significant advantages dependent types bring to compiler verification is the elimination of repetition. In my larger Isabelle formalisation, the proof of the compiler's correctness largely duplicates the structure of the compiler itself, and this tight coupling means that proofs must be rewritten along with the program — a highly tedious exercise. As dependently typed languages unify the language of code and proof, we can merely provide the correctness proof: in almost all cases, the correctness proof is so specific that the program of which it demonstrates correctness can be derived automatically.

We define a compiler’s correctness to be the commutativity of the following diagram, as per Hutton and Wright [2004].

As we have not proven determinism for our semantics, such a correctness condition must be shown by the conjunction of a soundness and a completeness condition, similar to Bahr [2015].

Soundness is a proof that the compiler output is a refinement of the input, that is, every evaluation in the output is matched by the input. The output does not do anything that the input doesn’t do.

Note that we generalise the evaluation statements used here slightly to use arbitrary environments and stacks. This is to allow our induction to proceed smoothly.
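With those generalisations, the soundness statement might be sketched as follows, assuming the evaluation relations described in the preceding sections (all names hypothetical; note that compiled programs are taken to be polymorphic in the working-stack size, with the environment doubling as the storage stack):

```agda
-- Soundness (refinement): any run of the program is matched by the term.
-- W and E are fully general, to let induction go through.
Sound : ∀ {n} → Term n → (∀ {w} → Program w n (suc w) n) → Set
Sound {n} t p =
  ∀ {w} {W : Vec ℕ w} {E : Vec ℕ n} {v}
  → Steps W E p (v ∷ W) E → E ⊢ t ⇓ v
```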

Completeness is a proof that the compiler output is an abstraction of the input, that is, every evaluation in the input is matched by the output. The output does everything that the input does.
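Completeness is the same statement with the direction of the implication reversed (again, all names hypothetical):

```agda
-- Completeness (abstraction): any evaluation of the term is matched
-- by a run of the program.
Complete : ∀ {n} → Term n → (∀ {w} → Program w n (suc w) n) → Set
Complete {n} t p =
  ∀ {w} {W : Vec ℕ w} {E : Vec ℕ n} {v}
  → E ⊢ t ⇓ v → Steps W E p (v ∷ W) E
```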

It is this completeness condition that will allow us to automatically derive our code generator. Given a term t, our generator will return a Σ-type, or dependent pair, containing a program p together with a proof that p is a complete translation of t:
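A possible signature for this generator, under the hypothetical encodings sketched in the earlier sections:

```agda
open import Data.Product using (Σ; _,_)

-- The generator pairs each program with evidence of its completeness:
-- evaluating the term to v means running the program pushes v.
codegen : ∀ {n} (t : Term n)
        → Σ (∀ {w} → Program w n (suc w) n)
            (λ p → ∀ {w} {W : Vec ℕ w} {E : Vec ℕ n} {v}
                 → E ⊢ t ⇓ v → Steps W E p (v ∷ W) E)
```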

For literals, we simply push the number of the literal onto the working stack:
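Assuming a generator codegen of the shape just described, the literal case might read (names hypothetical):

```agda
-- The program component is written as _ : Agda can infer Push m ∷ []
-- from the shape of the completeness proof alone.
codegen (Lit m) = _ , λ { ⇓Lit → step ↦Push done }
```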

The code above never explicitly states what program to produce! Instead, it merely provides the completeness proof, and the rest can be inferred by unification. Similar elision can be used for variables, which pick the correct index from the storage stack:
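The variable case follows the same pattern; here the inferred program would be Load i ∷ [], since ↦Load is the only step that pushes lookup E i:

```agda
codegen (Var i) = _ , λ { ⇓Var → step ↦Load done }
```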

The two binary operations are essentially the standard translation for an infix-to-postfix tree traversal, but once again the program is not explicitly emitted; it is inferred from the completeness proof used.
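The addition case might be sketched as below, with the program written out for clarity (Times is analogous, with Mul and ⇓Times). The proof glues the two sub-proofs together with the append lemma on step chains:

```agda
-- Compile both operands, then the arithmetic instruction.
codegen (Plus t₁ t₂) with codegen t₁ | codegen t₂
... | p₁ , c₁ | p₂ , c₂ =
  (λ {w} → p₁ ++ᵖ (p₂ ++ᵖ (Add ∷ []))) ,
  λ { (⇓Plus d₁ d₂) →
        steps-++ (c₁ d₁) (steps-++ (c₂ d₂) (step ↦Add done)) }
```

Note the stack discipline working silently: after p₁ runs, p₂ is used at working-stack size suc w, and Add consumes exactly the two results.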

The variable-binding form pushes the bound value to the storage stack and pops it off again once evaluation exits the scope.
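A sketch of this case, under the same hypothetical names:

```agda
-- Evaluate the bound term, Store it, run the body against the
-- extended storage stack, then Pop to restore the original scope.
codegen (Let t₁ t₂) with codegen t₁ | codegen t₂
... | p₁ , c₁ | p₂ , c₂ =
  (λ {w} → p₁ ++ᵖ ((Store ∷ []) ++ᵖ (p₂ ++ᵖ (Pop ∷ [])))) ,
  λ { (⇓Let d₁ d₂) →
        steps-++ (c₁ d₁)
          (step ↦Store (steps-++ (c₂ d₂) (step ↦Pop done))) }
```

The Store/Pop bracketing is what keeps the storage stack in lockstep with the environment of the big-step semantics.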

We can extract a more standard-looking code generator function simply by throwing away the proof that our code generator produces.
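Concretely, this is just the first projection of the dependent pair (the name compileTerm is my own):

```agda
open import Data.Product using (proj₁)

-- A plain code generator: keep the program, discard the proof.
compileTerm : ∀ {n w} → Term n → Program w n (suc w) n
compileTerm t = proj₁ (codegen t)
```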

Compiler Frontend

Now that we have a verified code generator, as a final flourish we’ll implement a basic compiler frontend for our language and run it on some basic examples.

We define a surface syntax as follows. In the tradition of all the greatest languages such as BASIC, FORTRAN and COBOL, capital letters are exclusively used, and English words are favoured over symbols because it makes the language readable to non-programmers. I should also acknowledge the definite influence of PHP, Perl and sh on the choice of the $ sigil to precede variable names. The sigil # precedes numeric literals as Agda does not allow us to overload them.
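A hypothetical rendering of this surface syntax as an Agda datatype (the mixfix names are my own guesses at the flavour described):

```agda
open import Data.Nat using (ℕ)
open import Data.String using (String)

data Expr : Set where
  #_         : ℕ → Expr                     -- numeric literal
  $_         : String → Expr                -- variable reference
  _PLUS_     : Expr → Expr → Expr
  _TIMES_    : Expr → Expr → Expr
  LET_BE_IN_ : String → Expr → Expr → Expr  -- variable binding
```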

Unlike our Term AST, this surface syntax does not include any scope information, uses strings for variable names, and is more likely to be something that would be produced from a parser. In order to compile this language, we must first translate it into our wellformed-by-construction Term type, which necessitates scope-checking.

Note that this function is the only one in our development that is partial: it can fail if an undeclared variable is used. For this reason, we use the Applicative instance for Maybe to make the error handling more convenient.
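A sketch of the checker, assuming the hypothetical Expr syntax above and an Applicative instance for Maybe in scope so that idiom brackets ⦇ ⦈ can be used (all names my own):

```agda
open import Data.Vec using (Vec; []; _∷_)
open import Data.Fin using (Fin; zero; suc)
open import Data.Maybe as Maybe using (Maybe; just; nothing)
open import Data.String using (String; _≟_)
open import Relation.Nullary using (yes; no)

-- Find the de Bruijn index of a name in the context of declared variables.
find : ∀ {n} → String → Vec String n → Maybe (Fin n)
find x []      = nothing
find x (y ∷ Γ) with x ≟ y
... | yes _ = just zero
... | no  _ = Maybe.map suc (find x Γ)

-- Scope checking: the only partial phase of the compiler.
check : ∀ {n} → Vec String n → Expr → Maybe (Term n)
check Γ (# m)               = just (Lit m)
check Γ ($ x)               = Maybe.map Var (find x Γ)
check Γ (e₁ PLUS e₂)        = ⦇ Plus (check Γ e₁) (check Γ e₂) ⦈
check Γ (e₁ TIMES e₂)       = ⦇ Times (check Γ e₁) (check Γ e₂) ⦈
check Γ (LET x BE e₁ IN e₂) = ⦇ Let (check Γ e₁) (check (x ∷ Γ) e₂) ⦈
```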

Our compiler function, then, merely composes our checker with our code generator:
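Under the sketched types, the composition might look like this: a closed expression is checked in the empty context and, if that succeeds, compiled to a program taking empty stacks to a single result on the working stack.

```agda
compile : Expr → Maybe (Program 0 0 1 0)
compile e = Maybe.map (λ t → proj₁ (codegen t)) (check [] e)
```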

Note that we can’t really demonstrate correctness of the scope-checking function, save that if it outputs a Term then there are no scope errors in it, as it is impossible to construct a Term with scope errors. One possibility would be to define a semantics for the surface syntax; however, this would necessitate a formalisation of substitution and other such unpleasant things. So, we shall gain assurance for this phase of the compiler by embedding some test cases and checking them automatically at compile time.

If we take a simple example, say:

We expect this program to compile to the following stack machine program:

We can embed this test case as a type by constructing an equality value — that way, the test will be re-run every time it is type-checked:
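With the hypothetical definitions sketched earlier, such a test might look like this (a made-up example, not the one from the post): a proof by refl forces both sides to compute to the same normal form whenever the file is type-checked.

```agda
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

test : compile ((# 1) PLUS (# 2)) ≡ just (Push 1 ∷ Push 2 ∷ Add ∷ [])
test = refl
```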

As this page is only generated when the Agda compiler type checks the code snippets, we know that this test has passed! Hooray!

Conclusions

Working in Agda to verify compilers is a very different experience from implementing a certifying compiler in Haskell and Isabelle. In general, the implementation of a compiler phase and the justification of its correctness are much, much closer together in Agda than in my previous approach. This allows us to save a lot of effort by deriving programs from their proofs.

Also, dependent types are sophisticated enough to allow arbitrary invariants to be encoded in the structure of terms, which makes it possible, with clever formalisations, to avoid having to discharge trivial proof obligations repeatedly. This is in stark contrast to traditional theorem provers like Isabelle, where irritating proof obligations are the norm, and heavyweight tactics must be used to discharge them en masse.

My next experiments will be to try and scale this kind of approach up to more realistic languages. I’ll be sure to post again if I find anything interesting.

References