About induction on the Calculus of Constructions

This post is meant to be a summary of my current understanding of this matter, and to present some naive ideas that are probably nonsense.

1. There is no induction on CoC

It has been known for a while that it is impossible to derive induction on the Calculus of Constructions (CoC). That is, there is no term of type:

∀ (P : Nat -> Set) ->
∀ (S : (n : Nat) -> P n -> P (Succ n)) ->
∀ (Z : P Zero) ->
∀ (n : Nat) ->
P n

(I’m using Morte's Church-style syntax because it is familiar.)

In other words, no matter what “smart” encoding of Nat , Succ , Zero you develop, you’ll never have a term of that type. Due to the importance of inductive reasoning for math, that (among other things I do not understand) resulted in the creation of the Calculus of Inductive Constructions, an extension to the otherwise simple CoC which gave it native inductive datatypes, and is the foundation of Coq. That extension was hugely complex, which made some people angry (spoiler: me) because their brain wasn’t big enough for them to understand and implement all of it themselves. Those people kept looking for simpler languages that were equally expressive.
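To make the encoding concrete, here is a sketch of Church-encoded naturals written as plain untyped lambdas (Python standing in for the erased terms; helper names like `to_int` are mine, not part of any of the systems discussed). The non-dependent fold below is exactly what plain CoC supports; what it cannot type is the dependent version above, where the motive `P` mentions the number itself.

```python
# Church-encoded naturals: a number is its own fold.
zero = lambda s: lambda z: z
succ = lambda n: lambda s: lambda z: s(n(s)(z))

def to_int(n):
    # The non-dependent elimination CoC *does* give us:
    # replace Succ by a function, Zero by a value.
    return n(lambda k: k + 1)(0)

three = succ(succ(succ(zero)))
print(to_int(three))  # 3
```

Everything here runs fine, because computation was never the problem; the missing piece is purely at the type level.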

2. Self Types

One of the most interesting attempts at this comes from Aaron Stump, who, in 2014, presented Self Types, a small extension to CoC which does the trick. The idea is to extend CoC with just one construct, ι x. T , which allows the type T to refer to the value x it types. With that, induction can be expressed as:

Nat : Set
Nat = ιn. (P : Nat -> Set) ->
      ((n : Nat) -> P n -> P (Succ n)) ->
      (P Zero) ->
      P n

Succ : Nat -> Nat
Succ = λn -> λP -> λS -> λZ -> S n (n P S Z)

Zero : Nat
Zero = λP -> λS -> λZ -> Z

ind : (P : Nat -> Set) ->
      ((n : Nat) -> P n -> P (Succ n)) ->
      (P Zero) ->
      (n : Nat) ->
      P n
ind = λP -> λS -> λZ -> λn -> n P S Z

(Note that I used Curry-style rather than Church-style here, because that’s how Aaron’s proposal is presented; it isn’t clear how to present Self types in Church style. Also, for simplicity, I ignored distinctions regarding explicit/implicit products.)

The core of the idea is simple: by allowing a type to refer to the value it types, we get that, inside the definition of ind , applying n to P S Z returns P n , as expected, simply because the type Nat specifies so. Without self types, there was no way to express such a Nat type, for the simple reason that there was no n in scope for the type to return P n ; that, in simple terms, is essentially why we can’t have induction in CoC.

3. Dependent Intersections

Sadly, the implementation above requires mutual recursion, as the type Nat needs access to its constructors Succ / Zero , and vice-versa. That, as far as I understand, made it hard to provide a semantics for self types. Eventually, Aaron found that dependent intersections, a previously existing construct, actually generalize self types, allowing one to prove induction in a very similar fashion.

The idea is that we must, first, implement Nat in two slightly different ways, one “simple” ( CNat ) and one “inductive” ( INat ). The latter view refers to the former, so there is no mutual recursion involved. Plus, since INat is a predicate on CNat , it has an n in scope, so it can return P n . Both implementations are then merged together into a final Nat type using dependent intersections, and induction is derived for that type. Here is more or less how it goes:

-- Simple Nats
CNat : Set
CNat = ∀ (P : *) -> ∀ (S : P -> P) -> ∀ (Z : P) -> P

CSucc : CNat -> CNat
CSucc = λn -> λP -> λS -> λZ -> S (n P S Z)

CZero : CNat
CZero = λP -> λS -> λZ -> Z

-- Inductive Nats
INat : CNat -> Set
INat = λn -> (P : CNat -> *) ->
       ((n : CNat) -> P n -> P (CSucc n)) ->
       (P CZero) ->
       P n

ISucc : (n : CNat) -> INat n -> INat (CSucc n)
ISucc = λn -> λi -> λP -> λS -> λZ -> S n (i P S Z)

IZero : INat CZero
IZero = λP -> λS -> λZ -> Z

-- The actual Nat type is the intersection of those two
Nat : Set
Nat = ι x : CNat. INat x

Zero : Nat
Zero = [CZero, IZero]

Succ : Nat -> Nat
Succ = λn. [CSucc n.1, ISucc n.1 n.2]

ind : (P : Nat -> Set) ->
      ((n : Nat) -> P n -> P (Succ n)) ->
      (P Zero) ->
      (n : Nat) ->
      P n
ind = …

The proof of ind is a little more involved, so I’ve omitted it here. It is fairly clean, though, merely requiring you to access both views of the dependent intersection at the right moments and prove they are equivalent (the language also requires equality primitives and implicit products, so that both views can be considered equal up to erasure). It erases to the same proof we had with Self types (identity!), which is great.
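As a rough untyped sketch of that “equal up to erasure” point (plain Python lambdas standing in for erased terms; the names are mine), if we assume the CNat index and the n passed to S live in implicit products, as in Cedille, then stripping types and implicit arguments from CSucc and ISucc leaves literally the same term, and the erased elimination is just the fold:

```python
# Erased CSucc: the type argument P is gone.
csucc = lambda n: lambda s: lambda z: s(n(s)(z))
# Erased ISucc: the CNat index and the n passed to S are implicit,
# so they vanish too, leaving the same term as csucc (up to renaming).
isucc = lambda i: lambda s: lambda z: s(i(s)(z))
czero = lambda s: lambda z: z

# Erased elimination: just apply the number to the case functions.
elim = lambda s: lambda z: lambda n: n(s)(z)

two = csucc(csucc(czero))
print(elim(lambda k: k + 1)(0)(two))  # 2
```

Since both views erase to the same untyped term, the intersection adds typing information without adding any runtime content, which is the whole point.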

Such a language is interesting not only because of induction. It also allows us to encode O(1) eliminations, something λ-encodings used to lack. It can even feature crazily advanced things such as “insane dependent types”, which, if I understand correctly, allow the arguments of multi-argument dependent functions to depend on each other indiscriminately (not only on previous ones), and are very powerful.
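For intuition on why O(1) elimination matters, here is an untyped sketch (again Python lambdas; this illustrates the general idea, not Cedille’s actual mechanism) contrasting Scott-style naturals, where pattern matching is a single application, with the Church encoding, where recovering a predecessor requires a full fold:

```python
# Scott-encoded naturals: a number is its own pattern-match.
szero = lambda s: lambda z: z
ssucc = lambda n: lambda s: lambda z: s(n)

# Predecessor is one application: O(1).
# With Church naturals, the same operation is an O(n) fold.
pred = lambda n: n(lambda p: p)(szero)

def to_int(n):
    # Scott naturals don't fold themselves, so we recurse explicitly.
    return n(lambda p: 1 + to_int(p))(0)

three = ssucc(ssucc(ssucc(szero)))
print(to_int(pred(three)))  # 2
```

The catch is that typing Scott-like encodings (and deriving induction for them) is exactly the kind of thing plain CoC can’t do and Cedille’s extensions can.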

Almost sounds too good to be true, but Aaron, amazingly, managed to develop a semantics for it. This is, thus, the simplest known consistent proof assistant capable of deriving induction. It is the foundation of Cedille, which recently released its version 1.0. Sadly, this approach is a little more complex on both the implementation and the programming side (since every datatype must now be replicated in essentially 3 slightly different ways). Fortunately, syntax sugar for inductive datatypes and eliminations can mask that, and, compared to previous attempts, Cedille is astonishingly simple.

4. ?????

As I wait for development on Cedilleum, a version of Cedille adjusted for Ethereum, I’ve been playing with those ideas and trying to get a better grasp on them. Something I recently noticed is that, assuming mutual recursion, if we slightly alter CoC in such a way that, in (X : TYPE) -> BODY , X is bound in TYPE , then we can accomplish what self types do without any further addition. Here is an example (now, again, in Church style):

Nat =
  λ (n : Nat n)
  ∀ (P : (n : Nat n) -> Set)
  ∀ (S : (n : Nat n) -> P n -> P (Succ n))
  ∀ (Z : P Zero)
  P n

Zero =
  λ (P : (n : Nat n) -> Set)
  λ (S : (n : Nat n) -> P n -> P (Succ n))
  λ (Z : P Zero)
  Z

Succ =
  λ (n : Nat n)
  λ (P : (n : Nat n) -> Set)
  λ (S : (n : Nat n) -> P n -> P (Succ n))
  λ (Z : P Zero)
  S n (n P S Z)

ind =
  λ (P : (n : Nat n) -> Set)
  λ (S : (n : Nat n) -> P n -> P (Succ n))
  λ (Z : P Zero)
  λ (n : Nat n)
  n P S Z

Here, Nat is a predicate on itself ( Nat : Nat -> Set ). This allows its type to return P n , replicating what self types do. Roughly, it can be seen as a dependent intersection compressed to avoid code replication. Then, each number gets a unique type; for example, Zero : Nat Zero , Succ Zero : Nat (Succ Zero) , and so on. But that is fine, because we can accept all of them in a λ (n : Nat n) -> ... function, so those infinitely many types behave as one. Finally, induction becomes trivial, just as with self types.

I kinda like this idea because it is similar to Self types but with no addition at all, just a different take on type-checking. I’ve implemented it and the changes are pretty natural and straightforward, about 3 lines of code, I’d say. The whole thing works amazingly well and cleanly. Of course, mutual recursion itself is not justified at all. I’m not even close to having the expertise necessary to give a semantics to that, and it’d probably run into the same problems Aaron had with self types (and he is personally sceptical that it’d be possible). But it is a thing and it type-checks, so I wanted to share it anyway.