April 26, 2014 — Evan Jenkins

You can #timestamp it, folks. I’ve seen the future, and it’s dependently typed. We won’t merely teach the homeless to code. Rather, me and you and everyone we know will have a whole new framework in which to think, learn, and interact with the world. When the day comes to start running software on our brains, we’ll only need to upload one program: a type checker.

I know what you’re thinking: he’s totally lost it. I was like you once. My eyes were closed to the endless possibilities of dependent types. But don’t worry, I’m here to awaken you.

All right, what you’re actually thinking, assuming you’re still reading this, is “what the heck are dependent types, and why is this man getting so unreasonably worked up about them?” I will try to contain my mælstrom of emotion long enough to answer. I’m going to be incredibly vague and probably wrong. Deal with it.

Programming languages can be roughly divided into two classes: imperative languages and functional languages. In an imperative language, you tell the computer what sequence of steps it should take to solve a given problem. In a functional language, you describe the problem to the computer, and it solves it for you. (I did say this was going to be incredibly vague and probably wrong!) Now, you might think that the second way is the way to go because the first way requires you to do the hard work of figuring out how to solve the problem. Let’s have the computer do that, right?

The problem is that like your in-laws, computers are hard to talk to. The languages they speak tend to have limited expressivity. Describing how to solve a problem step-by-step doesn’t require expressing very much (do this, then do that, carry the 2, mumble grumble), the questions we want the computer to help us answer, if they’re at all interesting, tend to be much grander in scope: we might be trying to answer questions that are out in the real world. As scientists (and even as regular people), we greatly simplify the real world with models, but even describing your model to a computer can be hard because everything needs to be made very precise. And while it is unquestionably a Good Thing™ to make your models precise, doing so in the context of a computer language can be a tortuous process: it’s easy to get lost in the process of translation and not realize when we’ve made a conceptual error.

This is the fundamental flaw with most programming languages, particularly the functional kind. The process of honing a model should be edifying, but instead we are forced to translate the beautiful poetry that pours out of our brains into a grunty computer tongue. Yes, many people have gotten very good at this process! But I think they are good at it in spite of the tools they’re using. What we really need is a programming language with a word for X, where X is a concept relevant to the problem at hand. We need a programming language that will chide us not just when we’ve made a grammatical error but also when we’ve said something whose meaning is not quite right, like “vaccines cause autism.” In other words, we want the computer to understand not just our syntax, but our semantics.

Now I need to tell you about types. In procedural languages, we (typically) have the notion of a data type, which is just a label we attach to pieces of data that say what type of data they are. So we might attach the label int (short for integer) to the data 8 , or the label string to the data "Hello, world!" . If you’re lucky, your procedural language will care enough about types to tell you when you’ve done something silly with them: it’ll OK adding 8 and 4 , but not adding 8 and "Hello, world!" But it will also OK dividing 8 by 0 . Note that I am talking here about compile-time behavior. When you actually run your program that divides 8 by 0 , the universe will implode. (As we all know, that’s what happens when you try to divide by zero.)

If you’re a smart cookie, you might try to prevent the cessation of existence itself by writing your own dividing function that checks first to see if you’re trying to divide by zero. Here’s some pseudo-code:

float myDivide(float a, float b) { if (b == 0) return ???; else return a / b; }

The function, myDivide , takes two floating-point numbers a and b , and returns their quotient, which is another floating point number. We want to check first that b isn’t zero before dividing them. The problem here is that we don’t know what the function should output if b does happen to be zero. The function has to output some floating point number! Now, if we’re lazy, we might just pick some number to output in this case, like 0 . After all, we’re just trying to prevent the universe imploding here: we don’t actually plan on dividing by zero and then trying to use the number we get out of it! The problem, of course, is that we might not realize we’re trying to divide by zero. If we are, then something’s definitely wrong, and we want to know about it!

At the heart of the matter is the fact that the mathematical operation we’re trying to implement, division, doesn’t just take two numbers a and b and output a third. We have to be precise, remember? It takes a number a and a non-zero number b , and outputs a third number. So what we can do is create a new type, nonzerofloat, to represent any floating point number that isn’t 0. Then, we can just define our function as follows:

float myDivide(float a, nonzerofloat b) { return a / nzfloattofloat(b); }

Here, nzfloattofloat is an auxiliary function that takes as input a nonzerofloat and outputs that number as a float . Now, if we try to do myDivide(8, 0) , the compiler will yell at us because 0 doesn’t have type nonzerofloat . What we’ve done here is to increase the expressiveness of the type system: an error that would’ve imploded the universe at runtime is now caught by the type checker.

This is solution 1 (of an eventual 3) to the divides-by-zero problem. And to be honest, it stinks. Why? Because if your program is doing some highfalutin computations, you’re not just going to be dividing explicit numbers. You’re going to need to divide the result of one computation by the result of another computation. And how do you know the result of that other computation isn’t zero? The type checker certainly doesn’t. And you can’t even do something like

if (bar != 0) baz = myDivide(foo, bar);

because the type checker can only check types; it doesn’t have access to the computation bar != 0 because that computation won’t be performed until the program is run. We need to do better.

Functional programming languages up the game by using types to describe not just data but functions as well. So the function

int plusOne(int a) { return a + 1; }

which takes an int and outputs another int will itself have the type int -> int .

Why is this useful? One reason is that it allows us, in a fancy-pants language like Haskell, to instantiate our functions inside various computational contexts, or monads as they are called to scare you. For example, one computational context is lists. A list [2, 3, 4] of integers will have a type [int] or list int . We can also apply the list functor to our function plusOne . So while plusOne had type int -> int , the function [plusOne] has type [int] -> [int] . And it does exactly what you think it should: [plusOne]([2, 3, 4]) = [3, 4, 5] .

But the functoriality isn’t why monads are interesting. I called lists a computational context, but come on, you can do that with a for loop. The interesting part of lists is that they can represent non-deterministic computations. For instance, we might want to define a “function” that takes an integer and returns a proper divisor of that integer. Now, this isn’t really a function between integers, because a number can have many divisors. But we can define a function divisor of type int -> [int] that returns the list of all divisors. Now I might want to chain together a number of such “non-deterministic functions” into a larger non-deterministic computation. Let’s say I want to pick a digit of a divisor of a number. I have the divisor function that sends a number to its list of divisors, and then the digit function that sends a number to its list of digits, both of type int -> [int] . Now, I can’t compose these two functions directly, because the output of divisor is of type [int] while the input of digit is of type int . But if I do what I did with plusOne above, I can feed the output of divisor into the function [digit]: [int] -> [[int]] that takes a list of integers and returns a list of lists of digits. The last step is to concatenate all these lists of digits to be left with a single list.

Whew. Let’s go back to division. One of the simplest Monads is the Maybe monad. You can think of it as a list with either 0 or 1 element: something of type Maybe Int is either a 1-element list like [4] or the empty list [] . Whereas the list monad allowed us to describe general non-deterministic computations, the Maybe monad allows us to describe computations either have a definite answer or no answer. Like dividing by zero!

(Maybe float) myDivide2(float a, float b) { if (b == 0) return []; else return [a / b]; }

This looks a lot like our original attempt, but now we have a separate symbol to return in the case of division by zero. What’s more, functoriality lets us chain this Maybe-function together with ordinary functions, and our fancy concatenation procedure (called “binding”) lets us chain this Maybe-function with other Maybe-functions.

So this is our second solution. It’s not bad, I guess. Why am I not enthused? Because at the end of whatever grand computation we do, we’ll maybe have an answer. We still won’t know until we actually run the program! While this is undoubtedly more graceful than allowing the universe to implode, it still leaves a sour taste in my mouth. All we’ve really done is built a fancy machine for propagating the fact that we screwed up throughout the rest of our program. We might as well just interrupt our program on dividing by zero, which is how procedural languages handle things.

In a procedural language, the type checker just knows about data types. Any sort of error handling or more enlightened analysis of what’s going on needs to be inserted in a very ad hoc manner. In a functional language, the type checker knows about function types. This lets Haskell programmers do very fancy things to protect the universe from implosion, but not only can this become very hard to understand once we start putting monads in monads, it also ultimately doesn’t really help us: we still need to run the program to observe that the computation fails somewhere and then figure out what happened.

The problem with all of these programming languages is that they don’t get us. They don’t get what we’re trying to do. We never actually want to divide by zero. If we want to divide by something, we have a reason to believe that it’s nonzero. What we need to do is to make our beautiful brain poetry, our reason to believe that what we’re dividing by is nonzero, precise. We need a proof. Without further ado, here’s our final, most glorious division function:

float myDivide3(float a, float b, proof(b != 0) p) { return a / b; }

What’s going on here? Our function takes three arguments now: our old friends a and b , together with a new guy, p . This new guy is a proof that b is nonzero, and its type depends on the value of b . This means that whenever I call this function, I need to provide together with a and b a proof that b isn’t zero. This may seem cumbersome, but like eating our vegetables, it’s good for us: it makes us clarify why we think we can divide by b at all. If you can’t do that, you have no business dividing!

The interior of this function is a lie, but a suggestive one. I wanted to make it clear that this function will always return a / b . Always. No maybes, no world imploding; we’re going to get a real, honest floating point number out of this function every time. In reality, the interior of our function will be some implementation of a division algorithm, which will use the fact that b != 0 to show that the algorithm terminates.

Allowing dependent types is the final piece of the puzzle. In a dependently typed programming language, a program isn’t just a program, it’s a computer-verified proof that your program will terminate. The language of dependent type theory allows us to express all of our beautiful thoughts about why the pieces of our model fit together the way we think they do. We might, in the process, discover that things don’t actually fit together right: the type checker will growl at us if they don’t. This can be tough if you’re just learning a language. But look at the upside: it won’t let us write bad code!

So why do I think that dependently typed programming is insanely great? It has to do with how we reason, and how we can learn to reason better. I’ve observed, teaching mathematics, that many of the fundamental errors students make are type errors. For instance, a beginning linear algebra student might try to find a subspace of a single vector. (Qiaochu Yuan has a very nice post expounding on these sorts of errors.) It’s tempting, as a teacher, to ignore milder forms of such errors (like mixing up subspaces and bases of subspaces) when students can churn out answers that are at least identifiably correct. But type errors are at the very least a strong indication that there’s some conceptual misunderstanding. I’ve often felt that mathematics education could be greatly improved by implementing stronger typing in our teaching, making clear what sorts of operations can be performed on what sorts of objects.

Dependent type theory takes this idea and runs with it. Type errors are not just red flags: in a sufficiently well-specified theory, all errors are type errors. Of course, I’m excluding routine mistakes like arithmetic errors, but those are the least important kind now that we have computers to do routine calculations for us. Rather, reasoning within the type theory of mathematics helps prevent us from making conceptual errors.

If teaching dependent type theory to schoolchildren seems like a bit of a stretch, it is. But the beautiful thing about dependent type theory is that it is expressive enough to express much simpler languages. One such language currently under development is AML, a modeling language for actuaries. By the end of this brief overview of AML, I was ready to jump up and down on the table. Why? Because money is a dependent type! It depends on time! It doesn’t make sense to add, compare, or otherwise mix two different sources of money unless they’re discounted to the same time. This is an error that’s easy to make when reasoning about money if you’re not careful. But in AML, it’s a type error: you just can’t do it!

After reading that, I was ready to turn everything into a type error. Make a chart with misleading axes? Type error! Use massively bloated animated GIFs instead of HTML5 video? Type error, my friend! What a wonderfully refreshing view of the world. Since deciding to leave the friendly confines of academia and search for a real job, I’ve tried to find a programming language to call my own. Each one just didn’t feel quite right until I hit upon Idris, a dependently typed language with a Haskell-like syntax. Finally, here was a language that could think as I do. Idris, you get me. And while Idris is still in its early stages, I don’t think the day is too far off when we’ll be using Idris, or a language like it, or a language created with it, to do mathematics and science and statistics and learning and living and loving. And that day will be awesome.

Comments? Tweet