My long and not-yet-finished series where I did a line-by-line exegesis of the seminal paper on ML type inference ended with a discussion of the type unification algorithm. In reading back over that episode I realize to my horror that I never defined “unification”! And this in a series intended to demystify the jargon inherent in papers about type systems. My bad. Apologies for that, and let’s fix that problem right now.

“Unification” is just a fancy word for “we have a set of statements that contain placeholder variables; find values for those variables that make every statement true”. That is to say, unification is a generalization of the problem of “solve this equation for x”.

The classic example from high school algebra is the system of linear equations, where we have something like

X + Y = 10 X - Y = 2

(As I might have noted before, we call these placeholders “variables”, though they bear only a small similarity to what we call “variables” in C# — that is, typed, mutable storage locations. As we did in the previous series, where we discussed “free” and “bound” type variables, I’ll continue to use “variable” in this traditional mathematical sense of a placeholder for a value to be supplied later via substitution, and not in the C# programming sense.)

The unification problem is to find values for X and Y such that the statements are both true; in this case if we substitute 6 for X and 4 for Y, both statements are true. Moreover, this is the unique such substitution; of course there are systems of equations where there are multiple substitutions, infinitely many substitutions, or no substitutions at all; in the latter case we say that unification fails.

A unification algorithm is one which takes in a set of statements with variables and returns substitutions for the variables that makes every statement true. As we know, the algorithm for unifying systems of linear equations on real numbers is straightforward, but as we make those equations more complicated — by restricting them to integers, by making them non-linear, and so on — then the algorithms get more and more complicated.

In our type system example, the statements which we wish to be true are called “assumptions”. An assumption can contain a type variable, which we denoted with a Greek letter. Our type inference algorithm’s goal was to use type unification to find a set of substitutions for those variables which makes the assumptions all true. An example that I gave was an assumption:

plus : int -> int -> int

And we have an expression we wish to assign a type to:

λx.(plus x x)

We add to our assumptions an assumption that contains a variable:

x:β

And then we run our type inference algorithm on the body of the lambda. It returns a substitution — that β is int, and therefore so is x, and now we have enough information to deduce that our expression is a function from int to int.

Here’s an interesting question. We’ve seen how to unify systems of linear equations, and we’ve seen how to unify types. Can we unify arbitrary expressions in a programming language?

Next time on FAIC: Yes, yes we can.