The Y Combinator (no, not that one)

A crash-course on lambda calculus

What do you think of when you hear “Y Combinator”? If you’re like most of us, you probably think of the startup accelerator based in Mountain View, California.

Well, here is the original Y combinator:

λf. (λx. f (x x))(λx. f (x x))

Please, please don’t run away. There’s a reason why Silicon Valley’s Y Combinator is named after this. It’s actually kind of awesome. (Kudos to them for picking a cool name.)

To understand what λf. (λx. f (x x))(λx. f (x x)) means and what it can do, we’ll first need to learn the basics of lambda calculus.

If you’re interested in functional programming or are curious about it, I think you might enjoy it. Don’t worry if you don’t fully understand it after your first read-through; the article won’t disappear!

Lambda Calculus

Lambda calculus (or λ-calculus) was introduced by Alonzo Church in the 1930s as a formal system for expressing computation. Although it has the word “calculus” in its name, it has little to do with the calculus that Newton and Leibniz invented. In fact, it is a lot closer to programming than to mathematics as most of us know it.

Valid λ-calculus expressions can be defined inductively as follows:

1. A variable x is a valid λ-term.

2. If t is a valid λ-term and x is a variable, then λx. t is a valid λ-term.

3. If t and s are both valid λ-terms, then t s is a valid λ-term.

Maybe that made sense, maybe it didn’t. If you’re anything like me, you might like learning through examples. So let’s build up a few λ-terms from scratch. We’ll talk about what these λ-terms mean after we have a few of them to work with.

Based on #1 (“A variable x is a valid λ-term.”), we can build these λ-terms:

x

y

Okay, that wasn’t that bad. Let’s see what we can do with #2 (“If t is a valid λ-term and x is a variable, then λx. t is a valid λ-term.”).

Having x and y as valid λ-terms in our pocket, we can now create:

λx. x

λy. y

λx. y

λy. x

And with #3 (“If t and s are both valid λ-terms, then t s is a valid λ-term.”), we can create:

x x

x y

(λx. x) y

(λy. y) x

(λx. x) (λx. x)

(λy. y) (λx. x)

…

As an exercise, try writing down some of your own λ-terms.
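If it helps to see the three rules as data, here is a rough Ruby sketch. The names Var, Lam, App, and show are made up for this illustration; they are not part of any library. Each struct corresponds to one rule, and show simply prints a term back out with every pair of parentheses spelled out.

```ruby
# Hypothetical names, chosen for this sketch only.
Var = Struct.new(:name)          # rule 1: a variable
Lam = Struct.new(:param, :body)  # rule 2: λparam. body
App = Struct.new(:fn, :arg)      # rule 3: fn arg

# Render a term in λ-notation, fully parenthesized to avoid ambiguity.
def show(term)
  case term
  when Var then term.name
  when Lam then "(λ#{term.param}. #{show(term.body)})"
  when App then "(#{show(term.fn)} #{show(term.arg)})"
  end
end

x = Var.new("x")
y = Var.new("y")
identity = Lam.new("x", x)       # λx. x
applied  = App.new(identity, y)  # (λx. x) y

puts show(identity)  # prints (λx. x)
puts show(applied)   # prints ((λx. x) y)
```

Every term built this way is valid by construction, because the only way to make one is through one of the three rules.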

Let’s take a look at some of the λ-terms that we’ve created.

λx. x is known as the identity function. The x right after the λ can be interpreted as the input, and the x after the dot can be interpreted as the output. That is, it takes in some input x and outputs the same x.

If you’ve played around with languages that support lambdas, like Ruby, you might have seen it written like:

i = ->(x) { x }

(If you squint a little, you might notice that -> kind of looks like λ.)

What about λx. y? That is known as the constant function, since it ignores the input x and returns y no matter what.

c = ->(x) { y }

You might ask, “Isn’t λy. y also the identity function?” Indeed it is! λx. x and λy. y are α-equivalent (alpha-equivalent). In fact, all of the following are α-equivalent:

λx. x

λy. y

λz. z

λ☃. ☃

This leads us to the discussion of bound vs. free variables.

Bound vs. free variables

A bound variable is a variable that occurs in the body of a λ whose parameter has the same name. A free variable is a variable that is not bound.

For example, x is a bound variable in λx. x, but a free variable in (λy. y) x.

What’s special about bound variables is that you can rename them, as long as you do so consistently. If two λ-terms are the same up to renaming of bound variables, they are α-equivalent.

Let’s take a look at an example in code.

m = ->(horrible_variable_name) { horrible_variable_name * y }

In the above example, it’s safe to rename horrible_variable_name (a bound variable) to another name like x.

m = ->(x) { x * y }

But you can’t just go and rename y, because it’s defined outside of the scope. Maybe y is defined as the number 2, and m is a function that multiplies its input by 2. If we went and renamed it to z (let’s say it’s defined to be 0), m would turn into a function that always returns 0.
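Here is that scenario as a small runnable sketch. The values 2 and 0 are the ones assumed in the paragraph above, and the lambda names renamed and broken are just labels for this illustration.

```ruby
y = 2
z = 0

m       = ->(x) { x * y }  # original: x is bound, y is free
renamed = ->(w) { w * y }  # bound variable renamed: still the same function
broken  = ->(x) { x * z }  # free variable renamed: a different function

puts m.call(5)        # 10
puts renamed.call(5)  # 10
puts broken.call(5)   # 0
```

Renaming x to w changes nothing a caller can observe, while swapping y for z changes the result.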

Function application

Okay, awesome. We have functions. But how do you do stuff with them? That’s where rule #3 comes into play.

If t and s are both valid λ-terms, then t s is a valid λ-term.

(λx. x) y is an example of a function application. More concretely, it represents the act of calling the function λx. x with y as an input.

You can reduce a function application using β-reduction (beta-reduction). The rules of β-reduction say that a term (λx. t) s can be reduced to t [x := s], which reads “t with every free occurrence of x replaced by s.” (Those occurrences are free in t on its own; they were bound only by the λx that the reduction removes.)

For example, you can β-reduce (λx. x) s to just s by replacing every occurrence of x that the λx binds in the body (here the body is just x) with s, which gets us s. Line by line, it looks like this:

(λx. x) s

x [x := s]

s

And there, we see that λx. x does exactly what we would expect the identity function to do.
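If you want to see substitution as code, here is a minimal Ruby sketch of a single β-step. The names Var, Lam, App, subst, and beta are invented for this example. It also cuts a corner: it never renames bound variables, so it is only safe when the argument has no free variables that a λ inside the body would accidentally bind.

```ruby
# Hypothetical term representation, for illustration only.
Var = Struct.new(:name)
Lam = Struct.new(:param, :body)
App = Struct.new(:fn, :arg)

# term[name := s]: replace free occurrences of `name` in `term` with `s`.
# Simplification: no renaming, so variable capture is not handled.
def subst(term, name, s)
  case term
  when Var then term.name == name ? s : term
  when Lam
    # An inner λ with the same parameter shadows `name`; leave it untouched.
    term.param == name ? term : Lam.new(term.param, subst(term.body, name, s))
  when App then App.new(subst(term.fn, name, s), subst(term.arg, name, s))
  end
end

# One β-step at the root: (λx. t) s becomes t[x := s].
def beta(term)
  return term unless term.is_a?(App) && term.fn.is_a?(Lam)
  subst(term.fn.body, term.fn.param, term.arg)
end

identity = Lam.new("x", Var.new("x"))  # λx. x
s = Var.new("s")
reduced = beta(App.new(identity, s))   # (λx. x) s
p reduced == s                         # true
```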

It’s also worth mentioning that you can define functions that take in more than one argument based on the inductive definition of λ-terms.

For example, take a look at the following term:

λy.(λx. x) y

… and feed it two inputs, a and b:

(λy.(λx. x) y) a b

(((λx. x) y) [y := a]) b

(λx. x) a b

(x [x := a]) b

a b

It’s important to note here that function application is left-associative. That is,

(λy.(λx. x) y) a b = (((λy.(λx. x) y) a) b)
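In Ruby terms, left-associativity is just chained calls. The sketch below uses a stand-in for λy.(λx. x) y, plus an assumed function a and value b picked only so there is something to print. The add lambda at the end is an extra illustration of the usual way to get a multi-argument function: take one argument and return another lambda that takes the next.

```ruby
# Ruby stand-in for λy.(λx. x) y
f = ->(y) { (->(x) { x }).call(y) }

a = ->(n) { n + 1 }  # some function to play the role of a
b = 41               # some value to play the role of b

# "f a b" means "(f a) b": apply f to a first, then apply the result to b.
result = f.call(a).call(b)
puts result          # 42, the same as a.call(b)

# A curried function: one argument at a time, each call returning a lambda.
add = ->(x) { ->(y) { x + y } }
puts add.call(2).call(3)  # 5
```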

Now take a look at the following λ-term:

(λx. x x)(λx. x x)

What happens when you apply β-reduction to it?

(λx. x x)(λx. x x)

(x x) [x := (λx. x x)]

(λx. x x)(λx. x x)

Wait. Did we just end up where we started? You have just discovered Ω (omega), a divergent combinator. A few definitions: a λ-expression is in β-normal form if no β-reduction can be applied to it; a λ-expression is divergent if it has no β-normal form, that is, if no sequence of β-reductions ever reaches one; and a combinator is a λ-expression that contains no free variables.
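You can watch Ω misbehave in Ruby, too. This is only a rough analogy: Ruby evaluates eagerly and has a finite call stack, so instead of reducing forever it eventually gives up with a SystemStackError. The variable names here are made up for the sketch.

```ruby
half = ->(x) { x.call(x) }  # λx. x x

diverged = false
begin
  half.call(half)           # (λx. x x)(λx. x x)
rescue SystemStackError
  # Each call immediately makes the same call again, so the stack overflows.
  diverged = true
end

puts diverged  # true
```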

Isn’t infinite recursion such a curiosity?

Speaking of recursion, how can we define something like the factorial function in λ-calculus? In simple Ruby, it might look something like:

def fact(n)
  if n == 0
    1
  else
    n * fact(n - 1)
  end
end

What’s special about this function? It refers to itself. You might say, “Duh, that’s just recursion.” Well, you might be shocked to hear that λ-calculus does not allow this kind of self-reference, at least not directly.