← →

Programming with effects I: Theory

Andrej Bauer

, , , , ,

[UPDATE 2012-03-08: since this post was written eff has changed considerably. For updated information, please visit the eff page.]

I just returned from Paris where I was visiting the INRIA ?r² team. It was a great visit, everyone was very hospitable, the food was great, and the weather was nice. I spoke at their seminar where I presented a new programming language eff which is based on the idea that computational effects are algebras. The language has been designed and implemented jointly by Matija Pretnar and myself. Eff is far from being finished, but I think it is ready to be shown to the world. What follows is an extended transcript of the talk I gave in Paris. It is divided into two posts. The present one reviews the basic theory of algebras for a signature and how they are related to computational effects. The impatient readers can skip ahead to the second part, which is about the programming language.

A side remark: I have updated the blog to WordPress to 3.0 and switched to MathJax for displaying mathematics. Now I need to go through 70 old posts and convert the old ASCIIMathML notation to MathJax, as well as fix characters which got garbled during the update. Oh well, it is an investment for the future.

Algebras and homomorphisms

An algebra $(A, f_1, \ldots, f_k)$ is a set $A$ together with maps $f_i : A^{n_i} \to A$ for $i = 1, \ldots, k$, called the operations. The number $n_i$ is the arity of operation $f_i$. The quintessential example of an algebra is a group, which is a set $G$ together with three operations $u : 1 \to A$, $i : A \to A$, and $m : A^2 \to A$ of arities 0, 1, and 2, respectively. These are of course the unit, the inverse, and the multiplication, and for $G$ to really be a group they have to satisfy the usual group axioms.

For our purposes we need to make two adaptations to the classical view of algebras. Given an $n$-ary operation $f : A^n \to A$ and elements $t_1, \ldots, t_n \in A$, we can form the element $f(t_1, \ldots, t_n) \in A$. The $n$-tuple $(t_1, \ldots, t_n)$ is equivalent to the map $\lambda i . t_i$ where $i \in {1, 2, \ldots, n}$. So instead of $f(t_1, \ldots, t_n)$ we shall write $f(\lambda i . t_i).$ Now one quickly gets the idea that there is no need to restrict arity to sets of the form ${1, 2, \ldots, n}$. Thus we allow any set $N$ to be the arity of an operation $f : A^N \to A$. (I am using $A^N$ and $N \to A$ as two different notations for the same thing.) The $\lambda$-notation neatly accommodates general arities: given a map $k : N \to A$ and an operation $f: A^N \to A$ we can form the element $f(k) \in A$. An example of an operation with a strange arity is integration $\int_0^1 : \mathbb{R}^{[0,1]} \to \mathbb{R}$ which takes a (measurable) map $k : [0,1] \to \mathbb{R}$ to its integral. Its arity is the closed unit interval.

We shall need operations that take parameters, too. An example of a parametrized operation is scalar multiplication $m : \mathbb{R} \times V \to V$ which takes a scalar and a vector and returns a vector. The arity of $m$ is 1, but $m$ also takes an additional parameter from $\mathbb{R}$. In general a parametrized operation on a set $A$ is a map

$$f : P \times A^N \to A$$

where $P$ is the set of parameters, $N$ is the arity and $A$ is the carrier. We call the pair $(P,N)$ the type of the operation $f$.

A signature $\Sigma$ is a list of operation names together with their types:

$$f_1 : (P_1, N_1), \ldots, f_k : (P_k, N_k).$$

For example, the signature for a group is $u : (1,0), i : (1,1), m : (1,2).$ A $\Sigma$-algebra is an algebra which has operations prescribed by $\Sigma$. The category of $\Sigma$-algebras has as objects the $\Sigma$-algebras and as morphisms maps which preserve the operations, also known as homomorphisms. The free $\Sigma$-algebra $F\Sigma(X)$_ generated by a set $X$ is defined inductively by the rules:

for every $x \in X$, $\mathtt{return}(x) \in F_\Sigma(X)$, if $f$ is an operation name of type $(P,N)$, $p \in P$, and $k : N \to F_\Sigma(X)$ then the symbolic expression $f(p,k)$ is an element of $F_\Sigma(X)$.

The reason for writing $\mathtt{return}(x)$ instead of just $x$ is that it should remind you of $\mathtt{return}$ from Haskell’s monads. The elements of $F_\Sigma(X)$ are best thought of as well-founded trees whose leaves are labeled with elements of $X$ and whose branching types are the operations from $\Sigma$.

Given a $\Sigma$-algebra $A$ and a map $r : X \to A$ there is a unique homomorphism $h : F_\Sigma(X) \to A$ such that, for all $x \in X$,

$$h(\mathtt{return}(x)) = r(x)$$

and for every operation $f : (P, N)$ in $\Sigma$, $p \in P$, and $k : N \to F_\Sigma(X)$,

$$h(f(p,k)) = f_A(p, h \circ k).$$

Here $f_A : P \times A^N \to A$ is the operation on $A$ corresponding to $f$. You may view these two equations as a recursive definition of $h$.

Two examples from programming languages

Let us consider two examples of algebras from programming languages.

The first one is mutable store, i.e., a program computing a value of type $T$ with access to one mutable memory cell of type $S$. The content of the cell can be read with operation $\mathtt{lookup}$ and written to with operation $\mathtt{update}$. The situation is modeled with an algebra whose carrier is $T^S$, i.e., a program that computes the result, given the content of the memory cell. The operations

$$\mathtt{lookup} : (T^S)^S \to T^S,

\qquad\qquad

\mathtt{update} : S \times T^S \to T^S$$

are defined as

$$\mathtt{lookup} (k) = \lambda s . k \; s \; s,

\qquad\qquad

\mathtt{update} (t, k) = \lambda s . k \; t.$$

Thus $\mathtt{lookup}$ accepts no parameters (in other words, the parameter space is the singleton 1) and has arity $S$, while $\mathtt{update}$ accepts a parameter of type $S$ and has arity $1$. The two operations satisfy various equations, for example

$$\mathtt{update} (t, \mathtt{update} (u, k)) = \mathtt{update} (u, k)$$

and

$$\mathtt{lookup} (\lambda s . \mathtt{update} (s, k)) = k.$$

The first one says that writing $t$ then $u$ to the memory cell is the same as writing just $u$. The second one says that writing the value that has just been looked up is the same as doing nothing. You may try writing down all the relevant equations for $\mathtt{lookup}$ and $\mathtt{update}$ before you look them up in papers of Gordon Plotkin and John Power, or Matija’s thesis.

Observe that $\mathtt{lookup}$ and $\mathtt{update}$ are polymorphic in the type $T$ of the result. This means that we can use them anywhere in the program, no matter what happens afterwards. Many operations are like that, but not all (we shall see examples later on).

In my experience, two points of confusion arise:

What is $k$ in $\mathtt{lookup}(k)$? Answer: it is the rest of the program, it is what happens after lookup, it is the continuation. Does $\mathtt{update} (t, \mathtt{update} (u, k))$ mean that we update with $t$ and then with $u$, or the other way around? After all, in $\log(\cos(\pi/4))$ we have to compute the inner cosine first and the logarithm second. Answer: update with $t$ happens before update with $u$ because the argument of an operation is the stuff that happens after the operation.

As a second example we take exceptions. To make things simple and familiar, we shall consider just one exception called $\mathtt{fail}$ which takes no parameters. In Haskell such a thing is modeled with the Maybe monad, so let us use Haskell-like notation. The type of $\mathtt{fail}$ is

$$\mathtt{fail} : \mathtt{unit} \to \mathtt{Maybe}\; T,$$

and its action is $\mathtt{fail}() = \mathtt{Nothing}$. This is a bit mysterious but if we write the type of the operation out in full,

$$\mathtt{fail} : 1 \times (1 + T)^0 \to 1 + T,$$

we see that $\mathtt{fail}$ takes no parameters, is a nullary operation, and that a (possibly) aborting program returns either $\mathtt{Nothing}$ or a result $\mathtt{Just}\;x$. How do we model exception handlers? Gordon Plotkin and Matija Pretnar had the important idea that a piece of code with unhandled $\mathtt{fail}$ amounts to an element of the free algebra for the signature $\mathtt{fail} : (1,0)$, while handled $\mathtt{fail}$ corresponds to some other algebra for the same signature. The unique homomorphism from the free algebra to the other one is the handling construct. To see this, consider the following piece of (pseudo)code:

handle: a = 1 b = 2 fail() return (a + b) with: operation fail(): return (42)

The code between handle and with is an element of the free algebra, namely the element $\mathtt{fail}()$. Ok, you are wondering where the rest went, but consider this: either the program raises $\mathtt{fail}$, in which case it is equivalent to $\mathtt{fail}()$, or it returns a result $x$, in which case it is equivalent to $\mathtt{return}(x)$. Or to put it in a different way: we can delete all the stuff that comes after $\mathtt{fail}$ because it will never happen, while things before it have no effect on the end result either (provided they are free of computational effects). The with part tells us that $\mathtt{fail}()$ should be interpreted as $42$. But this is precisely what an algebra for the signature $\mathtt{fail} : (1,0)$ is: a set with a chosen element, in our case the set of integers with the chosen element $42$.

Handlers

All of what we have just said about exceptions generalizes to arbitrary signatures and handlers: every algebraic computational effect has an associated notion of handlers. In eff you can handle (intercept, catch) not only exceptions, but also input, output, access to memory, etc. Let us look at the theoretical background for this idea.

Let $\Sigma$ be the signature $f_1 : (P_1, N_1), \ldots, f_k : (P_k, N_k)$. An element $t$ of the free algebra $F_\Sigma(X)$ is a well-founded tree whose nodes are the operations $f_i$ and the leaves are elements of type $X$. We think of $t$ as a piece of inert source code of type $X$ with unhandled operations $f_1, \ldots, f_k$. In order to give $t$ a computational interpretation we need to explain how the operations $f_i$ are interpreted. We might write something like this:

handle: t with: operation f_1(p,k): t_1(p,k) operation f_2(p,k): t_2(p,k) ... return (x): r(x)

This means that in $t$ the operation $f_i(p,k)$ is interpreted as $t_i(p,k)$, and that $\mathtt{return}(x)$ is interpreted as $r(x)$. We have seen this before in a different guise. Namely, the above handler is precisely the unique homomorphism $h : F_\Sigma(T) \to A$ into the $\Sigma$-algebra $A$ whose operations are $t_1, \ldots, t_n$, and such that $h(\mathtt{return}(x)) = r(x)$.

A handler may handle only some operations and leave others alone. Let $\Sigma = (f,g)$ be a signature with two operations $f$ and $g$, and we do not bother to write down their types. Let $t \in F_{(f,g)}(X)$ be a term of type $X$ with unhandled operations $f$ and $g$. The handler

handle: t with: operation f(p, k): u(p, k) return (x): r(x)

only handles $f$ and leaves $g$ alone. How should we interpret this? Well, we could say that both operations are handled, except that $g$ is handled by itself:

handle: t with: operation f(p,k): u(p,k) operation g(p,k): g(p,k) return (x): r(x)

Now it is clear that the handler corresponds to the unique homomorphism $h : F_{(f,g)}(X) \to F_{(g)}(Y)$ such that $h(f(p,k)) = u(p, h \circ k)$, $h(g(p,k)) = g(p, h \circ k)$ and $h(\mathtt{return}(x)) = r(x)$. As an exercise you should figure out the types of $u$ and $r$ and write down precisely the algebra that is the codomain of $h$.

Generic effects and sequencing

We have seen that an operation accepts a parameter and a continuation. We cannot expect programmers to write down explicit continuations all the time, so we switch to an equivalent but friendlier syntax known as generic effects and sequencing (or “do notation” in Haskell). An operation applied to parameter $p$ and continuation $\lambda x . c$,

$$f(p, \lambda x . c),$$

is written with generic effects and sequencing as

$$\begin{split}&x \leftarrow f_\mathrm{gen}(p) \ & c.\end{split}$$

Read this as: “First perform generic effect $f_\mathrm{gen}$ applied to parameter $p$, let $x$ be the return value, and then do $c$.”

If $f$ is an operation of type $(P,N)$ then $f_\mathrm{gen}$ is a map $f_\mathrm{gen} : P \to F_\Sigma(N)$ defined by $f_\mathrm{gen}(p) = f(p, \lambda x . \mathtt{return}(x))$.

Sequencing is a special kind of handler. If $t \in F_\Sigma(X)$ and $u : X \to F_\Sigma(Y)$ then sequencing

$$\begin{split}&x \leftarrow t \ & u(x)\end{split}$$

denotes the element $h(t) \in F_\Sigma(Y)$ where $h : F_\Sigma(X) \to F_\Sigma(Y)$ is the unique homomorphism satisfying

$$h(\mathtt{return}(x)) = u(x)$$

and, for every operation $f_i$ in $\Sigma$,

$$h(f_i(p,k)) = f_i(p,h \circ k).$$

What these equations amount to is that sequencing passes the operations through and binds return values.

A minor complication ensues when we try to write down handlers for generic effects. A handler needs access to the continuation, but the generic effect does not carry it around explicitly. Eff uses the yield keyword for access to the continuation.

Instances of algebras

Mathematicians often take an algebra $A$ and consider two instances of it, for example in order to form a product $A \times A$. While such a move requires no special notation in mathematics, we need to be more careful about it in a programming language. Each instance of an algebra represents an instance of a computational effect which may require its own resources. For example, if $R$ is an algebra modeling a mutable memory cell, then having two instances of $R$ will be like having two mutable memory cells. In eff instances of algebras are created with the with keyword:

with E as x: ...

The above code creates an instance of effect $E$ and calls it $x$. The scope of the instance is local, i.e., it only exists inside the block marked with “…”.

Without further ado, let us now look at eff proper.