Motivation for Monads

This is not a monads tutorial! There are lots of those in the world already. The best of those tutorials do a decent job of explaining the what and how of monads, but I haven't come across one that I think really nails the why. This essay is an attempt to fill this gap.

To explain the why of monads, we first have to explain the why of the functional style of programming. The functional style has two widely agreed-upon defining characteristics and a third that is controversial.

1. Functions are first-class and higher-order. Functions can be declared inside other functions. Functions (or references to functions, if you prefer) can be passed around and stored in data structures just like any other kind of value.

2. Implicit effects are very strongly discouraged.

3. Fancy type systems are important (this is the controversial one).

I'm going to focus on #2 in this essay, but #1 will be important for getting us out of a jam later. Also we'll see at the end one reason why #3 is controversial.

So what does it mean to not use implicit effects? Functional-style functions are actually mathematical functions. This means that everything a function "reads" is passed as an explicit input and the only thing a function is allowed to "do" is explicitly return outputs. This is in contrast to the much more common procedural style, where functions/procedures/methods are allowed to interact with their environment through channels other than their parameters and return values.

Some common examples:

Random number generators. In many random number generation libraries there's a procedure called nextNumber (or similar) that takes no parameters and returns a number. Behind the scenes it is accessing the state of the number generator, but that is not evident in its interface at all.

Indexing. Indexable structures (like vectors) generally have a get procedure that takes a vector and a number and returns the value at the specified index. Except the index might be outside of the valid range for that vector, in which case the procedure throws an exception.

System I/O. Lots of procedures interact with the system by reading and writing files, sending and receiving network messages, etc.

In the purely functional style none of these "side effects" is permitted. Does that mean functional-style functions can't do these kinds of things? Of course not! What it means is that the effects must be made explicit inputs and/or outputs.

The random number generator is the simplest example to deal with. A functional-style random number generator function takes the current state of the generator as input and returns both a number and the next state for the generator. It is the responsibility of the client code to "thread" the state of the generator from one call to the next.
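Here is a minimal sketch of what that threading might look like in Haskell. The names Gen, nextNumber, and twoNumbers are made up for illustration, and the generator constants are just one commonly seen linear congruential choice, not something any particular library promises.

```haskell
-- Hypothetical generator state: a wrapper around a single integer.
newtype Gen = Gen Integer
  deriving (Show, Eq)

-- Takes the current state as an explicit input and returns
-- both a number and the next state. Nothing is hidden.
-- (The constants are an assumption: a textbook linear congruential step.)
nextNumber :: Gen -> (Integer, Gen)
nextNumber (Gen s) = (s', Gen s')
  where
    s' = (1103515245 * s + 12345) `mod` 2147483648

-- The client is responsible for threading the state by hand:
-- g0 feeds the first call, g1 the second, and g2 is handed back.
twoNumbers :: Gen -> ((Integer, Integer), Gen)
twoNumbers g0 =
  let (a, g1) = nextNumber g0
      (b, g2) = nextNumber g1
  in ((a, b), g2)
```

Calling twoNumbers twice with the same Gen always gives the same pair, which is exactly the point: the only way to get a different number is to pass in a different state.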

Exceptions are not allowed in functional-style code, so we have to encode failure explicitly in the return value. Algebraic data types like Haskell's Maybe and ML's option are awfully convenient for this sort of thing, but not necessary.
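For the indexing example, a sketch might look like the following. I'm using a plain Haskell list as a stand-in for a vector, and safeGet is a hypothetical name rather than a standard library function.

```haskell
-- Hypothetical helper: look up the element at index i.
-- Instead of throwing on a bad index, the failure is part of the return type.
safeGet :: [a] -> Int -> Maybe a
safeGet xs i
  | i < 0     = Nothing              -- negative index: explicit failure
  | otherwise =
      case drop i xs of
        (x : _) -> Just x            -- index in range: wrap the value
        []      -> Nothing           -- index past the end: explicit failure
```

With this definition, safeGet [10, 20, 30] 1 evaluates to Just 20, while safeGet [10, 20, 30] 5 evaluates to Nothing. The caller can see from the type alone that the lookup might fail.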

I/O is the weirdest of the bunch. How do we reconcile functional style with the need to interact with the world? By inventing an opaque object that represents the state of the world! Conceptually you can imagine functions that do I/O taking a "world token" and passing it on to the next function that needs to do I/O.
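To make the "world token" idea concrete, here is a toy model. It is only a conceptual sketch: the World type below is a made-up record holding pending input and accumulated output, not how any real runtime represents the world, and readLine, writeLine, and greet are hypothetical names.

```haskell
-- Toy stand-in for "the state of the world":
-- lines waiting to be read, and lines written so far.
data World = World
  { pendingInput :: [String]
  , outputSoFar  :: [String]
  } deriving (Show, Eq)

-- "Reading" consumes one pending line and returns the updated world.
-- (Returning "" on empty input is an arbitrary choice for this sketch.)
readLine :: World -> (String, World)
readLine w =
  case pendingInput w of
    []         -> ("", w)
    (l : rest) -> (l, w { pendingInput = rest })

-- "Writing" returns a world with one more line of output.
writeLine :: String -> World -> World
writeLine s w = w { outputSoFar = outputSoFar w ++ [s] }

-- Each step takes the world token from the previous step and passes it on.
greet :: World -> World
greet w0 =
  let w1 = writeLine "What is your name?" w0
      (name, w2) = readLine w1
  in writeLine ("Hello, " ++ name ++ "!") w2
```

Running greet on World ["Ada"] [] yields a world whose outputSoFar is ["What is your name?", "Hello, Ada!"]. The order of the effects is fixed by the data dependencies between w0, w1, and w2.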

WHY???

Many programmers don't get the appeal of the functional style. I'm not going to do too much advocacy here; there's plenty of that on the internet. I'll just point out one compelling reason for using the functional style: testing. Anyone who has worked on testing a complex application knows that setting up the state/context just right to trigger a particular bug can be enormously time-consuming. Programming in a functional style means that everything that a function depends on is encoded in the arguments that you pass to it. There are no hidden contextual dependencies. That makes setting up tests at least a bit easier.

And Now, the Why of Monads

Okay. That's my whirlwind intro to the functional style. Now on to monads. As you may have noticed in my examples above, transitioning from a procedural style to a functional style means encoding more stuff in function inputs and outputs. In some cases a lot more stuff. That can be a royal pain in the neck. It takes all the plumbing of state and effect sequencing that is implicit in the procedural style and throws it in your face. Only the most die-hard functional zealot would argue that the exposed-pipes style is an unambiguously good thing.

Monads (and their cousins) exist to take all that exposed plumbing and cover it back up. So we end up with code that looks a bit closer to the procedural code we started with. However, there's a big difference! The explicit state threading is still happening; it's just conveniently prettied up a bit.
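As a rough illustration of the pipes being covered back up, here is a hand-rolled state monad applied to a toy random number generator. Everything here is a sketch under my own assumptions: the State type is a minimal version written for this example (real libraries provide a fuller one), the names nextNumberM and twoNumbersM are hypothetical, and the generator constants are just a textbook linear congruential choice.

```haskell
-- A minimal state monad: a function from a state to a result and a new state.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s ->
    let (a, s1) = g s
    in (f a, s1)

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad (State s) where
  -- This is where the plumbing lives: run the first action,
  -- then feed its output state into the next one.
  State g >>= k = State $ \s ->
    let (a, s1) = g s
    in runState (k a) s1

-- One generator step as a state action (constants are an assumption).
nextNumberM :: State Integer Integer
nextNumberM = State $ \s ->
  let s' = (1103515245 * s + 12345) `mod` 2147483648
  in (s', s')

-- Reads like procedural code, but >>= is still threading the state.
twoNumbersM :: State Integer (Integer, Integer)
twoNumbersM = do
  a <- nextNumberM
  b <- nextNumberM
  return (a, b)
```

Nothing implicit has crept back in. To get an answer out you still have to supply the starting state explicitly, as in runState twoNumbersM 0.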

The contrast between exceptions and explicit error returns is a good example. Let's assume we want a function that adds two numbers. Except we're not sure whether the numbers will actually be there. In Java we might write something like:

    int try_to_add_numbers(Integer a, Integer b) { return a + b; }

If a and b are non-null this procedure will return their sum. If either of them is null we'll get a NullPointerException that the client will be expected to handle.

In a pipes-exposed functional style we might get something like this (OCaml syntax):

    let try_to_add_numbers (a, b) =
      match (a, b) with
      | (Some an, Some bn) -> Some (an + bn)
      | _ -> None

This code is explicitly checking whether the parameters are "non-null" and explicitly returning either "None" or the answer wrapped in "Some". While there's something to be said for exposing error condition handling, there is quite a bit of syntactic overhead in this code. The core piece of functionality (an + bn) gets lost.

In Haskell we could write:

    import Control.Monad (liftM2)

    try_to_add_numbers a b = liftM2 (+) a b

The + function here expects to only be given honest-to-goodness numbers. It's liftM2 that does the magic. It's where the higher-order function business that I mentioned at the beginning comes in. liftM2 takes a function of two arguments and returns a new function of two arguments with some plumbing added. In this case it makes it so that if either input is "Nothing" (Haskell's spelling of "null") the whole function will evaluate to Nothing, and if both have a numerical value it will compute the sum and wrap up the result in "Just" (Haskell's spelling of non-null).
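To see that behavior in action, here is the same definition pinned to Maybe Int so it can be tried out directly. The type signature is my addition for the sake of the example; the underscore-style name just follows the snippets above.

```haskell
import Control.Monad (liftM2)

-- The function from the text, specialized to Maybe Int for this example.
-- (+) itself never sees a Nothing; liftM2 handles that case.
try_to_add_numbers :: Maybe Int -> Maybe Int -> Maybe Int
try_to_add_numbers a b = liftM2 (+) a b
```

In GHCi, try_to_add_numbers (Just 1) (Just 2) evaluates to Just 3, while try_to_add_numbers Nothing (Just 2) and try_to_add_numbers (Just 1) Nothing both evaluate to Nothing.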

I'm not going to describe the implementation of liftM2. There are plenty of tutorials that can explain the gory details. To close I just want to offer an explanation for why there is an ongoing rift between the dynamic functional style advocates (descendants of Lisp) and the fancy type system functional style advocates (descendants of ML).

The types of liftM2 and its cousins are fairly sophisticated. If one wanted to do this kind of programming in, say, Java, it's likely that the types you would have to write (and read) would be horrifically long and byzantine. The fancy type system languages (Haskell, Scala, OCaml, …) mitigate this problem with type inference, which makes it possible to leave out most type annotations.
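For a taste of what "fairly sophisticated" means, here is the type of liftM2 as it appears in Haskell's base library, next to a definition that leans entirely on inference. The name addM is hypothetical, chosen just for this sketch.

```haskell
import Control.Monad (liftM2)

-- For reference, the type of liftM2 in base:
--   liftM2 :: Monad m => (a1 -> a2 -> r) -> m a1 -> m a2 -> m r

-- No type annotation is written here. The compiler infers a type
-- that works for any monad m and any numeric type r:
--   addM :: (Monad m, Num r) => m r -> m r -> m r
addM x y = liftM2 (+) x y
```

Because the inferred type is general, the same addM works on Maybe values, as in addM (Just 1) (Just 2), and on lists, where addM [1, 2] [10, 20] adds every pairing of elements.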

At the other extreme, the dynamic languages avoid the problem of complicated types entirely, by not having types. (Or having a single degenerate type if you want to be really picky.)

So the tension around types comes from the fact that if you want to program in a style that turns effects into explicit things that get passed around, sophisticated higher-order functions are necessary to keep your code reasonably clean. And sophisticated higher-order functions tend to have fancy complicated types. To make programming with them palatable, you either need to have a correspondingly fancy type system or punt on types entirely.

Benjamin Ylvisaker, October 2014