In the previous articles in this series, I covered the conceptual basis of monads, but the discussion was pretty abstract. Now that (I hope) you have a pretty good idea of what monads are and what they're for, it's time to get into the nitty-gritty details of how to actually derive the definitions of specific monads. That means that we are going to have to define the correct instances of the type class Monad for the various notions of computation we talked about previously. In our derivations, we will use our understanding of how monadic composition should work in each particular case to give us the definition of monadic apply (the >>= operator), and we'll use the monad laws to give us the definition of return .

The Maybe monad

The Maybe monad is often the first monad shown to new Haskell programmers in tutorials, because it's very simple to use, to implement, and to understand. To start with, let's look at the definition of the Maybe data type:

data Maybe a = Nothing | Just a

This states that Maybe is a type constructor that acts on a particular type a to give a (concrete) data type. We usually describe this by saying that Maybe is a "polymorphic" data type, but the meaning is the same. So if a was Int , we'd have

data Maybe Int = Nothing | Just Int

except that we don't have to define this directly, as the one definition given above suffices for all types.

A value of type Maybe a is either there or not there; if it's Nothing , it's "not there" and if it's Just x for some value x , it's "just" the value x . One way to think of this is as a container that either has zero or one elements. (Remember that I said previously that monadic values were sometimes described, misleadingly, as containers. This is a case in point.)

The Maybe polymorphic type is useful because we can use it to model a kind of "extended function" which can either produce a value as its output or fail to produce any output (i.e. a function that can fail). This can be written as follows:

f :: a -> Maybe b

A function f of this type takes in an input value of type a , and either returns Nothing (which indicates failure) or a value Just x where x has type b . Functions like f will be the ones that will be used in the Maybe monad, and composing two functions of this kind will look like this:

f :: a -> Maybe b -- assume it's defined elsewhere g :: b -> Maybe c -- assume it's defined elsewhere h :: a -> Maybe c -- monadic composition of f and g h = f >=> g -- recall: >=> is the monadic composition operator

We said that all monads must be type constructors. Maybe is a type constructor, so it qualifies in that sense. But in order to make Maybe into a monad, we have to make it an instance of the type class Monad , which means that we have to fill out this definition:

instance Monad Maybe where (>>=) = {- definition of >>= for Maybe -} return = {- definition of return for Maybe -}

How do we define (>>=) and return for Maybe ?

First, let's write a skeleton definition for the >>= operator, covering the two possible cases of a left operand of type Maybe a :

Nothing >>= f = {- to be defined -} Just x >>= f = {- to be defined -}

where x has type a . It would also be legal to write the left-hand side of this definition as:

(>>=) Nothing f = {- to be defined -} (>>=) (Just x) f = {- to be defined -}

but it looks better if we define it the way we use it, as an operator, and Haskell allows us to do this.

To complete this definition, consider how we want monadic composition to work in the Maybe monad. Let's take our example above with f and g composing monadically to make h :

f :: a -> Maybe b g :: b -> Maybe c h :: a -> Maybe c h = f >=> g

If we pass an argument to f and it returns Nothing (i.e. it fails), then what should h return on the same output?

f x = Nothing h x = (f >=> g) x = ???

It seems apparent that if f x is Nothing , h x must also be Nothing , because if the first function ( f ) in the definition of h fails to produce an output, then h as a whole must also fail to produce an output. The only way h could produce an output is if f produced an output (call it y ), passed it to g , and g y also produced an output. If either f or g fails, then h fails too i.e. h x will evaluate to Nothing .

Putting this into our definition of h , we have:

h = f >=> g h = \x -> (f x >>= g) -- equivalent (from the definition of >=>) h x = f x >>= g -- equivalent -- assume f x == Nothing h x = (Nothing >>= g) = Nothing -- therefore: Nothing >>= g = Nothing

So we know how the (>>=) operator will work with an input of Nothing — it will just return Nothing :

Nothing >>= f = Nothing Just x >>= f = {- to be defined -}

Note that I changed g to f here, which is fine because the name of the function doesn't matter in the definition. In fact, where possible we leave off the name of the function entirely and use the _ wildcard operator as follows:

Nothing >>= _ = Nothing

We can't do this for the second equation because we'll need the function f in the definition.

Now let's look at the other case. If f x does not fail, it will produce an output of the form Just y for some result value y . We need to "unpack" a value from Just y to get the value y , which we will give as the argument to g , and g y will be what h x evaluates to:

h = f >=> g h = \x -> (f x >>= g) -- equivalent (from the definition of >=>) h x = f x >>= g -- equivalent -- assume f x == Just y h x = (Just y >>= g) = g y

This gives us the second part of our definition:

Nothing >>= f = Nothing Just x >>= f = f x

Note that I changed y to x and g to f in the definition. Again, the variable and function names don't matter as long as you're consistent.

That completes the definition of the >>= operator for the Maybe monad. Now we need to define the return function for this monad:

return x = ???

for an arbitrary value x . What possibilities have we got? We could just say that

return x = Nothing

for any x . If this were the case, however, we would violate our monad laws:

return x >>= f == f x Nothing >>= f == f x Nothing == f x -- WRONG!

Since we can assume that at least some f x values are not Nothing (for instance, consider the monadic function f x = Just x ), this can't be correct. The obvious alternative is:

return x = Just x

which obeys the monad laws:

return x >>= f = (Just x) >>= f -- definition of return for Maybe monad = f x -- definition of >>= for Maybe monad -- correct according to monad law 1 Just x >>= return = return x -- definition of >>= for Maybe monad = Just x -- definition of return for Maybe monad -- correct according to monad law 2 Nothing >>= return = Nothing -- definition of >>= for Maybe monad -- correct according to monad law 2

This works, so let's use it. The complete definition of the Maybe monad is thus:

instance Monad Maybe where return x = Just x Nothing >>= f = Nothing Just x >>= f = f x

Woo hoo! We've just derived our first monad!

Just to be on the safe side, let's check that our definition obeys monad law 3, which is:

(mv >>= f) >>= g == mv >>= (\x -> (f x >>= g))

First let's check that it's correct when mv is Nothing :

(Nothing >>= f) >>= g -- left hand side = Nothing >>= g -- definition of >>= = Nothing -- definition of >>= Nothing >>= (\x -> (f x >>= g)) -- right hand side = Nothing -- definition of >>=

OK, that checks out. Now let's see if it works when mv is Just v for some value v :

((Just v) >>= f) >>= g -- left hand side = f v >>= g -- definition of >>= (Just v) >>= (\x -> (f x >>= g)) -- right hand side = (\x -> (f x >>= g)) v -- definition of >>= = f v >>= g -- normal function application (beta reduction)

so this checks out too. So it works! This is the correct definition of the Maybe monad! And the audience goes wild!

What's the significance of this? It means that we can compose a bunch of monadic functions in the Maybe monad easily. And why, you will certainly be asking yourself, is that important? It's not hard to imagine having a bunch of Maybe monadic functions i.e. functions that can fail. Let's say that they all have type Int -> Maybe Int . Here are three such functions:

f :: Int -> Maybe Int f x = if x `mod` 2 == 0 then Nothing else Just (2 * x) g :: Int -> Maybe Int g x = if x `mod` 3 == 0 then Nothing else Just (3 * x) h :: Int -> Maybe Int h x = if x `mod` 5 == 0 then Nothing else Just (5 * x)

We'd like to compose them to get a final function:

k :: Int -> Maybe Int

which is the result of applying f , then g , then h one after another, with Nothing returned if any of them failed. All this does is multiply the input number by 30 unless the number is a multiple of 2, 3, or 5 (in which case it fails i.e. returns Nothing ).

If you've followed the previous discussion it should be clear that you can define k using monadic composition as follows:

k = f >=> g >=> h

Or, if you want to use the >>= operator instead:

k x = f x >>= g >>= h

Or, if you like the do notation:

k x = do y <- f x z <- g y h z

Any way you slice it, though, it's pretty simple.

Now, it's perfectly possible to define h without using monads at all. It looks like this:

k x = case f x of Nothing -> Nothing Just y -> case g y of Nothing -> Nothing Just z -> h z

This shows why the Maybe monad is important: it drastically cuts down on the boilerplate code we have to write to chain Maybe -producing functions together. Imagine how grungy the non-monadic code would be if there were ten Maybe monadic functions in a row being chained together — it would be indented so far to the left that you wouldn't be able to read it, and the overall structure of the computation would be lost in a maze of case statements. But with monads, you could write it like this:

f11 = f1 >=> f2 >=> f3 >=> f4 >=> f5 >=> f6 >=> f7 >=> f8 >=> f9 >=> f10

or (using >>= ):

f11 x = f1 x >>= f2 >>= f3 >>= f4 >>= f5 >>= f6 >>= f7 >>= f8 >>= f9 >>= f10

So monads have made composition of monadic functions just as easy as composition of regular (non-monadic) functions is.

The Maybe monad is very useful for illustrating basic concepts, but it may be confusing in one sense: many people mistakenly believe that the only reason for having monads in Haskell is to handle non-functional computations i.e. ones that do (file or terminal) I/O, alter global variables, etc. And yet, here I showed you a monadic computation which can be done perfectly well without monads. In this case, monads are not essential; they're just very convenient. And that's why I said that even though the original reason for adding monads to Haskell had to do with dealing with inherently non-functional kinds of computations (like computations involving I/O), they turned out to have a far greater applicability. That's why they're neat.

Now, on to our next monad.

The list monad

If you liked the Maybe monad, you're going to love the list monad ;-) We're going to be filling in this instance definition:

instance Monad [] where (>>=) = {- definition of >>= for lists -} return = {- definition of return for lists -}

Note that we use the empty list [] to represent the list type constructor. This is a bit of a hack (lists have special syntactic support in Haskell). Whatever.

As with all monads, the first task is to be clear about what the monad's monadic function represents. For the list monad, a monadic function f looks like this:

f :: a -> [b]

(where, of course, the notation [b] means "list of b "). Recall that the generic definition of a monadic function is this:

f :: a -> m b

for some monad m which has to be a type constructor. Lists are a plausible candidate for a monad because "list of" is a type constructor (even though the syntax is hard-wired into Haskell); if we wanted to, we could even define a list datatype ourselves:

data List a = Nil | Cons a (List a)

and then the monadic function would look like this:

f :: a -> List b

We'll stick to the standard list syntax from now on.

What do functions of this sort represent? The normal way of thinking about them is that they take in an input value of type a , and produce a bunch of output values of type b , collected in one convenient container (the list). (Here again, we have a monad that seems to be a container.) A different way of thinking about functions of this type is that they represent functions with multiple return values i.e. functions which can somehow return a bunch of different values "all at once". (I don't want to say "in parallel" because that implies simultaneous processing which isn't going on here.) The multiple return values are just the elements of the list. This turns out to be a useful perspective when you have multiple functions of this type, for instance:

f :: Int -> [Int] g :: Int -> [Int]

Here, f and g both take in a single Int as an argument and return a bunch of Int s as results. What if we wanted to take every result from f and apply g individually to each result, collecting all the results together? It would be great if we could do this directly, without having to unpack every element from the list of results that f returns and apply g to each one. This is what the list monad will allow us to do.

Let's flesh this out a bit with some example functions:

f :: Int -> [Int] f x = [x-1, x, x+1] g :: Int -> [Int] g x = [-x, x]

How would we "compose" these two functions? f x returns a list, so to apply g to each element of the list we will need the map function:

f 10 --> [9, 10, 11] map g (f 10) --> [[-9, 9], [-10, 10], [-11, 11]]

This new result may be interesting, but it can't be the composition of f and g because it doesn't have the same type (it's a list of lists of Int s, not a list of Int s). To turn it into a simple list of Int s, we can flatten the list of lists using the concat function (which just concatenates a list of lists into a single list):

-- N.B. concat has the type [[a]] -> [a] concat (map g (f 10)) --> [-9, 9, -10, 10, -11, 11]

What this result represents is the collection of all the results obtained by applying f to an integer input and then applying g to one of the results. If you think of f and g as functions which produce multiple results "all at once", then this result is just the collection of all possible results of applying f and then g . Diagrammatically, we can represent this as:

g | -9 | 9 ----> | | | 9 | f | g | -10 10 ----> | 10 ----> | | | 10 | | g | -11 | 11 ----> | | 11

This shows that the composition of f and g is the collection of all paths through the f and g functions.

Interestingly enough, we've just determined what the >>= operator has to be for the list monad! It's defined as follows:

-- mv :: [a] -- g :: a -> [b] mv >>= g = concat (map g mv)

where mv is a monadic value in the list monad (which is just a list of values of type a ). In the previous example, mv is just the result of evaluating f 10 and g is as it was before. This definition even works if mv is the empty list [] , since mapping a function over the empty list just returns the empty list, and concat applied to the empty list returns the empty list as well. So the definition of the >>= operator for this monad is very simple.

[Note for GHC fans: The GHC Haskell compiler actually implements >>= for the list monad in a slightly different way which is (I believe) more efficient but which gives the same results.]

How would we define return in this monad? Let's think of the list monadic value as an "action" returning various values. Let's also recall that return should be the equivalent of the identity function in any given monad. What would be the equivalent of the identity function in the list monad? It would take an input value and return an "action" which simply returns that value when "executed". So we know that return can't simply return the empty list, for instance. A reasonable possibility is to define return as follows:

return :: a -> [b] return x = [x]

In other words, return just creates a list with a single value. Let's see if this is plausible by seeing if it obeys the monad laws:

-- f :: a -> [b] -- x :: a return x >>= f = concat (map f (return x)) -- definition of >>= = concat (map f [x]) -- definition of return = concat [f x] -- definition of map = f x -- definition of concat -- correct according to monad law 1 -- mv :: [a] mv >>= return = concat (map return mv) -- definition of >>= = concat (map (\x -> [x]) mv) -- definition of return -- Two cases: -- Case 1: mv == [] = concat (map (\x -> [x]) []) -- definition of mv = concat [] -- definition of map = [] -- definition of concat = mv -- definition of mv -- Case 2: mv == [v1, v2, ...] = concat (map (\x -> [x]) [v1, v2, ...]) -- definition of mv = concat [[v1], [v2], ...] -- definition of map = [v1, v2, ...] -- definition of concat = mv -- definition of mv -- correct according to monad law 2

OK, this works. You might want to try other possible definitions for return in the list monad (e.g. where return returns a list which contains 0, 2, 3, or an infinite number of copies of its argument), and find out why they violate the monad laws. This is a good way to get practice using the monad laws.

To prove that our list monad is really a monad, we still have to show that monad law 3 is obeyed. This will be a lot harder, but let's do it anyway. It'll turn out to be easier to do if we use the nice version of monad law 3 (the one involving monadic composition). To start, we need to define monadic composition in the list monad:

-- Monad law 3 (nice version): (f >=> g) >=> h = f >=> (g >=> h) -- By definition: f >=> g = \x -> f x >>= g -- In the list monad, using the definition of >>= for lists: f >=> g = \x -> concat (map g (f x)) -- Using the composition operator (.), this can be written as: f >=> g = concat . map g . f

In addition, I'll be using several identities involving map and concat applied to lists. You should just take these on faith for now, though I'll show how to derive them below.

-- equation 1: map (f . g) = map f . map g -- equation 2: map f . concat = concat . map (map f) -- equation 3: concat . concat = concat . map concat

As I mentioned above, . refers to the (pure) function composition operator. It has lower precedence than function application (via concatenation), so an expression like map f . map g actually means (map f) . (map g) . Haskell programmers usually prefer to leave out unnecessary parentheses when possible. It's also important to note that e.g. map f is a function even though map normally takes two arguments (a function and a list to map the function over). If you recall what I said about currying previously, then you'll understand that map f is a function which takes a list as its only argument and returns a new list, which what you get when you apply the function f to all the elements of the original list and collect the results into a new list. We'll be using currying a lot below.

Given all that, here's the derivation:

(f >=> g) >=> h = (concat . map g . f) >=> h -- definition of >=> = concat . map h . (concat . map g . f) -- definition of >=> = concat . map h . concat . map g . f -- remove unnecessary parentheses f >=> (g >=> h) = f >=> (concat . map h . g) -- definition of >=> = concat . map (concat . map h . g) . f -- definition of >=> = concat . map ((concat . map h) . g) . f -- equivalent = concat . (map (concat . map h)) . (map g) . f -- equation 1 = concat . map (concat . map h) . map g . f -- remove unnecessary parentheses = concat . map concat . map (map h) . map g . f -- equation 1

So we need to show that:

concat . map h . concat . map g . f = concat . map concat . map (map h) . map g . f

which will be true if we can show that:

concat . map h . concat = concat . map concat . map (map h)

Let's do that:

-- add parentheses for clarity: concat . (map h . concat) = concat . map concat . map (map h) -- use equation 2: concat . concat . map (map h) = concat . map concat . map (map h) -- add parentheses for clarity: (concat . concat) . map (map h) = concat . map concat . map (map h) -- use equation 3: concat . map concat . map (map h) = concat . map concat . map (map h)

and we're done. Whew! In fact, Haskell programmers very rarely do derivations of this kind, but they are necessary to show that a purported monad really is a monad.

ASIDE: Deriving the map / concat identities (equations 1, 2, and 3) Preliminaries : To prove these identities, we will need to first prove a few simpler ones (math is hard!). They are: -- Equation 4: concat (x:xs) = x ++ concat xs -- Equation 5: concat (x ++ y) = concat x ++ concat y -- Equation 6: map f (x ++ y) = map f x ++ map f y Equation 4 follows from the definition of concat . Equation 5 is easily proved by induction on the length of x and using equation 4: -- base case: x is an empty list concat ([] ++ y) = concat [] ++ concat y concat y = [] ++ concat y -- definition of concat [] concat y = concat y -- definition of ++ -- OK -- inductive case: x is non-empty; call its parts x1 (head) and xs (tail) concat ((x1:xs) ++ y) = concat (x1 : (xs ++ y)) -- definition of ++ = x1 ++ concat (xs ++ y) -- equation 4 = x1 ++ concat xs ++ concat y -- inductive hypothesis concat (x1:xs) ++ concat y = x1 ++ concat xs ++ concat y -- equation 4 -- OK, so Q.E.D. Equation 6 is proved similarly: -- base case: x is an empty list map f ([] ++ y) = map f [] ++ map f y map f y = [] ++ map f y map f y = map f y -- OK -- inductive case: x is non-empty; call its parts x1 (head) and xs (tail) map f (x ++ y) = map f ((x1:xs) ++ y) = map f (x1 : (xs ++ y)) -- definition of ++ = f x1 : map f (xs ++ y) -- definition of map = f x1 : (map f xs ++ map f y) -- inductive hypothesis = (f x1 : map f xs) ++ map f y -- definition of ++ = map f (x1:xs) ++ map f y -- definition of map = map f x ++ map f y -- definition of x -- OK, so Q.E.D. With that out of the way, let's get to the proofs of equations 1, 2, and 3. Equation 1 : map (f . g) = map f . map g Use induction on the length of the implicit list argument on both sides, as well as the definition of map : -- base case: empty list map (f . g) [] = [] (map f . map g) [] = map f (map g []) = map f [] = [] -- OK -- inductive case: non-empty list map (f . g) (x:xs) = ((f . g) x) : (map (f . g) xs) -- definition of map = (f (g x)) : (map (f . g) xs) -- definition of . (map f . map g) (x:xs) = map f (map g (x:xs)) -- definition of . = map f ((g x) : (map g xs)) -- definition of map = (f (g x)) : (map f (map g xs)) -- definition of map = (f (g x)) : ((map f . map g) xs) -- definition of . = (f (g x)) : (map (f . g) xs) -- inductive hypothesis -- OK, so Q.E.D. Equation 2 : map f . concat = concat . map (map f) Use induction on the length of the list argument: -- base case: empty list (map f . concat) [] = map f (concat []) = map f [] = [] (concat . map (map f)) [] = concat (map (map f) []) = concat [] = [] -- OK -- inductive case: non-empty list (map f . concat) (x:xs) = map f (concat (x:xs)) -- definition of . = map f (x ++ concat xs) -- equation 4 = map f x ++ (map f (concat xs)) -- equation 6 = map f x ++ ((map f . concat) xs) -- definition of . = map f x ++ ((concat . map (map f)) xs) -- inductive hypothesis = map f x ++ concat (map (map f) xs) -- definition of . (concat . map (map f)) (x:xs) = concat (map (map f) (x:xs)) -- definition of . = concat (map f x : map (map f) xs) -- definition of map = map f x ++ concat (map (map f) xs) -- equation 4 -- OK, so Q.E.D. Equation 3 : concat . concat = concat . map concat As usual, use induction on the length of the list argument: -- base case: empty list (concat . concat) [] = concat (concat []) = concat [] = [] (concat . map concat) [] = concat (map concat []) = concat [] = [] -- OK -- inductive case: non-empty list (concat . concat) (x:xs) = concat (concat (x:xs)) -- definition of . = concat (x ++ concat xs) -- equation 4 = concat x ++ concat (concat xs) -- equation 5 (concat . map concat) (x:xs) = concat (map concat (x:xs)) -- definition of . = concat (concat x : map concat xs) -- definition of map = concat x ++ concat (map concat xs) -- equation 4 = concat x ++ (concat . map concat) xs -- definition of . = concat x ++ (concat . concat) xs -- inductive hypothesis = concat x ++ concat (concat xs) -- definition of . -- OK, so Q.E.D.

By now, I hope you're fully convinced that the list monad is in fact a monad ;-)

A much more interesting question, of course, is this: what can we do with the list monad that was hard to do before? Here's a simple example: find all pairs of numbers between 1 and 6 that sum to the number 7 (the numbers might represent e.g. dice rolls). With the list monad, this is trivial:

-- Using do-notation: do n1 <- [1..6] n2 <- [1..6] if n1 + n2 == 7 then return (n1, n2) else [] -- Result: [(1,6),(2,5),(3,4),(4,3),(5,2),(6,1)]

This can also be written without the do -notation, though it's less clear:

[1..6] >>=

1 -> [1..6] >>=

2 -> if n1 + n2 == 7 then return (n1, n2) else []

How does this work? You should really sit down and work this out for yourself given the definitions of >>= and return for lists, but here's a hand-wavy explanation. Note that [1..6] is a monadic value in the list monad, so it delivers all of its values to n1 (one at a time, of course), and then again to n2 , and then if the sum is correct all the pairs (n1, n2) will be returned. So we're doing a computation on all elements of a list as easily as if it were computations on only single elements. That's what the list monad buys you.

If you've done any significant amount of Haskell programming, alarm bells are going off in your head right about now. "But wait!" I hear you cry. "Can't you just use a list comprehension to do this?" Indeed you can, and it looks like this:

[(n1, n2) | n1 <- [1..6], n2 <- [1..6], n1 + n2 == 7]

The list monad can in fact do anything that list comprehensions can do. Which syntax one prefers is a matter of taste and may also depend on the particular problem. In fact, Phil Walder, in the paper Comprehending Monads (there's a pun in that title, as there are in most of Phil's paper titles) suggested that the list comprehension syntax be extended to arbitrary monads. That proposal was eventually rejected in favor of the current notation.

The list monad is more than just an alternate version of list comprehensions, though. For one thing, there are many very generic functions that can work on instances of any monad; having a list monad means that those functions can work on lists as well. Furthermore, there is an extension to the Monad type class called MonadPlus which adds extra functionality to monads (specifically, it defines a "zero" element for a monad as well as a way of "adding" two monadic values). Lists turn out to be an instance of MonadPlus as well as of Monad , and this means that generic functions that work on all instances of MonadPlus will also work on lists. (For instance, there is a generalization of the concat function called msum which works for any instance of MonadPlus , including lists.) It's better to have generic functions that can work on large numbers of datatypes than to have to define new functions for each particular datatype, so this is a clear win.

Next time

In the next installment I'll look at monads to deal with error handling.