From HaskellWiki

By Brent Yorgey, byorgey@gmail.com

Originally published 12 March 2009 in issue 13 of the Monad.Reader. Ported to the Haskell wiki in November 2011 by Geheimdienst.

This is now the official version of the Typeclassopedia and supersedes the version published in the Monad.Reader. Please help update and extend it by editing it yourself or by leaving comments, suggestions, and questions on the talk page.

Abstract

The standard Haskell libraries feature a number of type classes with algebraic or category-theoretic underpinnings. Becoming a fluent Haskell hacker requires intimate familiarity with them all, yet acquiring this familiarity often involves combing through a mountain of tutorials, blog posts, mailing list archives, and IRC logs.

The goal of this document is to serve as a starting point for the student of Haskell wishing to gain a firm grasp of its standard type classes. The essentials of each type class are introduced, with examples, commentary, and extensive references for further reading.

Introduction

Have you ever had any of the following thoughts?

What the heck is a monoid, and how is it different from a monad?

I finally figured out how to use Parsec with do-notation, and someone told me I should use something called Applicative instead. Um, what?

Someone in the #haskell IRC channel used (***) , and when I asked Lambdabot to tell me its type, it printed out scary gobbledygook that didn’t even fit on one line! Then someone used fmap fmap fmap and my brain exploded.

When I asked how to do something I thought was really complicated, people started typing things like zip.ap fmap.(id &&& wtf) and the scary thing is that they worked! Anyway, I think those people must actually be robots because there’s no way anyone could come up with that in two seconds off the top of their head.

If you have, look no further! You, too, can write and understand concise, elegant, idiomatic Haskell code with the best of them.

There are two keys to an expert Haskell hacker’s wisdom:

1. Understand the types.

2. Gain a deep intuition for each type class and its relationship to other type classes, backed up by familiarity with many examples.

It’s impossible to overstate the importance of the first; the patient student of type signatures will uncover many profound secrets. Conversely, anyone ignorant of the types in their code is doomed to eternal uncertainty. “Hmm, it doesn’t compile ... maybe I’ll stick in an fmap here ... nope, let’s see ... maybe I need another (.) somewhere? ... um ...”

The second key—gaining deep intuition, backed by examples—is also important, but much more difficult to attain. A primary goal of this document is to set you on the road to gaining such intuition. However—

There is no royal road to Haskell. —Euclid

This document can only be a starting point, since good intuition comes from hard work, not from learning the right metaphor. Anyone who reads and understands all of it will still have an arduous journey ahead—but sometimes a good starting point makes a big difference.

It should be noted that this is not a Haskell tutorial; it is assumed that the reader is already familiar with the basics of Haskell, including the standard Prelude , the type system, data types, and type classes.

The type classes we will be discussing and their interrelationships (source code for this graph can be found here):

∗ Apply can be found in the semigroupoids package, and Comonad in the comonad package.

Solid arrows point from the general to the specific; that is, if there is an arrow from Foo to Bar it means that every Bar is (or should be, or can be made into) a Foo .

Dotted lines indicate some other sort of relationship.

Monad and ArrowApply are equivalent.

Apply and Comonad are greyed out since they are not actually (yet?) in the standard Haskell libraries ∗ .

One more note before we begin. The original spelling of “type class” is with two words, as evidenced by, for example, the Haskell 2010 Language Report, early papers on type classes like Type classes in Haskell and Type classes: exploring the design space, and Hudak et al.’s history of Haskell. However, as often happens with two-word phrases that see a lot of use, it has started to show up as one word (“typeclass”) or, rarely, hyphenated (“type-class”). When wearing my prescriptivist hat, I prefer “type class”, but realize (after changing into my descriptivist hat) that there's probably not much I can do about it.

Instances of List and Maybe illustrates these type classes with simple examples using List and Maybe. We now begin with the simplest type class of all: Functor .

Functor

The Functor class (haddock) is the most basic and ubiquitous type class in the Haskell libraries. A simple intuition is that a Functor represents a “container” of some sort, along with the ability to apply a function uniformly to every element in the container. For example, a list is a container of elements, and we can apply a function to every element of a list, using map . As another example, a binary tree is also a container of elements, and it’s not hard to come up with a way to recursively apply a function to every element in a tree.

Another intuition is that a Functor represents some sort of “computational context”. This intuition is generally more useful, but is more difficult to explain, precisely because it is so general. Some examples later should help to clarify the Functor -as-context point of view.

In the end, however, a Functor is simply what it is defined to be; doubtless there are many examples of Functor instances that don’t exactly fit either of the above intuitions. The wise student will focus their attention on definitions and examples, without leaning too heavily on any particular metaphor. Intuition will come, in time, on its own.

Definition

Here is the type class declaration for Functor :

    class Functor f where
      fmap :: (a -> b) -> f a -> f b

      (<$) :: a -> f b -> f a
      (<$) = fmap . const

Functor is exported by the Prelude , so no special imports are needed to use it. Note that the (<$) operator is provided for convenience, with a default implementation in terms of fmap ; it is included in the class just to give Functor instances the opportunity to provide a more efficient implementation than the default. To understand Functor , then, we really need to understand fmap .

First, the f a and f b in the type signature for fmap tell us that f isn’t a concrete type like Int ; it is a sort of type function which takes another type as a parameter. More precisely, the kind of f must be * -> * . For example, Maybe is such a type with kind * -> * : Maybe is not a concrete type by itself (that is, there are no values of type Maybe ), but requires another type as a parameter, like Maybe Integer . So it would not make sense to say instance Functor Integer , but it could make sense to say instance Functor Maybe .

Now look at the type of fmap : it takes any function from a to b , and a value of type f a , and outputs a value of type f b . From the container point of view, the intention is that fmap applies a function to each element of a container, without altering the structure of the container. From the context point of view, the intention is that fmap applies a function to a value without altering its context. Let’s look at a few specific examples.

Finally, we can understand (<$) : instead of applying a function to the values in a container/context, it simply replaces them with a given value. This is the same as applying a constant function, so (<$) can be implemented in terms of fmap .
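As a small sketch of the two methods just described, the definitions below apply fmap and (<$) to a list and to a Maybe. The names ( incremented , shown , and so on) are invented for this illustration and are not part of any library.

```haskell
-- fmap changes the elements but keeps the shape of the list.
incremented :: [Int]
incremented = fmap (+ 1) [1, 2, 3]               -- [2,3,4]

-- fmap applies the function to the value inside a Just ...
shown :: Maybe String
shown = fmap show (Just (42 :: Int))             -- Just "42"

-- ... and has nothing to apply it to in a Nothing.
missing :: Maybe String
missing = fmap show (Nothing :: Maybe Int)       -- Nothing

-- (<$) replaces every element with the same value.
replaced :: String
replaced = 'x' <$ [1 :: Int, 2, 3]               -- "xxx"

-- The default definition of (<$), written out by hand.
replaced' :: String
replaced' = (fmap . const) 'x' [1 :: Int, 2, 3]  -- "xxx"
```

Loading this into GHCi and evaluating each name should show that the structure (list length, Just versus Nothing ) is the same before and after.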

Instances

∗ Recall that [] has two meanings in Haskell: it can either stand for the empty list, or, as here, it can represent the list type constructor (pronounced “list-of”). In other words, the type [a] (list-of- a ) can also be written [] a .

∗ You might ask why we need a separate map function. Why not just do away with the current list-only map function, and rename fmap to map instead? Well, that’s a good question. The usual argument is that someone just learning Haskell, when using map incorrectly, would much rather see an error about lists than about Functor s.

As noted before, the list constructor [] is a functor ∗; we can use the standard list function map to apply a function to each element of a list ∗. The Maybe type constructor is also a functor, representing a container which might hold a single element. The function fmap g has no effect on Nothing (there are no elements to which g can be applied), and simply applies g to the single element inside a Just . Alternatively, under the context interpretation, the list functor represents a context of nondeterministic choice; that is, a list can be thought of as representing a single value which is nondeterministically chosen from among several possibilities (the elements of the list). Likewise, the Maybe functor represents a context with possible failure. These instances are:

    instance Functor [] where
      fmap :: (a -> b) -> [a] -> [b]
      fmap _ []     = []
      fmap g (x:xs) = g x : fmap g xs
      -- or we could just say fmap = map

    instance Functor Maybe where
      fmap :: (a -> b) -> Maybe a -> Maybe b
      fmap _ Nothing  = Nothing
      fmap g (Just a) = Just (g a)

As an aside, in idiomatic Haskell code you will often see the letter f used to stand for both an arbitrary Functor and an arbitrary function. In this document, f represents only Functor s, and g or h always represent functions, but you should be aware of the potential confusion. In practice, what f stands for should always be clear from the context, by noting whether it is part of a type or part of the code.

There are other Functor instances in the standard library as well:

Either e is an instance of Functor ; Either e a represents a container which can contain either a value of type a , or a value of type e (often representing some sort of error condition). It is similar to Maybe in that it represents possible failure, but it can carry some extra information about the failure as well.

((,) e) represents a container which holds an “annotation” of type e along with the actual value it holds. It might be clearer to write it as (e,) , by analogy with an operator section like (1+) , but that syntax is not allowed in types (although it is allowed in expressions with the TupleSections extension enabled). However, you can certainly think of it as (e,) .

((->) e) (which can be thought of as (e ->) ; see above), the type of functions which take a value of type e as a parameter, is a Functor . As a container, (e -> a) represents a (possibly infinite) set of values of a , indexed by values of e . Alternatively, and more usefully, ((->) e) can be thought of as a context in which a value of type e is available to be consulted in a read-only fashion. This is also why ((->) e) is sometimes referred to as the reader monad; more on this later.

IO is a Functor ; a value of type IO a represents a computation producing a value of type a which may have I/O effects. If m computes the value x while producing some I/O effects, then fmap g m will compute the value g x while producing the same I/O effects.

Many standard types from the containers library (such as Tree , Map , and Seq ) are instances of Functor . A notable exception is Set , which cannot be made a Functor in Haskell (although it is certainly a mathematical functor) since it requires an Ord constraint on its elements; fmap must be applicable to any types a and b . However, Set (and other similarly restricted data types) can be made an instance of a suitable generalization of Functor , either by making a and b arguments to the Functor type class themselves, or by adding an associated constraint.
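To make a few of these instances concrete, here is a minimal sketch using only types from base. The value names are made up for the example; the instances themselves are the standard ones.

```haskell
-- Either e: the function is applied to a Right; a Left passes through untouched.
okResult, failed :: Either String Int
okResult = fmap (* 2) (Right 21)             -- Right 42
failed   = fmap (* 2) (Left "parse error")   -- Left "parse error"

-- ((,) e): only the second component is mapped; the annotation is left alone.
annotated :: (String, Int)
annotated = fmap (+ 1) ("count", 9)          -- ("count",10)

-- ((->) e): fmap is function composition, so this is \n -> show (n * 2).
describe :: Int -> String
describe = fmap show (* 2)
```

For instance, describe 21 evaluates to "42" , which matches the reading of ((->) e) as a context in which a value of type e can be consulted.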

Exercises

1. Implement Functor instances for Either e and ((->) e) .

2. Implement Functor instances for ((,) e) and for Pair , defined as

    data Pair a = Pair a a

   Explain their similarities and differences.

3. Implement a Functor instance for the type ITree , defined as

    data ITree a = Leaf (Int -> a)
                 | Node [ITree a]

4. Give an example of a type of kind * -> * which cannot be made an instance of Functor (without using undefined ).

5. Is this statement true or false? The composition of two Functor s is also a Functor . If false, give a counterexample; if true, prove it by exhibiting some appropriate Haskell code.

Laws

As far as the Haskell language itself is concerned, the only requirement to be a Functor is an implementation of fmap with the proper type. Any sensible Functor instance, however, will also satisfy the functor laws, which are part of the definition of a mathematical functor. There are two:

    fmap id = id
    fmap (g . h) = (fmap g) . (fmap h)

∗ Technically, these laws make f and fmap together an endofunctor on Hask, the category of Haskell types (ignoring ⊥, which is a party pooper). See Wikibook: Category theory.

Together, these laws ensure that fmap g does not change the structure of a container, only the elements. Equivalently, and more simply, they ensure that fmap g changes a value without altering its context ∗.

The first law says that mapping the identity function over every item in a container has no effect. The second says that mapping a composition of two functions over every item in a container is the same as first mapping one function, and then mapping the other.
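The two laws can be spot-checked at particular values. The sketch below is not a proof, only a test at a handful of inputs; the helper names identityHolds and compositionHolds are hypothetical and exist only for this example.

```haskell
-- First law, checked at one value: fmap id x == id x.
identityHolds :: (Functor f, Eq (f a)) => f a -> Bool
identityHolds x = fmap id x == id x

-- Second law, checked at one value: fmap (g . h) x == (fmap g . fmap h) x.
compositionHolds :: (Functor f, Eq (f c)) => (b -> c) -> (a -> b) -> f a -> Bool
compositionHolds g h x = fmap (g . h) x == (fmap g . fmap h) x

-- A few sample checks against the standard list and Maybe instances.
spotChecks :: Bool
spotChecks = and
  [ identityHolds [1 :: Int, 2, 3]
  , identityHolds (Just 'c')
  , compositionHolds show (+ 1) [1 :: Int, 2, 3]
  , compositionHolds length show (Just (12345 :: Int))
  ]
```

A property-testing library could run the same checks over many random inputs, but even a few hand-picked values are often enough to expose a law-breaking instance.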

As an example, the following code is a “valid” instance of Functor (it typechecks), but it violates the functor laws. Do you see why?

    -- Evil Functor instance
    instance Functor [] where
      fmap :: (a -> b) -> [a] -> [b]
      fmap _ [] = []
      fmap g (x:xs) = g x : g x : fmap g xs

Any Haskeller worth their salt would reject this code as a gruesome abomination.

Unlike some other type classes we will encounter, a given type has at most one valid instance of Functor . This can be proven via the free theorem for the type of fmap . In fact, GHC can automatically derive Functor instances for many data types.

∗ Actually, if seq / undefined are considered, it is possible to have an implementation which satisfies the first law but not the second. The rest of the comments in this section should be considered in a context where seq and undefined are excluded.

A similar argument also shows that any Functor instance satisfying the first law ( fmap id = id ) will automatically satisfy the second law as well. Practically, this means that only the first law needs to be checked (usually by a very straightforward induction) to ensure that a Functor instance is valid.∗
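As an illustration of the remark that GHC can derive Functor instances, the following sketch derives one for a small tree type and checks the first law at a single value. The Tree type and the value names are invented for this example, and the DeriveFunctor extension is assumed to be available (it is in any reasonably recent GHC).

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- A hypothetical binary tree, used only for this sketch.
data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Show, Eq, Functor)

sample :: Tree Int
sample = Node (Node Leaf 1 Leaf) 2 (Node Leaf 3 Leaf)

-- The derived fmap rebuilds the same shape with new elements.
doubled :: Tree Int
doubled = fmap (* 2) sample

-- The first functor law, checked at one value only.
firstLawAtSample :: Bool
firstLawAtSample = fmap id sample == sample
```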

Exercises

1. Although it is not possible for a Functor instance to satisfy the first Functor law but not the second (excluding undefined ), the reverse is possible. Give an example of a (bogus) Functor instance which satisfies the second law but not the first.

2. Which laws are violated by the evil Functor instance for list shown above: both laws, or the first law alone? Give specific counterexamples.

Intuition

There are two fundamental ways to think about fmap . The first has already been mentioned: it takes two parameters, a function and a container, and applies the function “inside” the container, producing a new container. Alternatively, we can think of fmap as applying a function to a value in a context (without altering the context).

Just like all other Haskell functions of “more than one parameter”, however, fmap is actually curried: it does not really take two parameters, but takes a single parameter and returns a function. For emphasis, we can write fmap ’s type with extra parentheses: fmap :: (a -> b) -> (f a -> f b) . Written in this form, it is apparent that fmap transforms a “normal” function ( g :: a -> b ) into one which operates over containers/contexts ( fmap g :: f a -> f b ). This transformation is often referred to as a lift; fmap “lifts” a function from the “normal world” into the “ f world”.
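The lifting view can be sketched by partially applying fmap and giving the result its own name. The names double and liftedDouble are hypothetical, chosen only for this example.

```haskell
-- An ordinary function in the "normal world" ...
double :: Int -> Int
double = (* 2)

-- ... and its lifted version, which works in the "f world" for any Functor f.
liftedDouble :: Functor f => f Int -> f Int
liftedDouble = fmap double

inList :: [Int]
inList = liftedDouble [1, 2, 3]     -- [2,4,6]

inMaybe :: Maybe Int
inMaybe = liftedDouble (Just 21)    -- Just 42

-- For functions, lifting is composition: \n -> double (n + 1).
inFunction :: Int -> Int
inFunction = liftedDouble (+ 1)
```

The same liftedDouble is used at three different functors; only the type of its argument selects which instance of fmap is meant.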

Utility functions

There are a few more Functor -related functions which can be imported from the Data.Functor module.

(<$>) is defined as a synonym for fmap . This enables a nice infix style that mirrors the ($) operator for function application. For example, f $ 3 applies the function f to 3, whereas f <$> [1,2,3] applies f to each member of the list.

($>) :: Functor f => f a -> b -> f b is just flip (<$) , and can occasionally be useful. To keep them straight, you can remember that (<$) and ($>) point towards the value that will be kept.

void :: Functor f => f a -> f () is a specialization of (<$) , that is, void x = () <$ x . This can be used in cases where a computation computes some value but the value should be ignored.
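A brief sketch of these utility functions in use follows; the value names are invented for the example.

```haskell
import Data.Functor (($>), void)

-- (<$>) is infix fmap.
scaled :: [Int]
scaled = (* 10) <$> [1, 2, 3]              -- [10,20,30]

-- (<$) and ($>) both keep the value the operator points at.
keepLeft :: Maybe String
keepLeft = "done" <$ Just (3 :: Int)       -- Just "done"

keepRight :: Maybe String
keepRight = Just (3 :: Int) $> "done"      -- Just "done"

-- void discards the value but keeps the context.
ignored :: Maybe ()
ignored = void (Just (3 :: Int))           -- Just ()
```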

Further reading

A good starting point for reading about the category theory behind the concept of a functor is the excellent Haskell wikibook page on category theory.

Applicative

A somewhat newer addition to the pantheon of standard Haskell type classes, applicative functors represent an abstraction lying in between Functor and Monad in expressivity, first described by McBride and Paterson. The title of their classic paper, Applicative Programming with Effects, gives a hint at the intended intuition behind the Applicative type class. It encapsulates certain sorts of “effectful” computations in a functionally pure way, and encourages an “applicative” programming style. Exactly what these things mean will be seen later.

Definition

Recall that Functor allows us to lift a “normal” function to a function on computational contexts. But fmap doesn’t allow us to apply a function which is itself in a context to a value in a context. Applicative gives us just such a tool, (<*>) (variously pronounced as "apply", "app", or "splat"). It also provides a method, pure , for embedding values in a default, “effect free” context. Here is the type class declaration for Applicative , as defined in Control.Applicative :

    class Functor f => Applicative f where
      pure  :: a -> f a

      infixl 4 <*>, *>, <*
      (<*>) :: f (a -> b) -> f a -> f b

      (*>) :: f a -> f b -> f b
      a1 *> a2 = (id <$ a1) <*> a2

      (<*) :: f a -> f b -> f a
      (<*) = liftA2 const

Note that every Applicative must also be a Functor . In fact, as we will see, fmap can be implemented using the Applicative methods, so every Applicative is a functor whether we like it or not; the Functor constraint forces us to be honest.

(*>) and (<*) are provided for convenience, in case a particular instance of Applicative can provide more efficient implementations, but they are provided with default implementations. For more on these operators, see the section on Utility functions below.

∗ Recall that ($) is just function application: f $ x = f x .

As always, it’s crucial to understand the type signatures. First, consider (<*>) : the best way of thinking about it comes from noting that the type of (<*>) is similar to the type of ($) ∗, but with everything enclosed in an f . In other words, (<*>) is just function application within a computational context. The type of (<*>) is also very similar to the type of fmap ; the only difference is that the first parameter is f (a -> b) , a function in a context, instead of a “normal” function (a -> b) .

pure takes a value of any type a , and returns a context/container of type f a . The intention is that pure creates some sort of “default” container or “effect free” context. In fact, the behavior of pure is quite constrained by the laws it should satisfy in conjunction with (<*>) . Usually, for a given implementation of (<*>) there is only one possible implementation of pure .

(Note that previous versions of the Typeclassopedia explained pure in terms of a type class Pointed , which can still be found in the pointed package. However, the current consensus is that Pointed is not very useful after all. For a more detailed explanation, see Why not Pointed?)
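To see (<*>) and pure at work, here is a minimal sketch using the standard Maybe instance. The value names are hypothetical.

```haskell
-- A function that is itself inside a Maybe context.
maybeAdd3 :: Maybe (Int -> Int)
maybeAdd3 = Just (+ 3)

-- Function and argument both present: the function is applied.
applied :: Maybe Int
applied = maybeAdd3 <*> Just 4                           -- Just 7

-- A missing argument makes the whole application fail ...
noArgument :: Maybe Int
noArgument = maybeAdd3 <*> Nothing                       -- Nothing

-- ... and so does a missing function.
noFunction :: Maybe Int
noFunction = (Nothing :: Maybe (Int -> Int)) <*> Just 4  -- Nothing

-- pure embeds a plain value in the "effect free" context.
embedded :: Maybe Int
embedded = pure 4                                        -- Just 4
```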

Laws

∗ See haddock for Applicative and Applicative programming with effects

Traditionally, there are four laws that Applicative instances should satisfy ∗. In some sense, they are all concerned with making sure that pure deserves its name:

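The identity law:

    pure id <*> v = v

Homomorphism:

    pure f <*> pure x = pure (f x)

Intuitively, applying a non-effectful function to a non-effectful argument in an effectful context is the same as just applying the function to the argument and then injecting the result into the context with pure .

Interchange:

    u <*> pure y = pure ($ y) <*> u

Composition:

    u <*> (v <*> w) = pure (.) <*> u <*> v <*> w

This law expresses a sort of associativity property of (<*>) . The reader may wish to simply convince themselves that this law is type-correct.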

Considered as left-to-right rewrite rules, the homomorphism, interchange, and composition laws actually constitute an algorithm for transforming any expression using pure and (<*>) into a canonical form with only a single use of pure at the very beginning and only left-nested occurrences of (<*>) . Composition allows reassociating (<*>) ; interchange allows moving occurrences of pure leftwards; and homomorphism allows collapsing multiple adjacent occurrences of pure into one.

There is also a law specifying how Applicative should relate to Functor :

fmap g x = pure g <*> x

It says that mapping a pure function g over a context x is the same as first injecting g into a context with pure , and then applying it to x with (<*>) . In other words, we can decompose fmap into two more atomic operations: injection into a context, and application within a context. Since (<$>) is a synonym for fmap , the above law can also be expressed as:

g <$> x = pure g <*> x .
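The laws above can be spot-checked for a particular instance at particular values. The sketch below does so for Maybe ; it checks single values only and proves nothing in general. The names u , v , w follow the laws as stated, and lawChecks is a made-up name.

```haskell
-- Sample values chosen for this sketch.
u, v :: Maybe (Int -> Int)
u = Just (+ 1)
v = Just (* 2)

w :: Maybe Int
w = Just 10

-- One Boolean per law, each evaluated at the sample values above.
lawChecks :: [Bool]
lawChecks =
  [ (pure id <*> w) == w                                        -- identity
  , (pure (+ 1) <*> pure 10) == (pure ((+ 1) 10) :: Maybe Int)  -- homomorphism
  , (u <*> pure 10) == (pure ($ 10) <*> u)                      -- interchange
  , (u <*> (v <*> w)) == (pure (.) <*> u <*> v <*> w)           -- composition
  , fmap (+ 1) w == (pure (+ 1) <*> w)                          -- relation to fmap
  ]
```

Evaluating and lawChecks in GHCi should give True ; replacing any of the sample values with Nothing is a quick way to check the failing cases too.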

Exercises

1. (Tricky) One might imagine a variant of the interchange law that says something about applying a pure function to an effectful argument. Using the above laws, prove that

    pure f <*> x = pure (flip ($)) <*> x <*> pure f

Instances

Most of the standard types which are instances of Functor are also instances of Applicative .

Maybe can easily be made an instance of Applicative ; writing such an instance is left as an exercise for the reader.

The list type constructor [] can actually be made an instance of Applicative in two ways; essentially, it comes down to whether we want to think of lists as ordered collections of elements, or as contexts representing multiple results of a nondeterministic computation (see Wadler’s How to replace failure by a list of successes).

Let’s first consider the collection point of view. Since there can only be one instance of a given type class for any particular type, one or both of the list instances of Applicative need to be defined for a newtype wrapper; as it happens, the nondeterministic computation instance is the default, and the collection instance is defined in terms of a newtype called ZipList . This instance is:

    newtype ZipList a = ZipList { getZipList :: [a] }

    instance Applicative ZipList where
      pure :: a -> ZipList a
      pure = undefined   -- exercise

      (<*>) :: ZipList (a -> b) -> ZipList a -> ZipList b
      (ZipList gs) <*> (ZipList xs) = ZipList (zipWith ($) gs xs)

To apply a list of functions to a list of inputs with (<*>) , we just match up the functions and inputs elementwise, and produce a list of the resulting outputs. In other words, we “zip” the lists together with function application, ($) ; hence the name ZipList .

The other Applicative instance for lists, based on the nondeterministic computation point of view, is:

    instance Applicative [] where
      pure :: a -> [a]
      pure x = [x]

      (<*>) :: [a -> b] -> [a] -> [b]
      gs <*> xs = [ g x | g <- gs, x <- xs ]

Instead of applying functions to inputs pairwise, we apply each function to all the inputs in turn, and collect all the results in a list.
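The difference between the two list instances is easiest to see side by side. This sketch uses the ZipList newtype exported by Control.Applicative ; the names functions , inputs , allCombinations and pairwise are invented for the example.

```haskell
import Control.Applicative (ZipList (..))

functions :: [Int -> Int]
functions = [(+ 1), (* 2)]

inputs :: [Int]
inputs = [10, 20]

-- Default list instance: every function is applied to every input.
allCombinations :: [Int]
allCombinations = functions <*> inputs                          -- [11,21,20,40]

-- ZipList instance: functions and inputs are matched position by position.
pairwise :: [Int]
pairwise = getZipList (ZipList functions <*> ZipList inputs)    -- [11,40]
```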

Now we can write nondeterministic computations in a natural style. To add the numbers 3 and 4 deterministically, we can of course write (+) 3 4 . But suppose instead of 3 we have a nondeterministic computation that might result in 2 , 3 , or 4 ; then we can write

    pure (+) <*> [2,3,4] <*> pure 4

or, more idiomatically,

    (+) <$> [2,3,4] <*> pure 4

There are several other Applicative instances as well:

IO is an instance of Applicative , and behaves exactly as you would think: to execute m1 <*> m2 , first m1 is executed, resulting in a function f , then m2 is executed, resulting in a value x , and finally the value f x is returned as the result of executing m1 <*> m2 .

((,) a) is an Applicative , as long as a is an instance of Monoid (section Monoid). The a values are accumulated in parallel with the computation.

The Control.Applicative module exports the Const type constructor; a value of type Const a b simply contains an a . This is an instance of Applicative for any Monoid a ; this instance becomes especially useful in conjunction with things like Foldable (section Foldable).

The WrappedMonad and WrappedArrow newtypes make any instances of Monad (section Monad) or Arrow (section Arrow) respectively into instances of Applicative ; as we will see when we study those type classes, both are strictly more expressive than Applicative , in the sense that the Applicative methods can be implemented in terms of their methods.
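The two Monoid -based instances above can be sketched briefly. Both examples use only base ; the names logged and collected are made up for illustration.

```haskell
import Control.Applicative (Const (..))

-- ((,) a): the String annotations are combined with (<>), left to right,
-- while the function is applied to its arguments as usual.
logged :: (String, Int)
logged = ("add; ", (+)) <*> ("three; ", 3) <*> ("four", 4)
-- ("add; three; four",7)

-- Const a: there is no function or argument to apply at all;
-- only the monoidal values are combined.
collected :: [Int]
collected = getConst (Const [1] <*> Const [2, 3])
-- [1,2,3]
```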

Exercises

1. Implement an instance of Applicative for Maybe .

2. Determine the correct definition of pure for the ZipList instance of Applicative ; there is only one implementation that satisfies the law relating pure and (<*>) .

Intuition

McBride and Paterson’s paper introduces the notation $[[g \; x_1 \; x_2 \; \cdots \; x_n]]$ to denote function application in a computational context. If each $x_i$ has type $f \; t_i$ for some applicative functor $f$, and $g$ has type $t_1 \to t_2 \to \dots \to t_n \to t$, then the entire expression $[[g \; x_1 \; \cdots \; x_n]]$ has type $f \; t$. You can think of this as applying a function to multiple “effectful” arguments. In this sense, the double bracket notation is a generalization of fmap , which allows us to apply a function to a single argument in a context.

Why do we need Applicative to implement this generalization of fmap ? Suppose we use fmap to apply g to the first parameter x1 . Then we get something of type f (t2 -> ... t) , but now we are stuck: we can’t apply this function-in-a-context to the next argument with fmap . However, this is precisely what (<*>) allows us to do.

This suggests the proper translation of the idealized notation into Haskell, namely

g <$> x1 <*> x2 <*> ... <*> xn ,

recalling that Control.Applicative defines (<$>) as convenient infix shorthand for fmap . This is what is meant by an “applicative style”—effectful computations can still be described in terms of function application; the only difference is that we have to use the special operator (<*>) for application instead of simple juxtaposition.

Note that pure allows embedding “non-effectful” arguments in the middle of an idiomatic application, like

g <$> x1 <*> pure x2 <*> x3

which has type f d , given

    g  :: a -> b -> c -> d
    x1 :: f a
    x2 :: b
    x3 :: f c
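Here is the same pattern with concrete types, as a sketch. The function volume and the value names are hypothetical; g corresponds to volume , and the middle argument is a plain value embedded with pure .

```haskell
-- A hypothetical three-argument function playing the role of g.
volume :: Int -> Int -> Int -> Int
volume width height depth = width * height * depth

-- Maybe as the context: every effectful argument must be present.
allPresent :: Maybe Int
allPresent = volume <$> Just 2 <*> pure 3 <*> Just 4     -- Just 24

oneMissing :: Maybe Int
oneMissing = volume <$> Just 2 <*> pure 3 <*> Nothing    -- Nothing

-- Lists as the context: one result per combination of choices.
choices :: [Int]
choices = volume <$> [1, 2] <*> pure 3 <*> [10, 100]     -- [30,300,60,600]
```

The expression has the same shape in both contexts; only the Applicative instance chosen by the types changes what "application" means.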

The double brackets are commonly known as “idiom brackets”, because they allow writing “idiomatic” function application, that is, function application that looks normal but has some special, non-standard meaning (determined by the particular instance of Applicative being used). Idiom brackets are not supported by GHC, but they are supported by the Strathclyde Haskell Enhancement, a preprocessor which (among many other things) translates idiom brackets into standard uses of (<$>) and (<*>) . This can result in much more readable code when making heavy use of Applicative .

In addition, as of GHC 8, the ApplicativeDo extension enables g <$> x1 <*> x2 <*> ... <*> xn to be written in a different style:

    do v1 <- x1
       v2 <- x2
       ...
       vn <- xn
       pure (g v1 v2 ... vn)

See the Further Reading section below as well as the discussion of do-notation in the Monad section for more information.

Utility functions

Control.Applicative provides several utility functions that work generically with any Applicative instance.

liftA :: Applicative f => (a -> b) -> f a -> f b . This should be familiar; of course, it is the same as fmap (and hence also the same as (<$>) ), but with a more restrictive type. This probably exists to provide a parallel to liftA2 and liftA3 , but there is no reason you should ever need to use it.

liftA2 :: Applicative f => (a -> b -> c) -> f a -> f b -> f c lifts a 2-argument function to operate in the context of some Applicative . When liftA2 is fully applied, as in liftA2 f arg1 arg2 , it is typically better style to instead use f <$> arg1 <*> arg2 . However, liftA2 can be useful in situations where it is partially applied. For example, one could define a Num instance for Maybe Integer by defining (+) = liftA2 (+) and so on.

There is a liftA3 but no liftAn for larger n .

(*>) :: Applicative f => f a -> f b -> f b sequences the effects of two Applicative computations, but discards the result of the first. For example, if m1, m2 :: Maybe Int , then m1 *> m2 is Nothing whenever either m1 or m2 is Nothing ; but if not, it will have the same value as m2 .

Likewise, (<*) :: Applicative f => f a -> f b -> f a sequences the effects of two computations, but keeps only the result of the first, discarding the result of the second. Just as with (<$) and ($>) , to keep (<*) and (*>) straight, remember that they point towards the values that will be kept.

(<**>) :: Applicative f => f a -> f (a -> b) -> f b is similar to (<*>) , but where the first computation produces value(s) which are provided as input to the function(s) produced by the second computation. Note this is not the same as flip (<*>) , because the effects are performed in the opposite order. This is possible to observe with any Applicative instance with non-commutative effects, such as the instance for lists: (<**>) [1,2] [(+5),(*10)] produces a different result than (flip (<*>)) on the same arguments.

when :: Applicative f => Bool -> f () -> f () (exported, like unless below, from Control.Monad rather than Control.Applicative ) conditionally executes a computation, evaluating to its second argument if the test is True , and to pure () if the test is False .

unless :: Applicative f => Bool -> f () -> f () is like when , but with the test negated.

The guard function is for use with instances of Alternative (an extension of Applicative to incorporate the ideas of failure and choice), which is discussed in the section on Alternative and friends.
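Several of the functions in this list can be tried out on Maybe and on lists. The sketch below uses invented value names; note that when is imported from Control.Monad .

```haskell
import Control.Applicative (liftA2, (<**>))
import Control.Monad (when)

summed :: Maybe Int
summed = liftA2 (+) (Just 3) (Just 4)                -- Just 7

-- (*>) and (<*) keep the result they point at, but run both effects.
keepSecond, keepFirst, shortCircuit :: Maybe Int
keepSecond   = Just 'a' *> Just 2                    -- Just 2
keepFirst    = Just 1 <* Just 'b'                    -- Just 1
shortCircuit = Nothing *> Just 2                     -- Nothing

-- Same arguments, different order of effects.
valuesFirst, functionsFirst :: [Int]
valuesFirst    = [1, 2] <**> [(+ 5), (* 10)]         -- [6,10,7,20]
functionsFirst = flip (<*>) [1, 2] [(+ 5), (* 10)]   -- [6,7,10,20]

-- when runs its action only if the test is True; here the "action" is failure.
failUnless :: Bool -> Maybe ()
failUnless ok = when (not ok) Nothing
```

The last two lists contain the same elements in a different order, which is exactly the observable difference between (<**>) and flip (<*>) for an instance with non-commutative effects.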

Exercises

1. Implement a function

    sequenceAL :: Applicative f => [f a] -> f [a]

   There is a generalized version of this, sequenceA , which works for any Traversable (see the later section on Traversable), but implementing this version specialized to lists is a good exercise.

Alternative formulation

An alternative, equivalent formulation of Applicative is given by

    class Functor f => Monoidal f where
      unit :: f ()
      (**) :: f a -> f b -> f (a,b)

∗ In category-theory speak, we say f is a lax monoidal functor because there aren't necessarily functions in the other direction, like f (a, b) -> (f a, f b) .

Intuitively, this states that a monoidal functor ∗ is one which has some sort of "default shape" and which supports some sort of "combining" operation. pure and (<*>) are equivalent in power to unit and (**) (see the Exercises below). More technically, the idea is that f preserves the "monoidal structure" given by the pairing constructor (,) and unit type () . This can be seen even more clearly if we rewrite the types of unit and (**) as

    unit' :: () -> f ()
    (**') :: (f a, f b) -> f (a, b)

Furthermore, to deserve the name "monoidal" (see the section on Monoids), instances of Monoidal ought to satisfy the following laws, which seem much more straightforward than the traditional Applicative laws:

∗ In this and the following laws, ≅ refers to isomorphism rather than equality. In particular we consider (x,()) ≅ x ≅ ((),x) and ((x,y),z) ≅ (x,(y,z)) .

Left identity ∗ : unit ** v ≅ v

Right identity: u ** unit ≅ u

Associativity: u ** ( v ** w ) ≅ ( u ** v ) ** w

These turn out to be equivalent to the usual Applicative laws. In a category theory setting, one would also require a naturality law:

∗ Here g *** h = \(x,y) -> (g x, h y) . See Arrows.

Naturality: fmap ( g *** h ) ( u ** v ) = fmap g u ** fmap h v

but in the context of Haskell, this is a free theorem.
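To make the Monoidal laws concrete, the following sketch defines the class together with instances for Maybe and lists, and states the three laws as Boolean checks. The isomorphisms are witnessed by fmap snd , fmap fst , and a re-association function; the names leftIdentity , rightIdentity , and associativity are assumptions of this example, and checking a few sample values is of course not a proof.

```haskell
import Prelude hiding ((**))

class Functor f => Monoidal f where
  unit :: f ()
  (**) :: f a -> f b -> f (a, b)

instance Monoidal Maybe where
  unit = Just ()
  Just x ** Just y = Just (x, y)
  _      ** _      = Nothing

instance Monoidal [] where
  unit = [()]
  xs ** ys = [(x, y) | x <- xs, y <- ys]

-- unit ** v ≅ v, using ((), x) ≅ x
leftIdentity :: (Monoidal f, Eq (f a)) => f a -> Bool
leftIdentity v = fmap snd (unit ** v) == v

-- u ** unit ≅ u, using (x, ()) ≅ x
rightIdentity :: (Monoidal f, Eq (f a)) => f a -> Bool
rightIdentity u = fmap fst (u ** unit) == u

-- u ** (v ** w) ≅ (u ** v) ** w, using (x, (y, z)) ≅ ((x, y), z)
associativity :: (Monoidal f, Eq (f ((a, b), c))) => f a -> f b -> f c -> Bool
associativity u v w = fmap assoc (u ** (v ** w)) == (u ** v) ** w
  where
    assoc (x, (y, z)) = ((x, y), z)
```

Note that Prelude's own (**) (floating-point exponentiation) has to be hidden for the class method to use that name.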

Much of this section was taken from a blog post by Edward Z. Yang; see his actual post for a bit more information.

Exercises

1. Implement pure and (<*>) in terms of unit and (**) , and vice versa.

2. Are there any Applicative instances for which there are also functions f () -> () and f (a,b) -> (f a, f b) , satisfying some "reasonable" laws?

3. (Tricky) Prove that given your implementations from the first exercise, the usual Applicative laws and the Monoidal laws stated above are equivalent.

Further reading

McBride and Paterson’s original paper is a treasure-trove of information and examples, as well as some perspectives on the connection between Applicative and category theory. Beginners will find it difficult to make it through the entire paper, but it is extremely well-motivated—even beginners will be able to glean something from reading as far as they are able.

∗ Introduced by an earlier paper that was since superseded by Push-pull functional reactive programming.

Conal Elliott has been one of the biggest proponents of Applicative . For example, the Pan library for functional images and the reactive library for functional reactive programming (FRP) ∗ make key use of it; his blog also contains many examples of Applicative in action. Building on the work of McBride and Paterson, Elliott also built the TypeCompose library, which embodies the observation (among others) that Applicative types are closed under composition; therefore, Applicative instances can often be automatically derived for complex types built out of simpler ones.

Although the Parsec parsing library (paper) was originally designed for use as a monad, in its most common use cases an Applicative instance can be used to great effect; Bryan O’Sullivan’s blog post is a good starting point. If the extra power provided by Monad isn’t needed, it’s usually a good idea to use Applicative instead.

A couple of other nice examples of Applicative in action include the ConfigFile and HSQL libraries and the formlets library.

Gershom Bazerman's post contains many insights into applicatives.

The ApplicativeDo extension is described in this wiki page, and in more detail in this Haskell Symposium paper.

Monad

It’s a safe bet that if you’re reading this, you’ve heard of monads—although it’s quite possible you’ve never heard of Applicative before, or Arrow , or even Monoid . Why are monads such a big deal in Haskell? There are several reasons.

Haskell does, in fact, single out monads for special attention by making them the framework in which to construct I/O operations.

Haskell also singles out monads for special attention by providing a special syntactic sugar for monadic expressions: the do -notation. (As of GHC 8, do -notation can be used with Applicative as well, but the notation is still fundamentally related to monads.)

Monad has been around longer than other abstract models of computation such as Applicative or Arrow .

The more monad tutorials there are, the harder people think monads must be, and the more new monad tutorials are written by people who think they finally “get” monads (the monad tutorial fallacy).

I will let you judge for yourself whether these are good reasons.

In the end, despite all the hoopla, Monad is just another type class. Let’s take a look at its definition.

Definition

As of GHC 7.10, Monad is defined as:

class Applicative m => Monad m where
  return :: a -> m a
  (>>=)  :: m a -> (a -> m b) -> m b
  (>>)   :: m a -> m b -> m b
  m >> n = m >>= \_ -> n

  fail   :: String -> m a

(Prior to GHC 7.10, Applicative was not a superclass of Monad , for historical reasons.)

The Monad type class is exported by the Prelude , along with a few standard instances. However, many utility functions are found in Control.Monad .

Let’s examine the methods in the Monad class one by one. The type of return should look familiar; it’s the same as pure . Indeed, return is pure , but with an unfortunate name. (Unfortunate, since someone coming from an imperative programming background might think that return is like the C or Java keyword of the same name, when in fact the similarities are minimal.) For historical reasons, we still have both names, but they should always denote the same value (although this cannot be enforced). Likewise, (>>) should be the same as (*>) from Applicative . It is possible that return and (>>) may eventually be removed from the Monad class: see the Monad of No Return proposal.

We can see that (>>) is a specialized version of (>>=) , with a default implementation given. It is only included in the type class declaration so that specific instances of Monad can override the default implementation of (>>) with a more efficient one, if desired. Also, note that although _ >> n = n would be a type-correct implementation of (>>) , it would not correspond to the intended semantics: the intention is that m >> n ignores the result of m , but not its effects.
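The difference between ignoring a result and ignoring an effect can be seen with the Prelude instances for Maybe and lists. This is only an illustrative sketch; the binding names are made up for the example.

```haskell
-- The result of the left-hand side is discarded, but its effect is kept.
-- For Maybe the effect is possible failure, so failure still propagates.
keepsEffect :: Maybe Int
keepsEffect = (Nothing :: Maybe String) >> Just 1

-- A successful left-hand side contributes nothing to the result.
discardsResult :: Maybe Int
discardsResult = Just "ignored" >> Just 1

-- For lists the effect is nondeterminism: the right-hand list is repeated
-- once per element on the left, even though the elements are ignored.
listEffect :: [Int]
listEffect = "ab" >> [1, 2]
```

Here keepsEffect is Nothing , discardsResult is Just 1 , and listEffect is [1,2,1,2] . An implementation with _ >> n = n would instead give Just 1 and [1,2] for the first and last.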

The fail function is an awful hack that has no place in the Monad class; more on this later.

The only really interesting thing to look at—and what makes Monad strictly more powerful than Applicative —is (>>=) , which is often called bind.

We could spend a while talking about the intuition behind (>>=) —and we will. But first, let’s look at some examples.

Instances

Even if you don’t understand the intuition behind the Monad class, you can still create instances of it by just seeing where the types lead you. You may be surprised to find that this actually gets you a long way towards understanding the intuition; at the very least, it will give you some concrete examples to play with as you read more about the Monad class in general. The first few examples are from the standard Prelude ; the remaining examples are from the transformers package.

The simplest possible instance of Monad is Identity , which is described in Dan Piponi’s highly recommended blog post on The Trivial Monad. Despite being “trivial”, it is a great introduction to the Monad type class, and contains some good exercises to get your brain working.

The next simplest instance of Monad is Maybe . We already know how to write return / pure for Maybe . So how do we write (>>=) ? Well, let’s think about its type. Specializing for Maybe , we have (>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b . If the first argument to (>>=) is Just x , then we have something of type a (namely, x ), to which we can apply the second argument, resulting in a Maybe b , which is exactly what we wanted. What if the first argument to (>>=) is Nothing ? In that case, we don’t have anything to which we can apply the a -> Maybe b function, so there’s only one thing we can do: yield Nothing . This instance is:

instance Monad Maybe where
  return :: a -> Maybe a
  return = Just

  (>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b
  (Just x) >>= g = g x
  Nothing  >>= _ = Nothing

We can already get a bit of intuition as to what is going on here: if we build up a computation by chaining together a bunch of functions with (>>=) , as soon as any one of them fails, the entire computation will fail (because Nothing >>= f is Nothing , no matter what f is). The entire computation succeeds only if all the constituent functions individually succeed. So the Maybe monad models computations which may fail.

The Monad instance for the list constructor [] is similar to its Applicative instance; see the exercise below.

Of course, the IO constructor is famously a Monad , but its implementation is somewhat magical, and may in fact differ from compiler to compiler. It is worth emphasizing that the IO monad is the only monad which is magical. It allows us to build up, in an entirely pure way, values representing possibly effectful computations. The special value main , of type IO () , is taken by the runtime and actually executed, producing actual effects. Every other monad is functionally pure, and requires no special compiler support. We often speak of monadic values as “effectful computations”, but this is because some monads allow us to write code as if it has side effects, when in fact the monad is hiding the plumbing which allows these apparent side effects to be implemented in a functionally pure way.

As mentioned earlier, ((->) e) is known as the reader monad , since it describes computations in which a value of type e is available as a read-only environment. The Control.Monad.Reader module provides the Reader e a type, which is just a convenient newtype wrapper around (e -> a) , along with an appropriate Monad instance and some Reader -specific utility functions such as ask (retrieve the environment), asks (retrieve a function of the environment), and local (run a subcomputation under a different environment).

The Control.Monad.Writer module provides the Writer monad, which allows information to be collected as a computation progresses. Writer w a is isomorphic to (a,w) , where the output value a is carried along with an annotation or “log” of type w , which must be an instance of Monoid (see section Monoid); the special function tell performs logging.

The Control.Monad.State module provides the State s a type, a newtype wrapper around s -> (a,s) . Something of type State s a represents a stateful computation which produces an a but can access and modify the state of type s along the way. The module also provides State -specific utility functions such as get (read the current state), gets (read a function of the current state), put (overwrite the state), and modify (apply a function to the state).

The Control.Monad.Cont module provides the Cont monad, which represents computations in continuation-passing style. It can be used to suspend and resume computations, and to implement non-local transfers of control, co-routines, and other complex control structures, all in a functionally pure way. Cont has been called the “mother of all monads” because of its universal properties.
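The failure-propagating behaviour of the Maybe monad described above can be sketched with two small partial operations chained by (>>=) . The helper names safeDiv , safeRoot , and divThenRoot are hypothetical and serve only as an illustration.

```haskell
-- Hypothetical helpers: each step may fail.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- Integer square root, rounded down; fails on negative input.
safeRoot :: Int -> Maybe Int
safeRoot n
  | n < 0     = Nothing
  | otherwise = Just (floor (sqrt (fromIntegral n :: Double)))

-- Divide, then take the root; a failure in either step aborts the chain.
divThenRoot :: Int -> Int -> Maybe Int
divThenRoot x y = safeDiv x y >>= safeRoot
```

For example, divThenRoot 32 2 is Just 4 , while divThenRoot 1 0 (division fails) and divThenRoot (-8) 2 (root fails) are both Nothing .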

Exercises

1. Implement a Monad instance for the list constructor, [] . Follow the types!

2. Implement a Monad instance for ((->) e) .

3. Implement Functor and Monad instances for Free f , defined as

data Free f a = Var a
              | Node (f (Free f a))

You may assume that f has a Functor instance. This is known as the free monad built from the functor f .

Intuition

Let’s look more closely at the type of (>>=) . The basic intuition is that it combines two computations into one larger computation. The first argument, m a , is the first computation. However, it would be boring if the second argument were just an m b ; then there would be no way for the computations to interact with one another (actually, this is exactly the situation with Applicative ). So, the second argument to (>>=) has type a -> m b : a function of this type, given a result of the first computation, can produce a second computation to be run. In other words, x >>= k is a computation which runs x , and then uses the result(s) of x to decide what computation to run second, using the output of the second computation as the result of the entire computation.

∗ Actually, because Haskell allows general recursion, one can recursively construct infinite grammars, and hence Applicative (together with Alternative ) is enough to parse any context-sensitive language with a finite alphabet. See Parsing context-sensitive languages with Applicative.

Intuitively, it is this ability to use the output from previous computations to decide what computations to run next that makes Monad more powerful than Applicative . The structure of an Applicative computation is fixed, whereas the structure of a Monad computation can change based on intermediate results. This also means that parsers built using an Applicative interface can only parse context-free languages; in order to parse context-sensitive languages a Monad interface is needed.∗
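The contrast between a fixed and a result-dependent structure can be sketched with the list instances. The function names pairs and repeatEach are made up for this example.

```haskell
-- Applicative: the shape of the result depends only on the shapes of the
-- inputs. There are always (length xs * length ys) pairs.
pairs :: [a] -> [b] -> [(a, b)]
pairs xs ys = (,) <$> xs <*> ys

-- Monad: the second computation is chosen by the result of the first,
-- so each n contributes a different number of elements.
repeatEach :: [Int] -> [Int]
repeatEach ns = ns >>= \n -> replicate n n
```

For instance, repeatEach [1,2,3] is [1,2,2,3,3,3] : the length of each inner computation is decided by a value produced earlier.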

To see the increased power of Monad from a different point of view, let’s see what happens if we try to implement (>>=) in terms of fmap , pure , and (<*>) . We are given a value x of type m a , and a function k of type a -> m b , so the only thing we can do is apply k to x . We can’t apply it directly, of course; we have to use fmap to lift it over the m . But what is the type of fmap k ? Well, it’s m a -> m (m b) . So after we apply it to x , we are left with something of type m (m b) —but now we are stuck; what we really want is an m b , but there’s no way to get there from here. We can add m ’s using pure , but we have no way to collapse multiple m ’s into one.

∗ You might hear some people claim that the definition in terms of return , fmap , and join is the “math definition” and the definition in terms of return and (>>=) is something specific to Haskell. In fact, both definitions were known in the mathematics community long before Haskell picked up monads.

This ability to collapse multiple m ’s is exactly the ability provided by the function join :: m (m a) -> m a , and it should come as no surprise that an alternative definition of Monad can be given in terms of join :

class Applicative m => Monad'' m where
  join :: m (m a) -> m a

In fact, the canonical definition of monads in category theory is in terms of return , fmap , and join (often called η, T, and μ in the mathematical literature). Haskell uses an alternative formulation with (>>=) instead of join since it is more convenient to use ∗. However, sometimes it can be easier to think about Monad instances in terms of join , since it is a more “atomic” operation. (For example, join for the list monad is just concat .)
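A few concrete uses of join from Control.Monad may help; the binding names below are illustrative only.

```haskell
import Control.Monad (join)

-- For lists, join flattens one level of nesting, exactly like concat.
flattenedList :: [Int]
flattenedList = join [[1, 2], [3], []]

-- For Maybe, two layers of Just collapse into one.
flattenedMaybe :: Maybe Int
flattenedMaybe = join (Just (Just 3))

-- A Nothing at either layer collapses to Nothing.
failedOuter, failedInner :: Maybe Int
failedOuter = join Nothing
failedInner = join (Just Nothing)
```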

Exercises

1. Implement (>>=) in terms of fmap (or liftM ) and join .

2. Now implement join and fmap ( liftM ) in terms of (>>=) and return .

Utility functions

The Control.Monad module provides a large number of convenient utility functions, all of which can be implemented in terms of the basic Monad operations ( return and (>>=) in particular). We have already seen one of them, namely, join . We also mention some other noteworthy ones here; implementing these utility functions oneself is a good exercise. For a more detailed guide to these functions, with commentary and example code, see Henk-Jan van Tuyl’s tour.

liftM :: Monad m => (a -> b) -> m a -> m b . This should be familiar; of course, it is just fmap . The fact that we have both fmap and liftM is a consequence of the fact that the Monad type class did not require a Functor instance until recently, even though mathematically speaking, every monad is a functor. If you are using GHC 7.10 or newer, you should avoid using liftM and just use fmap instead.

ap :: Monad m => m (a -> b) -> m a -> m b should also be familiar: it is equivalent to (<*>) , justifying the claim that the Monad interface is strictly more powerful than Applicative . We can make any Monad into an instance of Applicative by setting pure = return and (<*>) = ap .

sequence :: Monad m => [m a] -> m [a] takes a list of computations and combines them into one computation which collects a list of their results. It is again something of a historical accident that sequence has a Monad constraint, since it can actually be implemented only in terms of Applicative (see the exercise at the end of the Utility Functions section for Applicative). Note that the actual type of sequence is more general, and works over any Traversable rather than just lists; see the section on Traversable .

replicateM :: Monad m => Int -> m a -> m [a] is simply a combination of replicate and sequence .

mapM :: Monad m => (a -> m b) -> [a] -> m [b] maps its first argument over the second, and sequences the results. The forM function is just mapM with its arguments reversed; it is called forM since it models generalized for loops: the list [a] provides the loop indices, and the function a -> m b specifies the “body” of the loop for each index. Again, these functions actually work over any Traversable , not just lists, and they can also be defined in terms of Applicative , not Monad : the analogue of mapM for Applicative is called traverse .

(=<<) :: Monad m => (a -> m b) -> m a -> m b is just (>>=) with its arguments reversed; sometimes this direction is more convenient since it corresponds more closely to function application.

(>=>) :: Monad m => (a -> m b) -> (b -> m c) -> a -> m c is sort of like function composition, but with an extra m on the result type of each function, and the arguments swapped. We’ll have more to say about this operation later. There is also a flipped variant, (<=<) .

Many of these functions also have “underscored” variants, such as sequence_ and mapM_ ; these variants throw away the results of the computations passed to them as arguments, using them only for their side effects.

Other monadic functions which are occasionally useful include filterM , zipWithM , foldM , and forever .
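The following sketch exercises several of the utility functions above in the Maybe and list monads. The helper halve and the other top-level names are hypothetical; only the functions imported from Control.Monad and the Prelude are library code.

```haskell
import Control.Monad (filterM, foldM, forM, replicateM, (>=>))

-- Hypothetical effectful function: succeeds only on even numbers.
halve :: Int -> Maybe Int
halve n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

-- sequence collects results, or fails if any computation fails.
allPresent, oneMissing :: Maybe [Int]
allPresent = sequence [Just 1, Just 2, Just 3]
oneMissing = sequence [Just 1, Nothing, Just 3]

-- replicateM in the list monad enumerates every combination.
twoLetterWords :: [String]
twoLetterWords = replicateM 2 "ab"

-- mapM and forM differ only in argument order.
halvedAll, halvedAll' :: Maybe [Int]
halvedAll  = mapM halve [2, 4, 6]
halvedAll' = forM [2, 4, 6] halve

-- (>=>) composes two effectful functions.
quarter :: Int -> Maybe Int
quarter = halve >=> halve

-- filterM in the list monad keeps and drops each element in turn.
powerset :: [a] -> [[a]]
powerset = filterM (const [True, False])

-- foldM threads an accumulator through an effectful step function.
sumNonNegative :: [Int] -> Maybe Int
sumNonNegative = foldM step 0
  where
    step acc x
      | x < 0     = Nothing
      | otherwise = Just (acc + x)
```

For example, twoLetterWords is ["aa","ab","ba","bb"] and powerset [1,2] is [[1,2],[1],[2],[]] .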

Laws

There are several laws that instances of Monad should satisfy (see also the Monad laws wiki page). The standard presentation is:

return a >>= k  =  k a
m >>= return    =  m
m >>= (\x -> k x >>= h)  =  (m >>= k) >>= h

The first and second laws express the fact that return behaves nicely: if we inject a value a into a monadic context with return , and then bind to k , it is the same as just applying k to a in the first place; if we bind a computation m to return , nothing changes. The third law essentially says that (>>=) is associative, sort of.

∗ I like to pronounce this operator “fish”.

However, the presentation of the above laws, especially the third, is marred by the asymmetry of (>>=) . It’s hard to look at the laws and see what they’re really saying. I prefer a much more elegant version of the laws, which is formulated in terms of (>=>) ∗. Recall that (>=>) “composes” two functions of type a -> m b and b -> m c . You can think of something of type a -> m b (roughly) as a function from a to b which may also have some sort of effect in the context corresponding to m . (>=>) lets us compose these “effectful functions”, and we would like to know what properties (>=>) has. The monad laws reformulated in terms of (>=>) are:

return >=> g  =  g
g >=> return  =  g
(g >=> h) >=> k  =  g >=> (h >=> k)

∗ As fans of category theory will note, these laws say precisely that functions of type a -> m b are the arrows of a category with (>=>) as composition! Indeed, this is known as the Kleisli category of the monad m . It will come up again when we discuss Arrows.

Ah, much better! The laws simply state that return is the identity of (>=>) , and that (>=>) is associative ∗.
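The (>=>) form of the laws can be spot-checked at individual inputs for a concrete monad. The sketch below does this for Maybe with three made-up Kleisli arrows; passing these checks for some inputs is evidence, not a proof.

```haskell
import Control.Monad ((>=>))

-- Three hypothetical effectful functions in the Maybe monad.
halve :: Int -> Maybe Int
halve n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

decrement :: Int -> Maybe Int
decrement n
  | n > 0     = Just (n - 1)
  | otherwise = Nothing

double :: Int -> Maybe Int
double n = Just (2 * n)

-- return is a left and right identity for (>=>) ...
leftIdentityAt, rightIdentityAt :: Int -> Bool
leftIdentityAt  x = (return >=> halve) x == halve x
rightIdentityAt x = (halve >=> return) x == halve x

-- ... and (>=>) is associative.
associativityAt :: Int -> Bool
associativityAt x =
  ((halve >=> decrement) >=> double) x == (halve >=> (decrement >=> double)) x
```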

There is also a formulation of the monad laws in terms of fmap , return , and join ; for a discussion of this formulation, see the Haskell wikibook page on category theory.

Exercises

Given the definition g >=> h = \x -> g x >>= h , prove the equivalence of the above laws and the usual monad laws.

do notation

Haskell’s special do notation supports an “imperative style” of programming by providing syntactic sugar for chains of monadic expressions. The genesis of the notation lies in realizing that something like a >>= \x -> b >> c >>= \y -> d can be more readably written by putting successive computations on separate lines:

a >>= \x ->
b >>
c >>= \y ->
d

This emphasizes that the overall computation consists of four computations a , b , c , and d , and that x is bound to the result of a , and y is bound to the result of c ( b , c , and d are allowed to refer to x , and d is allowed to refer to y as well). From here it is not hard to imagine a nicer notation:

do { x <- a
   ;      b
   ; y <- c
   ;      d
   }

(The curly braces and semicolons may optionally be omitted; the Haskell parser uses layout to determine where they should be inserted.) This discussion should make clear that do notation is just syntactic sugar. In fact, do blocks are recursively translated into monad operations (almost) like this:

do e                     →  e
do { e; stmts }          →  e >> do { stmts }
do { v <- e; stmts }     →  e >>= \v -> do { stmts }
do { let decls; stmts }  →  let decls in do { stmts }
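Applying these translation rules by hand to a small do block in the Maybe monad gives the following pair of definitions. This is a sketch of the rules as stated here, not of what GHC literally generates.

```haskell
-- A do block using all four statement forms.
sugared :: Maybe Int
sugared = do
  x <- Just 3
  Just ()
  let y = x * 2
  z <- Just 10
  return (x + y + z)

-- The same computation after applying the rules by hand.
desugared :: Maybe Int
desugared =
  Just 3 >>= \x ->
  Just () >>
  let y = x * 2 in
  Just 10 >>= \z ->
  return (x + y + z)
```

Both definitions evaluate to Just 19 .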

This is not quite the whole story, since v might be a pattern instead of a variable. For example, one can write

do (x:xs) <- foo
   bar x

but what happens if foo is an empty list? Well, remember that ugly fail function in the Monad type class declaration? That’s what happens. See section 3.14 of the Haskell Report for the full details. See also the discussion of MonadPlus and MonadZero in the section on other monoidal classes.
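What a failed pattern match does depends on the monad. As a sketch, here is the same partial pattern in the Maybe and list monads; the function names are illustrative. Recent GHC versions route the failure through MonadFail (see the MonadFail section) rather than through a method of Monad , but the observable behaviour for these two types is the same.

```haskell
-- In Maybe, a failed pattern match in a do binding becomes Nothing.
firstOfMaybe :: Maybe [Int] -> Maybe Int
firstOfMaybe m = do
  (x : _) <- m
  return x

-- In the list monad, elements that fail to match are silently skipped.
heads :: [[Int]] -> [Int]
heads xss = do
  (x : _) <- xss
  return x
```

So firstOfMaybe (Just []) is Nothing , and heads [[1],[],[2,3]] is [1,2] .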

A final note on intuition: do notation plays very strongly to the “computational context” point of view rather than the “container” point of view, since the binding notation x <- m is suggestive of “extracting” a single x from m and doing something with it. But m may represent some sort of a container, such as a list or a tree; the meaning of x <- m is entirely dependent on the implementation of (>>=) . For example, if m is a list, x <- m actually means that x will take on each value from the list in turn.
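To see how the meaning of x <- m changes with the monad, the same do block can be run in two different monads. The function addBoth is a made-up example.

```haskell
-- One do block, polymorphic in the monad.
addBoth :: Monad m => m Int -> m Int -> m Int
addBoth mx my = do
  x <- mx
  y <- my
  return (x + y)

-- In Maybe, x <- mx binds the single value if there is one.
inMaybe :: Maybe Int
inMaybe = addBoth (Just 1) (Just 2)

-- In the list monad, x and y range over every element in turn.
inList :: [Int]
inList = addBoth [1, 2] [10, 20]
```

Here inMaybe is Just 3 , while inList is [11,21,12,22] .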

Sometimes, the full power of Monad is not needed to desugar do -notation. For example,

do x <- foo1
   y <- foo2
   z <- foo3
   return (g x y z)

would normally be desugared to foo1 >>= \x -> foo2 >>= \y -> foo3 >>= \z -> return (g x y z) , but this is equivalent to g <$> foo1 <*> foo2 <*> foo3 . With the ApplicativeDo extension enabled (as of GHC 8.0), GHC tries hard to desugar do -blocks using Applicative operations wherever possible. This can sometimes lead to efficiency gains, even for types which also have Monad instances, since in general Applicative computations may be run in parallel, whereas monadic ones may not. For example, consider

g :: Int -> Int -> M Int

-- These could be expensive
bar, baz :: M Int

foo :: M Int
foo = do
  x <- bar
  y <- baz
  g x y

foo definitely depends on the Monad instance of M , since the effects generated by the whole computation may depend (via g ) on the Int outputs of bar and baz . Nonetheless, with ApplicativeDo enabled, foo can be desugared as

join ( g <$> bar <*> baz )

which may allow bar and baz to be computed in parallel, since they at least do not depend on each other.
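The equivalence of the two forms can be checked for a concrete monad. In the sketch below, M is taken to be Maybe and the bodies of g , bar , and baz are invented purely so that the example runs; the do block is written without ApplicativeDo , so it uses the ordinary monadic desugaring.

```haskell
import Control.Monad (join)

-- Assumption for this example only: M is Maybe.
type M = Maybe

bar, baz :: M Int
bar = Just 6
baz = Just 7

-- The final effect depends on both results, so Monad is really needed.
g :: Int -> Int -> M Int
g x y
  | y == 0    = Nothing
  | otherwise = Just (x * y)

-- Ordinary monadic do block.
fooDo :: M Int
fooDo = do
  x <- bar
  y <- baz
  g x y

-- The form described for ApplicativeDo: combine bar and baz applicatively,
-- then collapse the nested M with join.
fooJoin :: M Int
fooJoin = join (g <$> bar <*> baz)
```

Both fooDo and fooJoin evaluate to Just 42 .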

The ApplicativeDo extension is described in this wiki page, and in more detail in this Haskell Symposium paper.

Further reading

Philip Wadler was the first to propose using monads to structure functional programs. His paper is still a readable introduction to the subject.

∗ All About Monads, Monads as containers, Understanding monads, The Monadic Way, You Could Have Invented Monads! (And Maybe You Already Have.), there’s a monster in my Haskell!, Understanding Monads. For real., Monads in 15 minutes: Backtracking and Maybe, Monads as computation, Practical Monads

There are, of course, numerous monad tutorials of varying quality ∗.

A few of the best include Cale Gibbard’s Monads as containers and Monads as computation; Jeff Newbern’s All About Monads, a comprehensive guide with lots of examples; and Dan Piponi’s You Could Have Invented Monads!, which features great exercises. If you just want to know how to use IO , you could consult the Introduction to IO. Even this is just a sampling; the monad tutorials timeline is a more complete list. (All these monad tutorials have prompted parodies like think of a monad ... as well as other kinds of backlash like Monads! (and Why Monad Tutorials Are All Awful) or Abstraction, intuition, and the “monad tutorial fallacy”.)

Other good monad references which are not necessarily tutorials include Henk-Jan van Tuyl’s tour of the functions in Control.Monad , Dan Piponi’s field guide, Tim Newsham’s What’s a Monad?, and Chris Smith's excellent article Why Do Monads Matter?. There are also many blog posts which have been written on various aspects of monads; a collection of links can be found under Blog articles/Monads.

For help constructing monads from scratch, and for obtaining a "deep embedding" of monad operations suitable for use in, say, compiling a domain-specific language, see Apfelmus's operational package.

One of the quirks of the Monad class and the Haskell type system is that it is not possible to straightforwardly declare Monad instances for types which require a class constraint on their data, even if they are monads from a mathematical point of view. For example, Data.Set requires an Ord constraint on its data, so it cannot be easily made an instance of Monad . A solution to this problem was first described by Eric Kidd, and later made into a library named rmonad by Ganesh Sittampalam and Peter Gavin.

There are many good reasons for eschewing do notation; some have gone so far as to consider it harmful.

Monads can be generalized in various ways; for an exposition of one possibility, see Robert Atkey’s paper on parameterized monads, or Dan Piponi’s Beyond Monads.

For the categorically inclined, monads can be viewed as monoids (From Monoids to Monads) and also as closure operators (Triples and Closure). Derek Elkins’ article in issue 13 of the Monad.Reader contains an exposition of the category-theoretic underpinnings of some of the standard Monad instances, such as State and Cont . Jonathan Hill and Keith Clarke have an early paper explaining the connection between monads as they arise in category theory and as used in functional programming. There is also a web page by Oleg Kiselyov explaining the history of the IO monad.

Links to many more research papers related to monads can be found under Research papers/Monads and arrows.

MonadFail

Some monads support a notion of failure, without necessarily supporting the notion of recovery suggested by MonadPlus , and possibly including a primitive error reporting mechanism. This notion is expressed by the relatively unprincipled MonadFail . When the MonadFailDesugaring language extension is enabled, the fail method from MonadFail is used for pattern match failure in do bindings rather than the traditional fail method of the Monad class. This language change is being implemented because there are many monads, such as Reader , State , Writer , RWST , and Cont that simply do not support a legitimate fail method.

See the MonadFail proposal for more information.

Definition

class Monad m => MonadFail m where
  fail :: String -> m a

Law

fail s >>= m = fail s
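As a sketch of a MonadFail instance with a primitive error-reporting mechanism, consider a hypothetical Result type that records the failure message. This assumes a GHC version in which MonadFail and its fail method are exported from the Prelude (GHC 8.8 or later); the type and function names are made up for the example.

```haskell
-- Hypothetical result type that keeps the error message on failure.
data Result a = Err String | Ok a
  deriving (Eq, Show)

instance Functor Result where
  fmap _ (Err e) = Err e
  fmap f (Ok a)  = Ok (f a)

instance Applicative Result where
  pure = Ok
  Err e <*> _ = Err e
  Ok f  <*> r = fmap f r

instance Monad Result where
  Err e >>= _ = Err e
  Ok a  >>= k = k a

instance MonadFail Result where
  fail = Err

-- The law  fail s >>= m = fail s  at one sample point.
lawHolds :: Bool
lawHolds = (fail "boom" >>= \x -> Ok (x + 1 :: Int)) == Err "boom"

-- A failed pattern match in a do binding calls fail with a
-- compiler-generated message.
firstElem :: Result [Int] -> Result Int
firstElem r = do
  (x : _) <- r
  return x
```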

Monad transformers

One would often like to be able to combine two monads into one: for example, to have stateful, nondeterministic computations ( State + [] ), or computations which may fail and can consult a read-only environment ( Maybe + Reader ), and so on. Unfortunately, monads do not compose as nicely as applicative functors (yet another reason to use Applicative if you don’t need the full power that Monad provides), but some monads can be combined in certain ways.

Standard monad transformers

The transformers library provides a number of standard monad transformers. Each monad transformer adds a particular capability/feature/effect to any existing monad.

IdentityT is the identity transformer, which maps a monad to (something isomorphic to) itself. This may seem useless at first glance, but it is useful for the same reason that the id function is useful -- it can be passed as an argument to things which are parameterized over an arbitrary monad transformer, when you do not actually want any extra capabilities.

StateT adds a read-write state.

ReaderT adds a read-only environment.

WriterT adds a write-only log.

RWST conveniently combines ReaderT , WriterT , and StateT into one.

MaybeT adds the possibility of failure.

ErrorT adds the possibility of failure with an arbitrary type to represent errors.

ListT adds non-determinism (however, see the discussion of ListT below).

ContT adds continuation handling.

For example, StateT s Maybe is an instance of Monad ; computations of type StateT s Maybe a may fail, and have access to a mutable state of type s . Monad transformers can be multiply stacked. One thing to keep in mind while using monad transformers is that the order of composition matters. For example, when a StateT s Maybe a computation fails, the state ceases being updated (indeed, it simply disappears); on the other hand, the state of a MaybeT (State s) a computation may continue to be modified even after the computation has "failed". This may seem backwards, but it is correct. Monad transformers build composite monads “inside out”; MaybeT (State s) a is isomorphic to s -> (Maybe a, s) . (Lambdabot has an indispensable @unmtl command which you can use to “unpack” a monad transformer stack in this way.) Intuitively, the monads become "more fundamental" the further inside the stack you get, and the effects of inner monads "have precedence" over the effects of outer ones. Of course, this is just handwaving, and if you are unsure of the proper order for some monads you wish to combine, there is no substitute for using @unmtl or simply trying out the various options.
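The effect of stacking order can be observed directly. The sketch below uses the transformers package; the two computations and the helper failM are hypothetical. Each increments an Int state and then fails.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Maybe (MaybeT (..))
import Control.Monad.Trans.State (State, StateT, modify, runState, runStateT)

-- StateT Int Maybe a  is  Int -> Maybe (a, Int):
-- on failure the state disappears together with the result.
stateOverMaybe :: StateT Int Maybe ()
stateOverMaybe = do
  modify (+ 1)
  lift Nothing

-- Failure in MaybeT over any base monad.
failM :: Monad m => MaybeT m a
failM = MaybeT (return Nothing)

-- MaybeT (State Int) a  is  Int -> (Maybe a, Int):
-- the state updates made before the failure survive it.
maybeOverState :: MaybeT (State Int) ()
maybeOverState = do
  lift (modify (+ 1))
  failM
  lift (modify (+ 100))  -- skipped, because the MaybeT layer has failed
```

Running these from an initial state of 0, runStateT stateOverMaybe 0 is Nothing , whereas runState (runMaybeT maybeOverState) 0 is (Nothing, 1) .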

Definition and laws

All monad transformers should implement the MonadTrans type class, defined in Control.Monad.Trans.Class :

class MonadTrans t where
  lift :: Monad m => m a -> t m a

It allows arbitrary computations in the base monad m to be “lifted” into computations in the transformed monad t m . (Note that type application associates to the left, just like function application, so t m a = (t m) a .)

lift must satisfy the laws

lift . return = return
lift (m >>= f) = lift m >>= (lift . f)

which intuitively state that lift transforms m a computations into t m a computations in a "sensible" way, which sends the return and (>>=) of m to the return and (>>=) of t m .
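As a quick sanity check, the two laws can be tested on sample values for one particular transformer. The sketch below assumes StateT Int over Maybe from the transformers package; the helper run and the names law1 and law2 are invented here, and checking a couple of samples is of course not a proof.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State (StateT (..))

-- Run a StateT Int Maybe computation from a fixed initial state,
-- so that the two sides of each law can be compared with (==).
run :: StateT Int Maybe a -> Maybe (a, Int)
run m = runStateT m 0

-- lift . return = return, checked on the sample value 'x'
law1 :: Bool
law1 = run (lift (return 'x')) == run (return 'x')

-- lift (m >>= f) = lift m >>= (lift . f), checked on one m and one f
law2 :: Bool
law2 = run (lift (m >>= f)) == run (lift m >>= (lift . f))
  where
    m :: Maybe Int
    m = Just 3
    f :: Int -> Maybe Int
    f x = if x > 0 then Just (x * 2) else Nothing
```

Both law1 and law2 evaluate to True for these samples.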

Exercises

What is the kind of t in the declaration of MonadTrans?

Transformer type classes and "capability" style

There are also type classes (provided by the mtl package) for the operations of each transformer. For example, the MonadState type class provides the state-specific methods get and put, allowing you to conveniently use these methods not only with State, but with any monad which is an instance of MonadState, including MaybeT (State s), StateT s (ReaderT r IO), and so on. Similar type classes exist for Reader, Writer, Cont, IO, and others ∗.

∗ The only problem with this scheme is the quadratic number of instances required as the number of standard monad transformers grows; but as the current set of standard monad transformers seems adequate for most common use cases, this may not be that big of a deal.

These type classes serve two purposes. First, they get rid of (most of) the need for explicitly using lift , giving a type-directed way to automatically determine the right number of calls to lift . Simply writing put will be automatically translated into lift . put , lift . lift . put , or something similar depending on what concrete monad stack you are using.

Second, they give you more flexibility to switch between different concrete monad stacks. For example, if you are writing a state-based algorithm, don't write

foo :: State Int Char
foo = modify (*2) >> return 'x'

but rather

foo :: MonadState Int m => m Char
foo = modify (*2) >> return 'x'

Now, if somewhere down the line you realize you need to introduce the possibility of failure, you might switch from State Int to MaybeT (State Int) . The type of the first version of foo would need to be modified to reflect this change, but the second version of foo can still be used as-is.
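The following sketch shows the second version of foo being run in both stacks without modification. It assumes the mtl and transformers packages are installed; inState and inMaybeState are names invented for the illustration, and the FlexibleContexts pragma is only needed when it is not already enabled by default.

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Control.Monad.State (MonadState, State, modify, runState)
import Control.Monad.Trans.Maybe (MaybeT (..))

-- Works in any monad with the "state capability" for an Int state.
foo :: MonadState Int m => m Char
foo = modify (* 2) >> return 'x'

-- Used at type State Int Char
inState :: (Char, Int)
inState = runState foo 21                     -- ('x', 42)

-- Used at type MaybeT (State Int) Char, with no change to foo
inMaybeState :: (Maybe Char, Int)
inMaybeState = runState (runMaybeT foo) 21    -- (Just 'x', 42)
```

No explicit lift appears anywhere: the MonadState instance for MaybeT supplies it.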

However, this sort of "capability-based" style (e.g. specifying that foo works for any monad with the "state capability") quickly runs into problems when you try to naively scale it up: for example, what if you need to maintain two independent states? A framework for solving this and related problems is described by Schrijvers and Oliveira (Monads, zippers and views: virtualizing the monad stack, ICFP 2011) and is implemented in the Monatron package.

Composing monads

Is the composition of two monads always a monad? As hinted previously, the answer is no.

Since Applicative functors are closed under composition, the problem must lie with join . Indeed, suppose m and n are arbitrary monads; to make a monad out of their composition we would need to be able to implement

join :: m ( n ( m ( n a ))) -> m ( n a )

but it is not clear how this could be done in general. The join method for m is no help, because the two occurrences of m are not next to each other (and likewise for n ).

However, one situation in which it can be done is if n distributes over m , that is, if there is a function

distrib :: n ( m a ) -> m ( n a )

satisfying certain laws. See Jones and Duponcheel (Composing Monads); see also the section on Traversable.

For a much more in-depth discussion and analysis of the failure of monads to be closed under composition, see this question on StackOverflow.

Exercises

Implement join :: M (N (M (N a))) -> M (N a), given distrib :: N (M a) -> M (N a) and assuming M and N are instances of Monad.

Further reading

Much of the monad transformer library (originally mtl , now split between mtl and transformers ), including the Reader , Writer , State , and other monads, as well as the monad transformer framework itself, was inspired by Mark Jones’ classic paper Functional Programming with Overloading and Higher-Order Polymorphism. It’s still very much worth a read—and highly readable—after almost fifteen years.

See Edward Kmett's mailing list message for a description of the history and relationships among monad transformer packages ( mtl , transformers , monads-fd , monads-tf ).

There are two excellent references on monad transformers. Martin Grabmüller’s Monad Transformers Step by Step is a thorough description, with running examples, of how to use monad transformers to elegantly build up computations with various effects. Cale Gibbard’s article on how to use monad transformers is more practical, describing how to structure code using monad transformers to make writing it as painless as possible. Another good starting place for learning about monad transformers is a blog post by Dan Piponi.

The ListT transformer from the transformers package comes with the caveat that ListT m is only a monad when m is commutative, that is, when ma >>= \a -> mb >>= \b -> foo is equivalent to mb >>= \b -> ma >>= \a -> foo (i.e. the order of m 's effects does not matter). For one explanation why, see Dan Piponi's blog post "Why isn't ListT [] a monad". For more examples, as well as a design for a version of ListT which does not have this problem, see ListT done right.

There is an alternative way to compose monads, using coproducts, as described by Lüth and Ghani. This method is interesting but has not (yet?) seen widespread use. For a more recent alternative, see Kiselyov et al.'s Extensible Effects: An Alternative to Monad Transformers.

MonadFix

Note: MonadFix is included here for completeness (and because it is interesting) but seems not to be used much. Skipping this section on a first read-through is perfectly OK (and perhaps even recommended).

do rec notation

The MonadFix class describes monads which support the special fixpoint operation mfix :: (a -> m a) -> m a , which allows the output of monadic computations to be defined via (effectful) recursion. This is supported in GHC by a special “recursive do” notation, enabled by the -XRecursiveDo flag. Within a do block, one may have a nested rec block, like so:

do { x <- foo ; rec { y <- baz ; z <- bar ; bob } ; w <- frob }

Normally (if we had do in place of rec in the above example), y would be in scope in bar and bob but not in baz , and z would be in scope only in bob . With the rec , however, y and z are both in scope in all three of baz , bar , and bob . A rec block is analogous to a let block such as

let { y = baz ; z = bar } in bob

because, in Haskell, every variable bound in a let -block is in scope throughout the entire block. (From this point of view, Haskell's normal do blocks are analogous to Scheme's let* construct.)

What could such a feature be used for? One of the motivating examples given in the original paper describing MonadFix (see below) is encoding circuit descriptions. A line in a do -block such as

x <- gate y z

describes a gate whose input wires are labeled y and z and whose output wire is labeled x . Many (most?) useful circuits, however, involve some sort of feedback loop, making them impossible to write in a normal do -block (since some wire would have to be mentioned as an input before being listed as an output). Using a rec block solves this problem.
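A full circuit simulator is beyond the scope of a short example, but the same kind of feedback can be sketched with two lists that refer to each other. This assumes GHC with the RecursiveDo extension; the name alternating is invented for the illustration, and Maybe is used only because it has a MonadFix instance in base.

```haskell
{-# LANGUAGE RecursiveDo #-}

-- xs mentions ys before ys is bound, which a plain do block would reject.
-- Inside rec, both names are in scope in both statements.
alternating :: Maybe [Int]
alternating = do
  rec xs <- Just (1 : ys)
      ys <- Just (2 : xs)
  return (take 4 xs)     -- Just [1,2,1,2]
```

This works because laziness lets each list be consumed while the other is still being defined, much as a wire can be named before the gate driving it has been described.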

Examples and intuition

Of course, not every monad supports such recursive binding. However, as mentioned above, it suffices to have an implementation of mfix :: (a -> m a) -> m a , satisfying a few laws. Let's try implementing mfix for the Maybe monad. That is, we want to implement a function

maybeFix :: ( a -> Maybe a ) -> Maybe a

Let's think for a moment about the implementation ∗ of the non-monadic fix :: (a -> a) -> a :

fix f = f (fix f)

∗ Actually, fix is implemented slightly differently for efficiency reasons; but the given definition is equivalent and simpler for the present purpose.

Inspired by fix , our first attempt at implementing maybeFix might be something like

maybeFix :: (a -> Maybe a) -> Maybe a
maybeFix f = maybeFix f >>= f

This has the right type. However, something seems wrong: there is nothing in particular here about Maybe ; maybeFix actually has the more general type Monad m => (a -> m a) -> m a . But didn't we just say that not all monads support mfix ?

The answer is that although this implementation of maybeFix has the right type, it does not have the intended semantics. If we think about how (>>=) works for the Maybe monad (by pattern-matching on its first argument to see whether it is Nothing or Just ) we can see that this definition of maybeFix is completely useless: it will just recurse infinitely, trying to decide whether it is going to return Nothing or Just , without ever even so much as a glance in the direction of f .

The trick is to simply assume that maybeFix will return Just , and get on with life!

maybeFix :: (a -> Maybe a) -> Maybe a
maybeFix f = ma
  where ma = f (fromJust ma)

This says that the result of maybeFix is ma , and assuming that ma = Just x , it is defined (recursively) to be equal to f x .

Why is this OK? Isn't fromJust almost as bad as unsafePerformIO ? Well, usually, yes. This is just about the only situation in which it is justified! The interesting thing to note is that maybeFix will never crash -- although it may, of course, fail to terminate. The only way we could get a crash is if we try to evaluate fromJust ma when we know that ma = Nothing . But how could we know ma = Nothing ? Since ma is defined as f (fromJust ma) , it must be that this expression has already been evaluated to Nothing -- in which case there is no reason for us to be evaluating fromJust ma in the first place!

To see this from another point of view, we can consider three possibilities. First, if f outputs Nothing without looking at its argument, then maybeFix f clearly returns Nothing . Second, if f always outputs Just x , where x depends on its argument, then the recursion can proceed usefully: fromJust ma will be able to evaluate to x , thus feeding f 's output back to it as input. Third, if f tries to use its argument to decide whether to output Just or Nothing , then maybeFix f will not terminate: evaluating f 's argument requires evaluating ma to see whether it is Just , which requires evaluating f (fromJust ma) , which requires evaluating ma , ... and so on.
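The three possibilities can be tried out directly. The sketch below repeats the definition of maybeFix so that it is self-contained; case1, case2, and case3 are names invented for the illustration.

```haskell
import Data.Maybe (fromJust)

maybeFix :: (a -> Maybe a) -> Maybe a
maybeFix f = ma
  where ma = f (fromJust ma)

-- First case: f fails without looking at its argument.
case1 :: Maybe Int
case1 = maybeFix (const Nothing)                      -- Nothing

-- Second case: f always returns Just, so its output can be fed back lazily.
case2 :: Maybe [Int]
case2 = take 3 <$> maybeFix (\xs -> Just (1 : xs))    -- Just [1,1,1]

-- Third case: f inspects its argument before choosing Just or Nothing.
-- Evaluating case3 does not terminate, so it is defined here but never used.
case3 :: Maybe [Int]
case3 = maybeFix (\xs -> if null xs then Nothing else Just xs)
```

In case2 the fixed point is an infinite list of ones, and take 3 forces only as much of it as needed.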

There are also instances of MonadFix for lists (which work analogously to the instance for Maybe), for ST, and for IO. The instance for IO is particularly amusing: it creates a new (empty) MVar, immediately reads its contents using unsafeInterleaveIO (which delays the actual reading lazily until the value is needed), uses the contents of the MVar to compute a new value, which it then writes back into the MVar. It almost seems, spookily, that mfix is sending a value back in time to itself through the MVar -- though of course what is really going on is that the reading is delayed just long enough (via unsafeInterleaveIO) to get the process bootstrapped.

Exercises

Implement a MonadFix instance for [].

mdo syntax

The example at the start of this section can also be written

mdo { x <- foo ; y <- baz ; z <- bar ; bob ; w <- frob }

which will be translated into the original example (assuming that, say, bar and bob refer to y). The difference is that mdo will analyze the code in order to find minimal recursive blocks, which will be placed in rec blocks, whereas rec blocks desugar directly into calls to mfix without any further analysis.

Further reading

For more information (such as the precise desugaring rules for rec blocks), see Levent Erkök and John Launchbury's 2002 Haskell workshop paper, A Recursive do for Haskell, or for full details, Levent Erkök’s thesis, Value Recursion in Monadic Computations. (Note, while reading, that MonadFix used to be called MonadRec .) You can also read the GHC user manual section on recursive do-notation.

Semigroup

A semigroup is a set $S$ together with a binary operation $\oplus$ which combines elements from $S$. The $\oplus$ operator is required to be associative (that is, $(a \oplus b) \oplus c = a \oplus (b \oplus c)$, for any $a, b, c$ which are elements of $S$).

For example, the natural numbers under addition form a semigroup: the sum of any two natural numbers is a natural number, and $(a + b) + c = a + (b + c)$ for any natural numbers $a$, $b$, and $c$. The integers under multiplication also form a semigroup, as do the integers (or rationals, or reals) under $\max$ or $\min$, Boolean values under conjunction and disjunction, lists under concatenation, functions from a set to itself under composition ... Semigroups show up all over the place, once you know to look for them.

Definition

As of version 4.9 of the base package (which comes with GHC 8.0), semigroups are defined in the Data.Semigroup module. (If you are working with a previous version of base, or want to write a library that supports previous versions of base, you can use the semigroups package.)

The definition of the Semigroup type class (haddock) is as follows:

class Semigroup a where
  (<>) :: a -> a -> a
  sconcat :: NonEmpty a -> a
  sconcat (a :| as) = go a as where
    go b (c:cs) = b <> go c cs
    go b []     = b
  stimes :: Integral b => b -> a -> a
  stimes = ...

The really important method is (<>) , representing the associative binary operation. The other two methods have default implementations in terms of (<>) , and are included in the type class in case some instances can give more efficient implementations than the default.

sconcat reduces a nonempty list using (<>) . For most instances, this is the same as foldr1 (<>) , but it can be constant-time for idempotent semigroups.

stimes n is equivalent to (but sometimes considerably more efficient than) sconcat . replicate n . Its default definition uses multiplication by doubling (also known as exponentiation by squaring). For many semigroups, this is an important optimization; for some, such as lists, it is terrible and must be overridden.

See the haddock documentation for more information on sconcat and stimes .
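Here is a small sketch of the three methods in use, assuming a version of base in which the Prelude exports Semigroup and (<>) (base 4.11 or later). The Interval type and the names hull, largest, smallest, and repeated are invented for the illustration; Max and Min are the newtype wrappers from Data.Semigroup.

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import Data.Semigroup (Max (..), Min (..), sconcat, stimes)

-- A hypothetical instance: closed intervals, combined by taking the
-- smallest interval that encloses both. Only (<>) needs to be defined.
data Interval = Interval Int Int
  deriving (Eq, Show)

instance Semigroup Interval where
  Interval a b <> Interval c d = Interval (min a c) (max b d)

hull :: Interval
hull = sconcat (Interval 1 2 :| [Interval 5 6, Interval 0 1])  -- Interval 0 6

largest :: Int
largest = getMax (sconcat (Max 3 :| [Max 1, Max 5]))           -- 5

smallest :: Int
smallest = getMin (Min 3 <> Min 1 <> Min 5)                    -- 1

repeated :: String
repeated = stimes (3 :: Int) "ab"                              -- "ababab"
```

Interval relies on the default definitions of sconcat and stimes, which is usually fine for a first version of an instance.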

Laws

The only law is that (<>) must be associative:

( x <> y ) <> z = x <> ( y <> z )

Monoid

Many semigroups have a special element $e$ for which the binary operation $\oplus$ is the identity, that is, $e \oplus x = x \oplus e = x$ for every element $x$. Such a semigroup-with-identity-element is called a monoid.

Definition

The definition of the Monoid type class (defined in Data.Monoid ; haddock) is:

class Monoid a where
  mempty  :: a
  mappend :: a -> a -> a
  mconcat :: [a] -> a
  mconcat = foldr mappend mempty

The mempty value specifies the identity element of the monoid, and mappend is the binary operation. The default definition for mconcat “reduces” a list of elements by combining them all with mappend , using a right fold. It is only in the Monoid class so that specific instances have the option of providing an alternative, more efficient implementation; usually, you can safely ignore mconcat when creating a Monoid instance, since its default definition will work just fine.

The Monoid methods are rather unfortunately named; they are inspired by the list instance of Monoid , where indeed mempty = [] and mappend = (++) , but this is misleading since many monoids have little to do with appending (see these Comments from OCaml Hacker Brian Hurt on the Haskell-cafe mailing list). The situation is made somewhat better by (<>) , which is provided as an alias for mappend .

Note that the (<>) alias for mappend conflicts with the Semigroup method of the same name. For this reason, Data.Semigroup re-exports much of Data.Monoid ; to use semigroups and monoids together, just import Data.Semigroup , and make sure all your types have both Semigroup and Monoid instances (and that (<>) = mappend ).

Laws

Of course, every Monoid instance should actually be a monoid in the mathematical sense, which implies these laws:

mempty `mappend` x = x
x `mappend` mempty = x
(x `mappend` y) `mappend` z = x `mappend` (y `mappend` z)

Instances

There are quite a few interesting Monoid instances defined in Data.Monoid .

[a] is a Monoid, with mempty = [] and mappend = (++). It is not hard to check that (x ++ y) ++ z = x ++ (y ++ z) for any lists x, y, and z, and that the empty list is the identity: [] ++ x = x ++ [] = x.

As noted previously, we can make a monoid out of any numeric type under either addition or multiplication. However, since we can't have two instances for the same type, Data.Monoid provides two newtype wrappers, Sum and Product, with appropriate Monoid instances.

> getSum (mconcat . map Sum $ [1..5])
15
> getProduct (mconcat . map Product $ [1..5])
120

This example code is silly, of course; we could just write sum [1..5] and product [1..5]. Nevertheless, these instances are useful in more generalized settings, as we will see in the section on Foldable.

Any and All are newtype wrappers providing Monoid instances for Bool (under disjunction and conjunction, respectively).

There are three instances for Maybe: a basic instance which lifts a Monoid instance for a to an instance for Maybe a, and two newtype wrappers First and Last for which mappend selects the first (respectively last) non-Nothing item.

Endo a is a newtype wrapper for functions a -> a, which form a monoid under composition.

There are several ways to "lift" Monoid instances to instances with additional structure. We have already seen that an instance for a can be lifted to an instance for Maybe a. There are also tuple instances: if a and b are instances of Monoid, then so is (a,b), using the monoid operations for a and b in the obvious pairwise manner. Finally, if a is a Monoid, then so is the function type e -> a for any e; in particular, g `mappend` h is the function which applies both g and h to its argument and then combines the results using the underlying Monoid instance for a. This can be quite useful and elegant (see example).

The type Ordering = LT | EQ | GT is a Monoid, defined in such a way that mconcat (zipWith compare xs ys) computes the lexicographic ordering of xs and ys (if xs and ys have the same length). In particular, mempty = EQ, and mappend evaluates to its leftmost non-EQ argument (or EQ if both arguments are EQ). This can be used together with the function instance of Monoid to do some clever things (example).

There are also Monoid instances for several standard data structures in the containers library (haddock), including Map, Set, and Sequence.
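A few of the instances listed above can be seen in action in the following sketch. It uses only base (4.11 or later, so that (<>) is in the Prelude); all the top-level names are invented for the illustration.

```haskell
import Data.List (sortBy)
import Data.Monoid (All (..), Any (..), Endo (..), First (..), Last (..))
import Data.Ord (comparing)

allEven :: Bool
allEven = getAll (foldMap (All . even) [2, 4, 6 :: Int])   -- True

anyOdd :: Bool
anyOdd = getAny (foldMap (Any . odd) [2, 4, 6 :: Int])     -- False

-- First and Last pick the first (respectively last) non-Nothing item.
firstJust, lastJust :: Maybe Int
firstJust = getFirst (First (Just 1) <> First Nothing <> First (Just 3))  -- Just 1
lastJust  = getLast  (Last  (Just 1) <> Last  Nothing <> Last  (Just 3))  -- Just 3

-- The basic Maybe instance combines the contained values.
liftedMaybe :: Maybe [Int]
liftedMaybe = Just [1] <> Nothing <> Just [2]              -- Just [1,2]

-- Endo composes functions: (+1) is applied after (*2).
composed :: Int
composed = appEndo (Endo (+ 1) <> Endo (* 2)) 5            -- 11

-- Tuples combine componentwise.
pair :: (String, [Int])
pair = ("ab", [1]) <> ("c", [2])                           -- ("abc",[1,2])

-- The function and Ordering instances together: sort by length,
-- breaking ties with the usual ordering on strings.
byLengthThenAlpha :: [String]
byLengthThenAlpha =
  sortBy (comparing length <> compare) ["bb", "a", "ab", "c"]
  -- ["a","c","ab","bb"]
```

The last definition combines two comparison functions with (<>), which works because both functions of one argument and Ordering are monoids.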

Monoid is also used to enable several other type class instances. As noted previously, we can use Monoid to make ((,) e) an instance of Applicative :

instance Monoid e => Applicative ((,) e) where
  pure :: Monoid e => a -> (e,a)
  pure x = (mempty, x)
  (<*>) :: Monoid e => (e,a -> b) -> (e,a) -> (e,b)
  (u, f) <*> (v, x) = (u `mappend` v, f x)
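This instance already exists in base, so it can be tried out without defining anything. In the sketch below the first components act as a log that is accumulated with mappend; the names logged, noLog, and doubled are invented for the illustration.

```haskell
-- The logs are appended, and the function is applied to the argument.
logged :: (String, Int)
logged = ("got function; ", (+ 1)) <*> ("got argument", 41)
-- ("got function; got argument", 42)

-- pure attaches the empty log.
noLog :: ([String], Int)
noLog = pure 7                                       -- ([], 7)

-- traverse threads the log through a whole list.
doubled :: ([Int], [Int])
doubled = traverse (\x -> ([x], x * 2)) [1, 2, 3]    -- ([1,2,3],[2,4,6])
```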

Monoid can be similarly used to make ((,) e) an instance of Monad as well; this is known as the writer monad. As we’ve already seen, Writer and WriterT are a newtype wrapper and transformer for this monad, respectively.

Monoid also plays a key role in the Foldable type class (see section Foldable).

Further reading

Monoids got a fair bit of attention in 2009, when a blog post by Brian Hurt complained about the fact that the names of many Haskell type classes ( Monoid in particular) are taken from abstract mathematics. This resulted in a long Haskell-cafe thread arguing the point and discussing monoids in general.

However, this was quickly followed by several blog posts about Monoid ∗. First, Dan Piponi wrote a great introductory post, Haskell Monoids and their Uses. This was quickly followed by Heinrich Apfelmus' Monoids and Finger Trees, an accessible exposition of Hinze and Paterson's classic paper on 2-3 finger trees, which makes very clever use of Monoid to implement an elegant and generic data structure. Dan Piponi then wrote two fascinating articles about using Monoids (and finger trees): Fast Incremental Regular Expressions and Beyond Regular Expressions.

∗ May its name live forever.

In a similar vein, David Place’s article on improving Data.Map in order to compute incremental folds (see the Monad Reader issue 11) is also a good example of using Monoid to generalize a data structure.

Some other interesting examples of Monoid use include building elegant list sorting combinators, collecting unstructured information, combining probability distributions, and a brilliant series of posts by Chung-Chieh Shan and Dylan Thurston using Monoids to elegantly solve a difficult combinatorial puzzle (followed by part 2, part 3, part 4).

As unlikely as it sounds, monads can actually be viewed as a sort of monoid, with join playing the role of the binary operation and return the role of the identity; see Dan Piponi’s blog post.

Failure and choice: Alternative, MonadPlus, ArrowPlus

Several classes ( Applicative , Monad , Arrow ) have "monoidal" subclasses, intended to model computations that support "failure" and "choice" (in some appropriate sense).

Definition

The Alternative type class (haddock) is for Applicative functors which also have a monoid structure:

class Applicative f => Alternative f where
  empty :: f a
  (<|>) :: f a -> f a -> f a
  some :: f a -> f [a]
  many :: f a -> f [a]

The basic intuition is that empty represents some sort of "failure", and (<|>) represents a choice between alternatives. (However, this intuition does not fully capture the nuance possible; see the section on Laws below.) Of course, (<|>) should be associative and empty should be the identity element for it. Instances of Alternative must implement empty and (<|>) ; some and many have default implementations but are included in the class since specialized implementations may be more efficient than the default.

The default definitions of some and many are essentially given by

some v = (:) <$> v <*> many v
many v = some v <|> pure []

(though for some reason, in actual fact they are not defined via mutual recursion). The intuition is that both keep running v , collecting its results into a list, until it fails; some v requires v to succeed at least once, whereas many v does not require it to succeed at all. That is, many represents 0 or more repetitions of v , whereas some represents 1 or more repetitions. Note that some and many do not make sense for all instances of Alternative ; they are discussed further below.

Likewise, MonadPlus (haddock) is for Monads with a monoid structure:

class Monad m => MonadPlus m where
  mzero :: m a
  mplus :: m a -> m a -> m a

Finally, ArrowZero and ArrowPlus (haddock) represent Arrows (see below) with a monoid structure:

class Arrow arr => ArrowZero arr where
  zeroArrow :: b `arr` c

class ArrowZero arr => ArrowPlus arr where
  (<+>) :: (b `arr` c) -> (b `arr` c) -> (b `arr` c)

Instances

Although this document typically discusses laws before presenting example instances, for Alternative and friends it is worth doing things the other way around, because there is some controversy over the laws and it helps to have some concrete examples in mind when discussing them. We mostly focus on Alternative in this section and the next; now that Applicative is a superclass of Monad , there is little reason to use MonadPlus any longer, and ArrowPlus is rather obscure.

Maybe is an instance of Alternative , where empty is Nothing and the choice operator (<|>) results in its first argument when it is Just , and otherwise results in its second argument. Hence folding over a list of Maybe with (<|>) (which can be done with asum from Data.Foldable ) results in the first non- Nothing value in the list (or Nothing if there are none).

[] is an instance, with empty given by the empty list, and (<|>) equal to (++). It is worth pointing out that this is identical to the Monoid instance for [a], whereas the Alternative and Monoid instances for Maybe are different: the Monoid instance for Maybe a requires a Monoid instance for a, and monoidally combines the contained values when presented with two Justs.
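The behaviour of these two instances, and the contrast with the Monoid instance for Maybe, can be checked with a few expressions. This is a minimal sketch using only base; the top-level names are invented for the illustration.

```haskell
import Control.Applicative ((<|>))
import Data.Foldable (asum)

-- asum folds a list with (<|>), so it finds the first Just.
firstHit, noHit :: Maybe Int
firstHit = asum [Nothing, Just 1, Nothing, Just 2]   -- Just 1
noHit    = asum [Nothing, Nothing]                   -- Nothing

-- For lists, (<|>) is just (++).
listChoice :: [Int]
listChoice = [1, 2] <|> [3]                          -- [1,2,3]

-- Alternative keeps the first Just; Monoid combines the contents.
viaAlternative, viaMonoid :: Maybe [Int]
viaAlternative = Just [1] <|> Just [2]               -- Just [1]
viaMonoid      = Just [1] <>  Just [2]               -- Just [1,2]
```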

Let's think about the behavior of some and many for Maybe and [] . For Maybe , we have some Nothing = (:) <$> Nothing <*> many Nothing = Nothing <*> many Nothing = Nothing . Hence we also have many Nothing = some Nothing <|> pure [] = Nothing <|> pure [] = pure [] = Just [] . Boring. But what about applying some and many to Just ? In fact, some (Just a) and many (Just a) are both bottom! The problem is that since Just a is always "successful", the recursion will never terminate. In theory the result "should be" the infinite list [a,a,a,...] but it cannot even start producing any elements of this list, because there is no way for the (<*>) operator to yield any output until it knows that the result of the call to many will be Just .

You can work out the behavior for [] yourself, but it ends up being quite similar: some and many yield boring results when applied to the empty list, and yield bottom when applied to any non-empty list.

In the end, some and many really only make sense when used with some sort of "stateful" Applicative instance, for which an action v , when run multiple times, can succeed some finite number of times and then fail. For example, parsers have this behavior, and indeed, parsers were the original motivating example for the some and many methods; more on this below.

Since GHC 8.0 (that is, base-4.9 ), there is an instance of Alternative for IO . empty throws an I/O exception, and (<|>) works by first running its left-hand argument; if the left-hand argument throws an I/O exception, (<|>) catches the exception and then calls its second argument. (Note that other types of exceptions are not caught.) There are other, much better ways to handle I/O errors, but this is a quick and dirty way that may work for simple, one-off programs, such as expressions typed at the GHCi prompt. For example, if you want to read the contents of a file but use some default contents in case the file does not exist, you can just write readFile "somefile.txt" <|> return "default file contents" .

Concurrently from the async package has an Alternative instance, for which c1 <|> c2 races c1 and c2 in parallel, and returns the result of whichever finishes first. empty corresponds to the action that runs forever without returning a value.

Practically any parser type (e.g. from parsec , megaparsec , trifecta , ...) has an Alternative instance, where empty is an unconditional parse failure, and (<|>) is left-biased choice. That is, p1 <|> p2 first tries parsing with p1 , and if p1 fails then it tries p2 instead.

some and many work particularly well with parser types having an Applicative instance: if p is a parser, then some p parses one or more consecutive occurrences of p (i.e. it will parse as many occurrences of p as possible and then stop), and many p parses zero or more occurrences.
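To see some and many doing something useful, here is a sketch of a deliberately tiny parser type. It is not the definition used by parsec or any other library: Parser, runParser, and satisfy are made up for the illustration, there is no error reporting, and some and many are left at their default definitions.

```haskell
import Control.Applicative (Alternative (..))
import Data.Char (isDigit)

-- A parser either fails or returns a result plus the remaining input.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> case p s of
    Nothing        -> Nothing
    Just (a, rest) -> Just (f a, rest)

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> case pf s of
    Nothing        -> Nothing
    Just (f, rest) -> case pa rest of
      Nothing         -> Nothing
      Just (a, rest') -> Just (f a, rest')

instance Alternative Parser where
  empty = Parser (const Nothing)            -- unconditional failure
  Parser p <|> Parser q = Parser $ \s -> case p s of
    Nothing -> q s                          -- left-biased choice
    result  -> result

-- Accept one character satisfying the predicate.
satisfy :: (Char -> Bool) -> Parser Char
satisfy ok = Parser $ \s -> case s of
  (c : rest) | ok c -> Just (c, rest)
  _                 -> Nothing

oneOrMore, zeroOrMore, noneFound :: Maybe (String, String)
oneOrMore  = runParser (some (satisfy isDigit)) "123abc"  -- Just ("123","abc")
zeroOrMore = runParser (many (satisfy isDigit)) "abc"     -- Just ("","abc")
noneFound  = runParser (some (satisfy isDigit)) "abc"     -- Nothing
```

Unlike Just a, the parser satisfy isDigit eventually fails as the input is consumed, which is what lets the recursion in some and many bottom out.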

Laws

Of course, instances of Alternative should satisfy the monoid laws

empty <|> x = x
x <|> empty = x
(x <|> y) <|> z = x <|> (y <|> z)

The documentation for some and many states that they should be the "least solution" (i.e. least in the definedness partial order) to their characterizing, mutually recursive default definitions. However, this is controversial, and probably wasn't really thought out very carefully.

Since Alternative is a subclass of Applicative , a natural question is, "how should empty and (<|>) interact with (<*>) and pure ?"

Almost everyone agrees on the left zero law (though see the discussion of the right zero law below):

empty <*> f = empty

After this is where it starts to get a bit hairy though. It turns out there are several other laws one might imagine adding, and different instances satisfy different laws.

Right Zero: Another obvious law would be

f <*> empty = empty

This law is satisfied by most instances; however, it is not satisfied by IO. Once the effects in f have been executed, there is no way to roll them back if we later encounter an exception.

Now consider the Backwards applicative transformer from the transformers package. If f is Applicative, then so is Backwards f; it works the same way but performs the actions of the arguments to (<*>) in the reverse order. There is also an instance Alternative f => Alternative (Backwards f). If some f (such as IO) satisfies left zero but not right zero, then Backwards f satisfies right zero but not left zero! So even the left zero law is suspect. The point is that given the existence of Backwards we cannot privilege one direction or the other.





Left Distribution:

(a <|> b) <*> c = (a <*> c) <|> (b <*> c)

This distributivity law is satisfied by [] and Maybe, as you may verify. However, it is not satisfied by IO or most parsers. The reason is that a and b can have effects which influence execution of c, and the left-hand side may end up failing where the right-hand side succeeds.

For example, consider IO, and suppose that a always executes successfully, but c throws an I/O exception after a has run. Concretely, say, a might ensure that a certain file does not exist (deleting it if it does exist or doing nothing if it does not), and then c tries to read that file. In that case (a <|> b) <*> c will first delete the file, ignoring b since a is successful, and then throw an exception when c tries to read the file. On the other hand, b might ensure that the same file in question does exist. In that case (a <*> c) <|> (b <*> c) would succeed: after (a <*> c) throws an exception, it would be caught by (<|>), and then (b <*> c) would be tried.

This law does not hold for parsers for a similar reason: (a <|> b) <*> c has to "commit" to parsing with a or b before running c, whereas (a <*> c) <|> (b <*> c) allows backtracking if a <*> c fails. In the particular case that a succeeds but c fails after a but not after b, these may give different results. For example, suppose a and c both expect to see two asterisks, but b expects to see only one. If there are only three asterisks in the input, b <*> c will be successful whereas a <*> c will not.

Right Distribution:

a <*> (b <|> c) = (a <*> b) <|> (a <*> c)

This law is not satisfied by very many instances, but it's still worth discussing. In particular the law is still satisfied by Maybe. However, it is not satisfied by, for example, lists. The problem is that the results come out in a different order. For example, suppose a = [(+1), (*10)], b = [2], and c = [3]. Then the left-hand side yields [3,4,20,30], whereas the right-hand side is [3,20,4,30].

IO does not satisfy it either, since, for example, a may succeed only the second time it is executed.

Parsers, on the other hand, may or may not satisfy this law, depending on how they handle backtracking. Parsers for which (<|>) by itself does full backtracking will satisfy the law; but for many parser combinator libraries this is not the case, for efficiency reasons. For example, parsec fails this law: if a succeeds while consuming some input, and afterwards b fails without consuming any input, then the left-hand side may succeed while the right-hand side fails: once (a <*> b) has failed after consuming input, parsec's (<|>) does not backtrack over the input that a consumed, so (a <*> c) never gets a chance to succeed.

Left Catch: (pure a) <|> x = pure a

Intuitively, this law states that pure should always represent a "successful" computation. It is satisfied by Maybe, IO, and parsers. However, it is not satisfied by lists, since lists collect all possible results: it corresponds to [a] ++ x == [a], which is obviously false.
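As a quick illustration, the left-hand side of Left Catch can be evaluated at Maybe and at lists. The helper name leftCatch is made up for this sketch.

```haskell
import Control.Applicative (Alternative (..))

-- Left-hand side of the Left Catch law: (pure a) <|> x
leftCatch :: Alternative f => a -> f a -> f a
leftCatch a x = pure a <|> x

maybeResult :: Maybe Int
maybeResult = leftCatch 1 (Just 2)

listResult :: [Int]
listResult = leftCatch 1 [2, 3]

main :: IO ()
main = do
  print maybeResult  -- Just 1, which equals pure 1, so the law holds here
  print listResult   -- [1,2,3], which differs from pure 1 = [1]
```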

This, then, is the situation: we have a lot of instances of Alternative (and MonadPlus ), with each instance satisfying some subset of these laws. Moreover, it's not always the same subset, so there is no obvious "default" set of laws to choose. For now at least, we just have to live with the situation. When using a particular instance of Alternative or MonadPlus , it's worth thinking carefully about which laws it satisfies.

Utility functions

There are a few Alternative -specific utility functions worth mentioning:

guard :: Alternative f => Bool -> f () checks the given condition, and evaluates to pure () if the condition holds, and empty if not. This can be used to create a conditional failure point in the middle of a computation, where the computation only proceeds if a certain condition holds.
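A minimal sketch of guard as a conditional failure point, once in Maybe and once in the list monad. The functions safeDiv and triples are illustrative names chosen here; guard itself is imported from Control.Monad.

```haskell
import Control.Monad (guard)

-- Division that fails (with Nothing) instead of dividing by zero.
safeDiv :: Int -> Int -> Maybe Int
safeDiv x y = do
  guard (y /= 0)        -- empty (Nothing) when y is zero
  pure (x `div` y)

-- Pythagorean triples with sides up to n; guard prunes non-solutions.
triples :: Int -> [(Int, Int, Int)]
triples n = do
  a <- [1 .. n]
  b <- [a .. n]
  c <- [b .. n]
  guard (a * a + b * b == c * c)  -- empty ([]) discards this candidate
  pure (a, b, c)

main :: IO ()
main = do
  print (safeDiv 10 2)  -- Just 5
  print (safeDiv 10 0)  -- Nothing
  print (triples 13)    -- [(3,4,5),(5,12,13),(6,8,10)]
```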

optional :: Alternative f => f a -> f (Maybe a) reifies potential failure into the Maybe type: that is, optional x is a computation which always succeeds, returning Nothing if x fails and Just a if x successfully results in a. It is useful, for example, in the context of parsers, where it corresponds to a production which can occur zero or one times.
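The following sketch shows optional first at Maybe and then in a parser, using the ReadP parser from base as one convenient Alternative instance. The names signedInt and parseInt are hypothetical, introduced only for this example.

```haskell
import Control.Applicative (optional)
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP (ReadP, char, eof, munch1, readP_to_S)

-- At Maybe, optional turns failure into a successful Nothing.
presentCase, absentCase :: Maybe (Maybe Int)
presentCase = optional (Just 3)
absentCase  = optional Nothing

-- An integer with a minus sign that may occur zero or one times.
signedInt :: ReadP Int
signedInt = do
  sign   <- optional (char '-')
  digits <- munch1 isDigit
  eof
  pure (maybe id (const negate) sign (read digits))

-- Run the parser, accepting only a single complete parse.
parseInt :: String -> Maybe Int
parseInt s = case readP_to_S signedInt s of
  [(n, "")] -> Just n
  _         -> Nothing

main :: IO ()
main = do
  print presentCase       -- Just (Just 3)
  print absentCase        -- Just Nothing
  print (parseInt "-12")  -- Just (-12)
  print (parseInt "12")   -- Just 12
  print (parseInt "x")    -- Nothing
```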

Further reading

There used to be a type class called MonadZero containing only mzero , representing monads with failure. The do -notation requires some notion of failure to deal with failing pattern matches. Unfortunately, MonadZero was scrapped in favor of adding the fail method to the Monad class. If we are lucky, someday MonadZero will be restored, and fail will be banished to the bit bucket where it belongs (see MonadPlus reform proposal). The idea is that any do -block which uses pattern matching (and hence may fail) would require a MonadZero constraint; otherwise, only a Monad constraint would be required.

A great introduction to the MonadPlus type class, with interesting examples of its use, is Doug Auclair’s MonadPlus: What a Super Monad! in the Monad.Reader issue 11.

Another interesting use of MonadPlus can be found in Christiansen et al., All Sorts of Permutations, from ICFP 2016.

The logict package defines a type with prominent Alternative and MonadPlus instances that can be used to efficiently enumerate possibilities subject to constraints, i.e. logic programming; it's like the list monad on steroids.

Foldable

The Foldable class, defined in the Data.Foldable module (haddock), abstracts over containers which can be “folded” into a summary value. This allows such folding operations to be written in a container-agnostic way.
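As a small sketch of what "container-agnostic" means here, the function below sums the elements of any Foldable container. The name total is invented for this example; it relies only on foldMap and the Sum monoid from base.

```haskell
import Data.Monoid (Sum (..))

-- Sum the elements of any Foldable container of numbers.
total :: (Foldable t, Num a) => t a -> a
total = getSum . foldMap Sum

main :: IO ()
main = do
  print (total [1, 2, 3 :: Int])        -- 6
  print (total (Just 5 :: Maybe Int))   -- 5
  print (total (Nothing :: Maybe Int))  -- 0
```

The same definition works unchanged for lists and for Maybe, since both are instances of Foldable.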

Definition

The definition of the Foldable type class is:

class Foldable t where fold :: Monoid m => t m -> m foldMap :: Monoid m => ( a -> m ) -> t a -> m foldr :: ( a -> b -> b ) -> b -> t a -> 