Monad

class Monad m where
    return :: a -> m a
    (>>=)  :: m a -> (a -> m b) -> m b

class Monad m => MonadPlus m where
    mzero :: m a
    mplus :: m a -> m a -> m a


It is often claimed that mplus must be associative and mzero must be its left and right unit. That is, supposedly mplus and mzero should satisfy the same laws as list concatenation and the empty list -- that is, the monoid laws. It should be stressed and stressed again that the Haskell Report says nothing about the laws of MonadPlus. Absent any other authority, the status of the MonadPlus laws is undetermined.
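To make the claimed laws concrete, here is a minimal sketch that states the three monoid laws as Boolean checks on sample values. The names leftUnit, rightUnit, assoc and listChecks are hypothetical, introduced only for this illustration. The checks use the standard list instance, where mzero is [] and mplus is (++). Passing on a few samples is of course not a proof.

```haskell
import Control.Monad (MonadPlus (..))

-- Left-unit law: mzero `mplus` m is the same as m.
leftUnit :: (MonadPlus m, Eq (m a)) => m a -> Bool
leftUnit m = (mzero `mplus` m) == m

-- Right-unit law: m `mplus` mzero is the same as m.
rightUnit :: (MonadPlus m, Eq (m a)) => m a -> Bool
rightUnit m = (m `mplus` mzero) == m

-- Associativity law for mplus.
assoc :: (MonadPlus m, Eq (m a)) => m a -> m a -> m a -> Bool
assoc m1 m2 m3 =
  (m1 `mplus` (m2 `mplus` m3)) == ((m1 `mplus` m2) `mplus` m3)

-- For lists, all three checks hold on these samples.
listChecks :: Bool
listChecks = and
  [ leftUnit  [1, 2 :: Int]
  , rightUnit [1, 2 :: Int]
  , assoc [1 :: Int] [2] [3]
  ]
```

Here "the same" is read as (==) on fully evaluated finite values, which is only one of the possible readings of an equational law.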

Do we want the monoid laws for MonadPlus? The answer is surprisingly complicated. In a short and confusing form, it may be reasonable to use the commutative monoid laws to contemplate non-determinism and to approximate the result of a program. If one objects to the commutativity of mplus, the associativity should not be a requirement either. This article shows simple examples of this complex issue. Along the way, we discuss why we may want equational laws to start with.

An equational law is a statement of equality: for example, the left-unit monoid law for MonadPlus states that mzero `mplus` m and m must be `the same'. This may mean that given the particular definitions of mzero and mplus, the expression mzero `mplus` m reduces to m for any m. Such a fine-grained, intensional view of equality is most helpful when writing implementations and proofs. Let's take the following simple model of non-deterministic computation:

data NDet a = Fail | One a | Choice (NDet a) (NDet a)

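Here One a is a computation that deterministically produces a, Fail is a failed computation, and Choice is a non-deterministic choice between two computations. We take mzero to be Fail and define mplus so that the left-unit law holds by construction: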

instance MonadPlus NDet where
    mzero = Fail

    -- mzero is the left unit of mplus by definition
    Fail `mplus` m           = m
    -- nested choices are re-associated to the right
    (Choice m1 m2) `mplus` m = Choice m1 (m2 `mplus` m)
    (One v) `mplus` m        = Choice (One v) m

One does not have to take equational laws too literally. For example, with the following, even more lucid instance

instance MonadPlus NDet where
    mzero = Fail
    mplus = Choice

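mzero `mplus` m is Choice Fail m, which is structurally different from m. Still, we may regard mzero `mplus` m and m as the same so long as no observation can tell them apart. For NDet, we observe return v `mplus` m as the result v, and mzero as a failure. Likewise, return v1 `mplus` (return v2 `mplus` m) is observed as v1 rather than v2. The observation function for NDet is run: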

run :: NDet a -> a
run Fail                      = error "failure"
run (One x)                   = x      -- the intended meaning of One
run (Choice (One x) m)        = x      -- return v `mplus` m is observed as v
run (Choice Fail m)           = run m  -- mzero `mplus` m is observed as m
-- nested choices are re-associated to the right before being observed
run (Choice (Choice m1 m2) m) = run (Choice m1 (Choice m2 m))

To recap, in this implementation, mzero `mplus` m does not reduce to m. Likewise, m1 `mplus` (m2 `mplus` m3), which is Choice m1 (Choice m2 m3), is a structurally different NDet value than (m1 `mplus` m2) `mplus` m3, which is Choice (Choice m1 m2) m3. Still, as programs, mzero `mplus` m and m produce the same results. Furthermore, replacing the former expression with the latter, as part of any larger expression, preserves the result of the overall program. (That claim, while intuitively true for our implementation, is very hard to prove formally.) The same can be said for m1 `mplus` (m2 `mplus` m3) and (m1 `mplus` m2) `mplus` m3. This preservation of the results, the observational equivalence, lets a compiler, a programmer or a tool optimize mzero `mplus` m to m. Deriving and justifying optimizations is another compelling application of equational laws.
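The difference between structural and observational equality can be checked on samples. The sketch below completes the instance with mplus = Choice under stated assumptions: the text gives neither a Monad instance for NDet nor the Functor, Applicative and Alternative instances that current GHC requires, so they are filled in here in the most direct way. The derived Eq instance and the names m, lhs, rhs and the four Boolean checks are likewise hypothetical.

```haskell
import Control.Applicative (Alternative (..))
import Control.Monad (MonadPlus (..), ap, liftM)

data NDet a = Fail | One a | Choice (NDet a) (NDet a)
  deriving (Eq, Show)   -- derived Eq is structural equality (an assumption)

instance Functor NDet where
  fmap = liftM

instance Applicative NDet where
  pure  = One
  (<*>) = ap

-- Assumed bind: substitute the continuation at every leaf of the tree.
instance Monad NDet where
  Fail       >>= _ = Fail
  One a      >>= k = k a
  Choice l r >>= k = Choice (l >>= k) (r >>= k)

instance Alternative NDet where
  empty = Fail
  (<|>) = Choice

-- mzero and mplus default to empty and (<|>), that is, Fail and Choice.
instance MonadPlus NDet

-- The observation function from the text.
run :: NDet a -> a
run Fail                      = error "failure"
run (One x)                   = x
run (Choice (One x) _)        = x
run (Choice Fail m)           = run m
run (Choice (Choice m1 m2) m) = run (Choice m1 (Choice m2 m))

m :: NDet Char
m = return 'a' `mplus` return 'b'

lhs, rhs :: NDet Int
lhs = return 1 `mplus` (return 2 `mplus` return 3)
rhs = (return 1 `mplus` return 2) `mplus` return 3

unitDiffersStructurally, unitSameObservation :: Bool
unitDiffersStructurally = (mzero `mplus` m) /= m
unitSameObservation     = run (mzero `mplus` m) == run m

assocDiffersStructurally, assocSameObservation :: Bool
assocDiffersStructurally = lhs /= rhs
assocSameObservation     = run lhs == run rhs
```

All four checks evaluate to True on these finite samples. They illustrate the observational equivalence but do not prove it for arbitrary contexts.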

Finally, the equational laws can be used to predict and understand the behavior of a program without regard to any particular implementation. Let us consider the programs ones and main:

ones = return 1 `mplus` ones

main = ones >>= \x -> if x == 1 then mzero else return x

Since return v `mplus` m is observed as v, the program ones is observed as 1. For main, we reason equationally:

main = ones >>= \x -> if x == 1 then mzero else return x
  === {- inline ones -}
(return 1 `mplus` ones) >>= \x -> if x == 1 then mzero else return x
  === {- distributivity of mplus over bind -}
(return 1 >>= \x -> if x == 1 then mzero else return x)
  `mplus` (ones >>= \x -> if x == 1 then mzero else return x)
  === {- the second clause of mplus is just main -}
(return 1 >>= \x -> if x == 1 then mzero else return x) `mplus` main
  === {- monad law -}
mzero `mplus` main
  === {- monoid law -}
main

That chain of equational re-writes brought us nowhere. Clearly, no other chain does any better: there is no way to convert main either to the form return v `mplus` something or to mzero. Our intuition agrees: ones keeps offering 1, which main rejects.

Now consider onestoo and the program maintoo :

onestoo = ones `mplus` return 2

maintoo = onestoo >>= \x -> if x == 1 then mzero else return x

If mplus is associative, onestoo can be re-written as follows:

onestoo = ones `mplus` return 2
  === {- inlining ones -}
(return 1 `mplus` ones) `mplus` return 2
  === {- associativity -}
return 1 `mplus` (ones `mplus` return 2)
  ===
return 1 `mplus` onestoo

That is, onestoo satisfies the same equation as ones, and associativity alone never brings return 2 to the front. Absent any other laws, we must regard maintoo as just as divergent as main. This time the conclusion goes against our intuition. We would like to interpret m `mplus` return 2 as a non-deterministic choice that includes 2. There is therefore a chance that maintoo may produce that result. If MonadPlus is meant to represent non-determinism and to implement non-deterministic search, a search procedure ought to be complete: if it is possible that maintoo produces 2, that result should be found sooner or later. A MonadPlus instance with an associative mplus cannot be an implementation of a complete search procedure, unless we assume some sort of commutativity.
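The tension between associativity and completeness can be seen on the NDet model. The sketch below is an illustration under assumptions, not part of the original development: it passes the choice operator as an explicit argument so that both definitions from the text can be compared, it assumes the obvious bind for NDet (called bindN here), and it adds a hypothetical breadth-first observer bfs with a step budget so that it always terminates. The names Plus, plainPlus and assocPlus are likewise made up for the example.

```haskell
data NDet a = Fail | One a | Choice (NDet a) (NDet a)

-- Assumed bind for NDet: apply the continuation at every leaf.
bindN :: NDet a -> (a -> NDet b) -> NDet b
bindN Fail         _ = Fail
bindN (One a)      k = k a
bindN (Choice l r) k = Choice (bindN l k) (bindN r k)

type Plus a = NDet a -> NDet a -> NDet a

-- mplus = Choice: not structurally associative.
plainPlus :: Plus a
plainPlus = Choice

-- mplus that re-associates nested choices to the right.
assocPlus :: Plus a
assocPlus Fail           m = m
assocPlus (Choice m1 m2) m = Choice m1 (assocPlus m2 m)
assocPlus (One v)        m = Choice (One v) m

-- maintoo from the text, parameterized by the choice operator.
maintoo :: Plus Int -> NDet Int
maintoo plus = onestoo `bindN` \x -> if x == 1 then Fail else One x
  where
    ones    = One 1 `plus` ones
    onestoo = ones `plus` One 2

-- Breadth-first observation that gives up after a number of steps.
bfs :: Int -> NDet a -> Maybe a
bfs fuel t = go fuel [t]
  where
    go 0 _                = Nothing            -- budget exhausted
    go _ []               = Nothing            -- nothing left to explore
    go n (Fail : q)       = go (n - 1) q
    go _ (One x : _)      = Just x
    go n (Choice l r : q) = go (n - 1) (q ++ [l, r])
```

With plainPlus the tree keeps One 2 as the right child of the root, so bfs 1000 (maintoo plainPlus) finds it. With assocPlus the tree for onestoo is an infinite right-nested spine of One 1 leaves, the One 2 leaf is never reached, and bfs 1000 (maintoo assocPlus) runs out of budget.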

To derive the result 2 in our maintoo example, we have to posit additional laws for mplus: at the very least, m `mplus` return v === return v `mplus` m. One can even take mplus to be commutative for arbitrary arguments. A commutative and associative mplus agrees with our intuitions of non-deterministic choice. But it violently disagrees with the notion of observation used so far: return v `mplus` something is to be observed as resulting in v. With a commutative mplus, return 1 `mplus` return 2 is to be equal to return 2 `mplus` return 1, yet the two are observed as 1 and 2 respectively. It is not reasonable to treat 1 and 2 as the same observation. We have to re-define observation and treat return v `mplus` m as a program that might produce the result v in one particular run. The program may finish with some other result. Intuitive as it is, such a notion of observation is weak, feeling like a typical disclaimer clause: the program we sold you may do nothing at all. It feels, well, non-deterministic.
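The two notions of observation can be contrasted on a small example. The sketch below uses the standard list instance of MonadPlus as a stand-in, which is an assumption of this illustration rather than the model used in the text. The names a, b, firstResult and possibleResults are hypothetical.

```haskell
import Control.Monad (MonadPlus (..))
import Data.List (sort)

a, b :: [Int]
a = return 1 `mplus` return 2
b = return 2 `mplus` return 1

-- Strong observation: what the program actually returns first.
firstResult :: [x] -> Maybe x
firstResult []      = Nothing
firstResult (x : _) = Just x

-- Weak observation: the collection of results the program might return,
-- with the order of choices forgotten.
possibleResults :: Ord x => [x] -> [x]
possibleResults = sort
```

Under firstResult the two programs differ (Just 1 versus Just 2), so mplus is not commutative. Under possibleResults they agree, which is the sense in which the commutative law can still serve as a specification.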

It is still reasonable to assume the commutative monoid laws as an over-broad specification. If a MonadPlus implementation produces one of the possibly many answers predicted by the specification, the implementation should be deemed correct. We have to be explicit, however, about such an interpretation. To avoid confusion with the common notion of observation -- what the program actually returns -- we should not be saying that a MonadPlus instance is required to be associative and commutative. A program m1 `mplus` (m2 `mplus` m3) may still print different results from (m1 `mplus` m2) `mplus` m3.