Introduction

This introduction is presented by means of examples rather than theory, and assumes a little knowledge of Haskell.

Maybe

> data Maybe a = Just a | Nothing

> f :: a -> Maybe b

Example

> doQuery :: Query -> DB -> Maybe Record

To perform a sequence of queries (where the results of one query form part of the next query), the programmer has to explicitly check for failure after each query:

> r :: Maybe Record
> r = case doQuery q1 db of
>       Nothing -> Nothing
>       Just r1 -> case doQuery (q2 r1) db of
>                    Nothing -> Nothing
>                    Just r2 -> case doQuery (q3 r2) db of
>                                 Nothing -> Nothing
>                                 Just r3 -> ...

> thenMB :: Maybe a -> (a -> Maybe b) -> Maybe b
> mB `thenMB` f = case mB of
>                   Nothing -> Nothing
>                   Just a  -> f a

> r :: Maybe Record
> r = doQuery q1 db `thenMB` \r1 ->
>     doQuery (q2 r1) db `thenMB` \r2 ->
>     doQuery (q3 r2) db `thenMB` ....
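To make the fragments above concrete, here is a minimal self-contained sketch. The text never defines DB, Query or Record, so the representations below (an association list, a query that is just a key, a record that is just an Int) are assumptions chosen for illustration, as are the names sampleDB, chained and broken.

```haskell
-- Toy stand-ins for the types the text leaves abstract (assumptions).
type Record = Int
type Query  = Int              -- a query is just the key to look up
type DB     = [(Int, Record)]  -- association list from key to record

-- A query fails with Nothing when the key is absent.
doQuery :: Query -> DB -> Maybe Record
doQuery = lookup

-- The combinator from the text: stop at the first Nothing.
thenMB :: Maybe a -> (a -> Maybe b) -> Maybe b
mB `thenMB` f = case mB of
                  Nothing -> Nothing
                  Just a  -> f a

sampleDB :: DB
sampleDB = [(1, 2), (2, 3), (3, 42)]

-- Each result is used as the key of the next query: 1 -> 2 -> 3 -> 42.
chained :: Maybe Record
chained = doQuery 1 sampleDB `thenMB` \r1 ->
          doQuery r1 sampleDB `thenMB` \r2 ->
          doQuery r2 sampleDB

-- Starting from key 3 yields 42, which is not a key, so the chain fails.
broken :: Maybe Record
broken = doQuery 3 sampleDB `thenMB` \r1 ->
         doQuery r1 sampleDB `thenMB` \r2 ->
         doQuery r2 sampleDB
```

Loaded into an interpreter, chained evaluates to Just 42 and broken to Nothing. In broken the third query is never attempted, because thenMB returns Nothing as soon as the second one fails.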

State

> type StateT s a = s -> (a,s)

Back to the (slightly contrived) Example...

> addRec :: Record -> DB -> (Bool,DB)
> delRec :: Record -> DB -> (Bool,DB)

> addRec :: Record -> StateT DB Bool
> delRec :: Record -> StateT DB Bool

> newDB :: StateT DB Bool
> newDB db = let (bool1,db1) = addRec rec1 db
>                (bool2,db2) = addRec rec2 db1
>                (bool3,db3) = delRec rec3 db2
>            in (bool1 && bool2 && bool3,db3)

Learning from the experience of Maybe, the wise programmer will likewise define a combinator to sequence together a series of state transformers:

> thenST :: StateT s a -> (a -> StateT s b) -> StateT s b
> st `thenST` f = \s -> let (v,s') = st s
>                       in f v s'

> returnST :: a -> StateT s a
> returnST a = \s -> (a,s)

> newDB :: StateT DB Bool
> newDB = addRec rec1 `thenST` \bool1 ->
>         addRec rec2 `thenST` \bool2 ->
>         delRec rec3 `thenST` \bool3 ->
>         returnST (bool1 && bool2 && bool3)
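The state-passing example can be run end to end once DB and Record are given concrete representations. The sketch below assumes a record is an Int and a DB is a list of records; those choices, and the success rules for addRec and delRec, are invented for illustration and are not part of the original example.

```haskell
-- The state transformer type and combinators from the text.
type StateT s a = s -> (a, s)

thenST :: StateT s a -> (a -> StateT s b) -> StateT s b
st `thenST` f = \s -> let (v, s') = st s
                      in  f v s'

returnST :: a -> StateT s a
returnST a = \s -> (a, s)

-- Toy stand-ins (assumptions): a record is an Int, a DB is a list of them.
type Record = Int
type DB     = [Record]

-- Adding succeeds only if the record is not already present.
addRec :: Record -> StateT DB Bool
addRec r db
  | r `elem` db = (False, db)
  | otherwise   = (True, r : db)

-- Deleting succeeds only if the record is present.
delRec :: Record -> StateT DB Bool
delRec r db
  | r `elem` db = (True, filter (/= r) db)
  | otherwise   = (False, db)

-- The DB is threaded through by thenST and is never named here.
newDB :: StateT DB Bool
newDB = addRec 1 `thenST` \bool1 ->
        addRec 2 `thenST` \bool2 ->
        delRec 3 `thenST` \bool3 ->
        returnST (bool1 && bool2 && bool3)
```

Applying newDB to the initial database [3] gives (True,[2,1]). Applying it to the empty database gives (False,[2,1]), because the deletion of record 3 reports failure while the two additions still take effect.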

The style of programming shown above, which uses combinators to manage parameter passing or computational flow, is a powerful technique for structuring code and results in clearer programs. In fact, the same ideas can be used for many other computational idioms:

Data structures: lists, trees, sets.

Computational flow: Maybe, error reporting, non-determinism.

Value passing: StateT, environment variables, output generation.

Interaction with external state: IO, GUI programming, foreign language interfaces.

More exotic stuff: parsing combinators, concurrency, mutable data structures.

Where's the Monad?

Luckily, it was realised that all these examples correspond to the mathematical notion of a monad. For our purposes, a monad is a triple consisting of a type together with then and return operators defined over it, such that the following laws hold:

return a `then` f                === f a
m `then` return                  === m
m `then` (\a -> f a `then` h)    === (m `then` f) `then` h
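As a sanity check, the three laws can be tested for the Maybe combinators on a handful of sample values. This is only a sketch: checking samples is not a proof, and the helper names returnMB, halve, decrement and lawsHoldOnSamples are invented here.

```haskell
-- The Maybe combinators, with `then` spelled thenMB and return spelled returnMB.
thenMB :: Maybe a -> (a -> Maybe b) -> Maybe b
mB `thenMB` f = case mB of
                  Nothing -> Nothing
                  Just a  -> f a

returnMB :: a -> Maybe a
returnMB = Just

-- Two sample functions that can fail (hypothetical, for illustration only).
halve :: Int -> Maybe Int
halve n | even n    = Just (n `div` 2)
        | otherwise = Nothing

decrement :: Int -> Maybe Int
decrement n | n > 0     = Just (n - 1)
            | otherwise = Nothing

-- return a `then` f === f a
leftIdentity :: Int -> Bool
leftIdentity a = (returnMB a `thenMB` halve) == halve a

-- m `then` return === m
rightIdentity :: Maybe Int -> Bool
rightIdentity m = (m `thenMB` returnMB) == m

-- m `then` (\a -> f a `then` h) === (m `then` f) `then` h
associativity :: Maybe Int -> Bool
associativity m =
  (m `thenMB` (\a -> halve a `thenMB` decrement))
    == ((m `thenMB` halve) `thenMB` decrement)

-- True when all three laws hold on the chosen samples.
lawsHoldOnSamples :: Bool
lawsHoldOnSamples =
  all leftIdentity [0 .. 10]
    && all rightIdentity samples
    && all associativity samples
  where samples = Nothing : map Just [0 .. 10]
```

lawsHoldOnSamples evaluates to True. A combinator that broke one of the laws, for example a thenMB that always returned Nothing, would make it False.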

The Monad Class

> class Monad m where
>   (>>=)  :: m a -> (a -> m b) -> m b
>   (>>)   :: m a -> m b -> m b
>   return :: a -> m a
>
>   m >> k = m >>= \_ -> k

Now, any type with combinators that obey the above laws can be made an instance of the Monad class. In the case of Maybe this is

> instance Monad Maybe where
>   (>>=)    = thenMB
>   return a = Just a

Technical Notes:

Haskell does not allow class instances for type synonyms, so we'd need to re-define StateT using a data declaration, and alter the definitions of thenST and returnST to accommodate the type constructor.

For all instances of the Monad class, the above laws must hold. However, Haskell compilers have no way of enforcing this. Therefore, there is a programmer proof obligation when declaring new instances.
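A minimal sketch of the re-definition described in the first note follows. It uses a newtype rather than a data declaration, and the names StateM, runStateM, tick and threeTicks are invented for this example. Note also an assumption about the compiler: current versions of GHC require Functor and Applicative instances before a Monad instance is accepted, which the class as shown in this tutorial does not.

```haskell
-- A wrapped version of the StateT synonym, so that it can be a class instance.
newtype StateM s a = StateM { runStateM :: s -> (a, s) }

-- Current GHC requires Functor and Applicative instances before Monad.
instance Functor (StateM s) where
  fmap g (StateM st) = StateM $ \s -> let (a, s') = st s
                                      in  (g a, s')

instance Applicative (StateM s) where
  -- pure plays the role of returnST.
  pure a = StateM $ \s -> (a, s)
  StateM sf <*> StateM sa = StateM $ \s ->
    let (g, s')  = sf s
        (a, s'') = sa s'
    in  (g a, s'')

instance Monad (StateM s) where
  -- (>>=) plays the role of thenST, unwrapping and re-wrapping the constructor.
  StateM st >>= f = StateM $ \s ->
    let (v, s') = st s
    in  runStateM (f v) s'

-- A small state transformer: return the counter and increment it.
tick :: StateM Int Int
tick = StateM $ \n -> (n, n + 1)

threeTicks :: StateM Int [Int]
threeTicks = do a <- tick
                b <- tick
                c <- tick
                return [a, b, c]
```

Evaluating runStateM threeTicks 10 gives ([10,11,12],13): each tick sees the state left behind by the previous one, without the state ever being passed explicitly.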

As all monads now have a common notation, combinators that operate over all monads can now be defined. The prelude contains a few, and there are more in the standard library module 'Monad'. An example from the prelude is sequence: it takes a list of monadic computations, executes each one in turn and returns the list of their results. Using the combinators of the Monad class, it can be defined as follows.

> sequence :: Monad m => [m a] -> m [a]
> sequence [] = return []
> sequence (c:cs) = c >>= \x ->
>                   sequence cs >>= \xs ->
>                   return (x:xs)
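Because the definition mentions only >>= and return, the same code behaves differently in each monad. The sketch below copies the definition under the name sequence' (renamed only to avoid clashing with the Prelude) and tries it in the Maybe and list monads; the example values are invented.

```haskell
-- Same definition as in the text, renamed to avoid clashing with the Prelude.
sequence' :: Monad m => [m a] -> m [a]
sequence' []     = return []
sequence' (c:cs) = c >>= \x ->
                   sequence' cs >>= \xs ->
                   return (x:xs)

-- In Maybe, every computation must succeed for the whole list to succeed.
allPresent :: Maybe [Int]
allPresent = sequence' [Just 1, Just 2, Just 3]

-- A single Nothing makes the whole result Nothing.
oneMissing :: Maybe [Int]
oneMissing = sequence' [Just 1, Nothing, Just 3]

-- In the list monad, the result enumerates every combination of choices.
combinations :: [[Int]]
combinations = sequence' [[1, 2], [3, 4]]
```

Here allPresent is Just [1,2,3], oneMissing is Nothing, and combinations is [[1,3],[1,4],[2,3],[2,4]].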

Do notation

> accumulate :: Monad m => [m a] -> m [a]
> accumulate [] = return []
> accumulate (c:cs) = do x <- c
>                        xs <- accumulate cs
>                        return (x:xs)

The do keyword introduces a series of monadic computations, delimited by the offside rule, as with other Haskell blocks such as let expressions.

<- is used to bind a variable to the result of a monadic computation (instead of using >>= ).

Computations whose results are ignored are simply set in line with the offside rule (no need for >> ).

The do statement must finish with a return or a monadic computation (not a <- ).
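The rules above can be seen by writing one small computation twice, once with do and once with the combinators it stands for. This is a sketch in the Maybe monad; safeDiv, withDo and withBind are names made up for the example.

```haskell
-- Division that fails on a zero divisor (hypothetical helper).
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- Written with do notation.
withDo :: Int -> Int -> Maybe Int
withDo x y = do q <- safeDiv x y   -- bind the result with <-
                safeDiv 100 q      -- result ignored, but a failure still counts
                return (q + 1)     -- the last statement is a computation, not a <-

-- The same computation written with the Monad combinators directly.
withBind :: Int -> Int -> Maybe Int
withBind x y = safeDiv x y >>= \q ->
               safeDiv 100 q >>
               return (q + 1)
```

Both versions give Just 6 for the arguments 10 and 2. For 1 and 2 both give Nothing, since the quotient is 0 and the ignored middle computation then divides by zero.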

Further Monadic Classes

Prelude

Monad

Monadic IO

> getChar :: FileHandle -> Char

One solution to this is to pass the state explicitly to getChar and return a new state. Generalising, we can represent the state of the entire world by a type World. The function would now have type:

> getChar :: FileHandle -> World -> (Char,World)

A solution taken by some languages (such as Clean) is to extend the type system to ensure that World values are only used once; this approach is known as uniqueness types.

However, Haskell solves the single-use problem using monads. Notice that getChar is simply a state transformer (over the state of the external world) and can be rewritten as

> getChar :: FileHandle -> StateT World Char

> data IO a = IO (StateT World a)

> getChar :: FileHandle -> IO Char

NB: The type IO () is used to denote an IO action that returns no interesting result, i.e. is only important for its side-effects. An example is the dual of getChar:

> putChar :: Char -> IO ()
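The idea of IO as a state transformer over the world can be imitated with a purely functional toy. The sketch below is a model only: the real IO type is abstract and its World is not an ordinary Haskell value. Every name here (World, ToyIO, getCharT, putCharT, swapTwo) is hypothetical, and the file handle argument is dropped to keep things short.

```haskell
-- A toy world: the input still to be read and the output written so far.
data World = World { pendingInput :: String, outputSoFar :: String }

-- A toy IO type: a wrapped state transformer over World.
newtype ToyIO a = ToyIO { runToyIO :: World -> (a, World) }

thenIO :: ToyIO a -> (a -> ToyIO b) -> ToyIO b
ToyIO st `thenIO` f = ToyIO $ \w -> let (v, w') = st w
                                    in  runToyIO (f v) w'

returnIO :: a -> ToyIO a
returnIO a = ToyIO $ \w -> (a, w)

-- Reading consumes one character of input. End of input is modelled
-- as '\0' purely to keep the sketch short.
getCharT :: ToyIO Char
getCharT = ToyIO $ \w -> case pendingInput w of
  []     -> ('\0', w)
  (c:cs) -> (c, w { pendingInput = cs })

-- Writing appends one character to the output and returns ().
putCharT :: Char -> ToyIO ()
putCharT c = ToyIO $ \w -> ((), w { outputSoFar = outputSoFar w ++ [c] })

-- Read two characters and write them back in reverse order.
swapTwo :: ToyIO ()
swapTwo = getCharT `thenIO` \a ->
          getCharT `thenIO` \b ->
          putCharT b `thenIO` \_ ->
          putCharT a
```

Running swapTwo on a world whose pending input is "abc" leaves "c" unread and "ba" in the output. Because code written with thenIO never gets hold of a World value directly, it cannot reuse an old world, which is the single-use property the text is after.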

Programming in the IO Monad

As well as IO functions which operate on a per-character basis (as is typical in imperative languages) there are powerful functions such as:

> getContents :: IO String
> readFile :: FilePath -> IO String
> writeFile :: FilePath -> String -> IO ()

This is a nice way to get input into a program while still keeping the majority of the program monad-free. A simple example is the UNIX utility wc, a simplified version of which is:

> -- Haskell 98 exported getArgs from the module System;
> -- current compilers provide it in System.Environment.
> import System.Environment (getArgs)
>
> main :: IO ()
> main = do args <- getArgs
>           case args of
>             [fname] -> do fstr <- readFile fname
>                           let nWords = (length . words) fstr
>                               nLines = (length . lines) fstr
>                               nChars = length fstr
>                           (putStrLn . unwords) [show nLines, show nWords
>                                                , show nChars, fname]
>             _       -> putStrLn "usage: wc fname"

Notes

getArgs :: IO [String] returns the list of commandline arguments (like C's argv ).

We then check that the commandline is valid (i.e. contains one item, a filename). If it isn't we print a message and terminate. Otherwise the file is opened lazily by readFile and the statistics calculated.

The statistics are produced using length :: [a] -> Int to count the length of the lists of words, lines and characters in the file. The first two are produced using the Prelude functions words :: String -> [String] and lines :: String -> [String].

The statistics are formatted using show :: Show a => a -> String and unwords :: [String] -> String before being output by putStrLn.

Notice the use of nested do blocks, the interaction of layout between case and do statements, and the do notation's version of a let expression (whose extent is defined by layout, so doesn't require a terminating in ).

Observe that we must perform the getArgs action and get a result to pass to the case statement. Doing something like case getArgs of .. won't work: getArgs has type IO [String] , not [String] . As IO is an ADT, we can't pattern match against values of this type.

This solution is inefficient: it traverses the file string five times. A more efficient implementation would calculate the statistics simultaneously, either using a hand-coded recursive function, or by using foldr.
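One way to calculate the statistics simultaneously is sketched below. It swaps in a strict left fold (foldl' from Data.List) in place of the foldr mentioned above, and the names Counts, step and wcStats are invented for the example. Lines are counted as newline characters, so the line count differs from (length . lines) when the last line has no trailing newline.

```haskell
import Data.Char (isSpace)
import Data.List (foldl')

-- Running totals: lines, words, characters, and whether the previous
-- character was part of a word.
data Counts = Counts !Int !Int !Int !Bool

-- Update the totals for one more character.
step :: Counts -> Char -> Counts
step (Counts ls ws cs inWord) c =
  Counts (if c == '\n' then ls + 1 else ls)
         (if startsWord then ws + 1 else ws)
         (cs + 1)
         (not space)
  where space      = isSpace c
        startsWord = not space && not inWord

-- Returns (lines, words, characters) in a single traversal of the input.
wcStats :: String -> (Int, Int, Int)
wcStats str = (ls, ws, cs)
  where Counts ls ws cs _ = foldl' step (Counts 0 0 0 False) str
```

For example, wcStats "hello world\nfoo\n" is (2,3,16). In the program above, the three separate length computations could be replaced by one call to wcStats on fstr.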

Summary and Further Reading

There's a wealth of publications available about monads. However, much of it is aimed at a different audience than 'Joe Programmer'. Here's a quick summary of some web-accessible documentation:

Theory: Philip Wadler is one of the guys who started this whole monad business. Blame him ;). He maintains a comprehensive list of his monad-based publications. You may find some of these hard-going: they get advanced quickly, and the notation used varies from that commonly used in Haskell. However, they are certainly worth a look.

Monadic Parser Combinators: Graham Hutton and Erik Meijer have published a paper on this topic which serves as a tutorial to the subject, and also describes in general the use of monads to structure functional programs. Libraries of the parser combinators are also provided. Definitely one to read.

The rest: Simon Peyton Jones has a fine list of papers published by himself and colleagues. Especially relevant are the sections on foreign language integration, monads, state & concurrency, and graphical user interfaces.

Thanks to Steve Messick, Joe English, Richard Watson, Lazlo Nemeth, Alain Van Kern and Dean Harrington for their comments.