Someone I met at BayHac e-mailed me asking for some help with state. He was generating an HTML page and wanted to be able to number all the figures consecutively across the whole page. I ended up writing a little tutorial code for him on threading state, both directly and with the State monad, and thought I’d share it here.

The Haskell source in this post is available as a single file, which you can load into ghci and play with. Play along while you read!



For an example case, we’ll take a simpler problem: Consider lists of items that are either plain strings, or strings that will need a sequence number:

data SimpleItem = SPlain String | SNumbered String

Our goal is to produce a function that takes a list of SimpleItem and produces a list of String , where the SNumbered items have consecutive numbers prepended. Here is some sample input and output:

testSimpleInput :: [SimpleItem] testSimpleInput = [ SPlain "Upper Leg" , SNumbered "Femur" , SPlain "Knee" , SNumbered "Patella" , SPlain "Lower Leg" , SNumbered "Tibia" , SNumbered "Fibula" ] testSimpleOutput :: [String] testSimpleOutput = [ "Upper Leg" , "1. Femur" , "Knee" , "2. Patella" , "Lower Leg" , "3. Tibia" , "4. Fibula" ]

For this simple data type, the easiest and most common way is to simply carry the state as part of the iteration:

renderSimple :: [SimpleItem] -> [String] renderSimple items = step 1 items where step :: Int -> [SimpleItem] -> [String] step n (SPlain s : rest) = s : step n rest step n (SNumbered s : rest) = number n s : step (n+1) rest step _ [] = [] number :: Int -> String -> String number n s = show n ++ ". " ++ s

With a richer data type, this technique can be difficult. Consider a version that has sub-lists, yet we still want the numbers to be consecutive globally. Here’s a data type and some sample input and output:

data RichItem = RPlain String | RNumbered String | RSubList String [RichItem] testRichInput :: [RichItem] testRichInput = [ RSubList "== Arm ==" [ RPlain "Upper Arm" , RNumbered "Humerus" , RPlain "Lower Arm" , RNumbered "Radius" , RNumbered "Ulna" ] , RSubList "== Leg ==" [ RPlain "Upper Leg" , RNumbered "Femur" , RPlain "Knee" , RNumbered "Patella" , RPlain "Lower Leg" , RNumbered "Tibia" , RNumbered "Fibula" ] ] testRichOutput :: [String] testRichOutput = [ "== Arm ==" , "Upper Arm" , "1. Humerus" , "Lower Arm" , "2. Radius" , "3. Ulna" , "== Leg ==" , "Upper Leg" , "4. Femur" , "Knee" , "5. Patella" , "Lower Leg" , "6. Tibia" , "7. Fibula" ]

The explicit carrying of state along can be unwieldly if there are alot of state, if there are several places that might update the state, or if there are many level of functions only some of which need the state. In this case, because of the sublists, the state at the end of each step must be returned, and factored in (because a sublist could have any number of numbered items):

renderRich :: [RichItem] -> [String] renderRich items = fst $ step 1 items where step :: Int -> [RichItem] -> ([String], Int) step n (RPlain s : rest) = let (r, n') = step n rest in (s : r, n') step n (RNumbered s : rest) = let (r, n') = step (n+1) rest in (number n s : r, n') step n (RSubList s subs : rest) = let (rsub, n') = step n subs (r, n'') = step n' rest in (s : rsub ++ r, n'') step n [] = ([],n)

Notice how the updated state had to get returned from step .

Render can be rewritten using the State monad. The monad used here is given at the end of this post, though we could have used the one in the library.

The type State s is a monad that carries a state value of type s . Like all monads, to write a function “in the monad” means to return its value wrapped in the monad. These functions will have type State s v , meaning a computation with state value of type s , and a return value of type v .

renderRichState :: [RichItem] -> [String] renderRichState items = evalState (render items) 1 where render :: [RichItem] -> State Int [String] render (RPlain s : rest) = do r <- render rest return $ s : r render (RNumbered s : rest) = do n <- takeNext r <- render rest return $ number n s : r render (RSubList s subs : rest) = do rsub <- render subs --(*) r <- render rest return $ s : rsub ++ r render [] = return [] takeNext :: State Int Int takeNext = do n <- get put (n+1) return n

The structure is the same as renderRich , but the details of threading the state through the functions has been hidden neatly inside the monad. Take some time to compare this version to the prior and see how neatly the monad makes the code. Also compare it to renderSimple , and see how the structure is similar.

Notice that the recursive call to render the sub-lists must be to render . If it were renderRichState — then that sub-computation would have it’s own state value, and the numbers wouldn’t increment globally. Try replacing the starred (*) line with: let rsub = renderRichState subs and see!

To show how, once inside the monad, you must stay within the monad, we can re-write this using helper functions (even if this problem is probably too small to warrant them):

renderRichState' :: [RichItem] -> [String] renderRichState' items = evalState (render items) 1 where render :: [RichItem] -> State Int [String] render items = do rs <- mapM renderItem items return $ concat rs renderItem :: RichItem -> State Int [String] renderItem (RPlain s) = renderPlain s renderItem (RNumbered s) = renderNumbered s renderItem (RSubList s subs) = renderSubList s subs renderPlain :: String -> State Int [String] renderPlain s = return [s] renderNumbered :: String -> State Int [String] renderNumbered s = do n <- takeNext return [number n s] renderSubList :: String -> [RichItem] -> State Int [String] renderSubList s subs = do rsub <- render subs return $ s : rsub

Notice how any function that needs to use or modify the state, or call one that might, has to return a result in the monad, not a simple value.

Finally, here is the definition of the simple State monad used. You could use the one in Control.Monad.State identically. This one is written for exposition – many of the functions would be shorter (!) in real code. While it might take some time to wrap your head around this code, it is worth it. The only tricky part is getting over the hump of a data type that carries a function inside.

-- | A State is a computation that takes an input state (type s), -- and produces a result value (type a), and a new state. data State s a = State (s -> (a,s)) -- | Return the state value get :: State s s get = State $ \s -> (s,s) -- | Set the state value put :: s -> State s () put v = State $ \s -> ((),v) -- | Run the computation, given an initial state, -- produce the result and an updated state. runState :: State s a -> s -> (a,s) runState (State f) s0 = f s0 -- | Run the computation, given an initial state, -- return only the result evalState :: State s a -> s -> a evalState (State f) s0 = let (r,s') = f s0 in r instance Monad (State s) where return v = State $ \s -> (v,s) f >>= g = State $ \s -> let (v, s') = runState f s (w, s'') = runState (g v) s' in (w, s'')

If you’re not used to this kind of code, I hope you enjoy the brain twisting!