Heist, Anansi, MVars, oh my!


Henry Laxen

October 18, 2011

I've been using Haskell for several years now, and I think I'm starting to get the hang of it. I'm writing this to help my fellow Haskell strugglers understand in a few hours what took me several days to wade through. I've been using the Snap Framework to run this website, and in particular I wanted to get a better understanding of the Heist templating system, so I decided to run a few experiments.

One thing my thirty years of programming experience has taught me is that you generally write code between one and five times, but you read it many hundreds of times. Thus the time you invest to make your code easier to understand will pay off many times over. So even though Haskell abhors side effects, a side effect of this tutorial will be to introduce you to a wonderful literate programming system called anansi, written by John Millikin. Don't worry about it now, we'll get to it later.

I suggest you find a spot in your file system where you want to have a directory named heistTutorial, then download and unpack the archive there. If you want to rebuild this project from scratch, typing make should download all of the dependencies and recreate the existing src directory. This is not necessary to proceed with this tutorial. Now cd into the heistTutorial/src directory, and fire up a ghci session in one window so you can play along and explore while following this page.

    tar -xzf heistTutorial.tgz
    cd heistTutorial
    # optional
    make
    cd src
    ghci

The test data

Let's start by looking at the test input files. That way, when we see the code we will understand what it is working on. The test files are all Heist templates, so if you don't know anything about Heist, this would be a good place to start. There are four test files:

«test1.tpl»
    before 1
    <test>
    inside 1
    </test>
    after 1
    <test/>
    end 1

«test2.tpl»
    before 2
    <test file="data.txt">
    inside 2
    </test>
    after 2
    <test/>
    end 2

«notTest.tpl»
    before
    <notTest>
    inside
    </notTest>
    after

«data.txt»
    This is the contents of data.txt

Now in your ghci session let's load the first example.

    :load onLoad.hs

Understanding Heist hook functions

Now let's look at our first program. I wasn't clear on how or what the Heist hook functions did, so I wrote a little program to check them out.

A hook function must take a Template to an IO Template. So testHook does nothing more than print a message when it is entered, then print and return the template it was passed.

«onLoad testHook»
    testHook :: Template -> IO Template
    testHook t = do
      B8.putStrLn "inside testHook"
      print t
      return t

Here we add testHook, defined above, to the list of onLoad hooks, and bind the function testImpl to the xml tag <test>.

«onLoad bindTestTag»
    bindTestTag :: HeistState IO -> IO (HeistState IO)
    bindTestTag = do
      return . addOnLoadHook testHook . bindSplice "test" (testImpl)

This is the code that is run whenever we encounter a <test> tag in a template. It just prints out a message and returns the empty list, effectively removing the <test> tag and its descendants from the output.

«onLoad testImpl»
    testImpl :: Splice IO
    testImpl = do
      liftIO $ B8.putStrLn "in testImpl"
      return []

runIt prints out the name of the template we are about to render, then renders it and displays the output. One thing to remember is that renderTemplate returns an IO (Maybe (Builder, MIMEType)), so the line that reads B8.putStrLn $ toByteString . fst $ r has to do some unpacking in order to print out the result we want.

«onLoad runIt»
    runIt :: HeistState IO -> B8.ByteString -> IO ()
    runIt ts name = do
      B8.putStrLn $ "Running template " `B8.append` name
      B8.putStrLn "----------------------------------------------"
      result <- renderTemplate ts name
      let r = maybe (error "render error") id result
      B8.putStrLn $ toByteString . fst $ r

Finally, main sets up the original Template State, which adds an onLoad hook function and our <test> tag, loads all the templates in the templates directory and renders the two templates, test1 and notTest.

«onLoad main»
    main :: IO ()
    main = do
      originalTS <- bindTestTag defaultHeistState
      ets <- loadTemplates "templates" originalTS
      let ts = either error id ets
      runIt ts "test1"
      runIt ts "notTest"

Running main in our ghci session results in the following output (line numbers added):

«testHook output»
     1 inside testHook
     2 [TextNode "before\n",Element {elementTag = "notTest", elementAttrs = [], elementChildren = [TextNode "\ninside\n"]},TextNode "\nafter\n"]
     3 inside testHook
     4 [TextNode "before 1\n",Element {elementTag = "test", elementAttrs = [], elementChildren = [TextNode "\ninside 1\n"]},TextNode "\nafter 1\n",Element {elementTag = "test", elementAttrs = [], elementChildren = []},TextNode "\nend 1\n"]
     5 inside testHook
     6 [TextNode "before 2\n",Element {elementTag = "test", elementAttrs = [("file","data.txt")], elementChildren = [TextNode "\ninside 2\n"]},TextNode "\nafter 2\n",Element {elementTag = "test", elementAttrs = [], elementChildren = []},TextNode "\nend 2\n"]
     7 Running template test1
     8 ----------------------------------------------
     9 in testImpl
    10 in testImpl
    11 before 1
    12
    13 after 1
    14
    15 end 1
    16
    17 Running template notTest
    18 ----------------------------------------------
    19 before
    20 <notTest>
    21 inside
    22 </notTest>
    23 after

Let's see if we can understand what is going on here. The line in main that has loadTemplates causes the three files with a .tpl extension in the templates directory to be loaded. Lines 1-6 indicate that they are loaded in the order notTest.tpl, test1.tpl, and test2.tpl. Our function testHook was called three times, once for each of those templates, and its input parameter is the parsed contents of the file. This gives us the opportunity to modify any template file as we see fit, after it has been loaded.
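To make the idea of a load-time hook concrete, here is a minimal sketch that uses only base. It is a simplified model and not Heist's implementation: Node, Template, Hook, loggingHook, upcaseHook and runHooks are hypothetical names invented for this illustration. The only thing taken from the program above is the shape of a hook, Template -> IO Template.

```haskell
import Control.Monad (foldM)
import Data.Char (toUpper)

-- Toy stand-ins for Heist's types (assumption: greatly simplified).
data Node
  = TextNode String
  | Element String [(String, String)] [Node]
  deriving (Eq, Show)

type Template = [Node]

-- A hook has the same shape as testHook: Template -> IO Template.
type Hook = Template -> IO Template

-- Like testHook: report that we were called, return the template unchanged.
loggingHook :: Hook
loggingHook t = do
  putStrLn "inside loggingHook"
  return t

-- A hook that modifies the template: upper-case every top-level text node.
upcaseHook :: Hook
upcaseHook = return . map up
  where
    up (TextNode s) = TextNode (map toUpper s)
    up n            = n

-- Run every hook, in order, over one freshly loaded template.
runHooks :: [Hook] -> Template -> IO Template
runHooks hooks t = foldM (\acc h -> h acc) t hooks

main :: IO ()
main = do
  let loaded = [TextNode "before\n", Element "notTest" [] [TextNode "inside"]]
  t' <- runHooks [loggingHook, upcaseHook] loaded
  print t'
```

Because each hook receives the result of the previous one, a chain of hooks behaves like a small pipeline that every template passes through once, right after it is loaded.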

Next we render test1, which produces the output in lines 7-16. There are two <test> tags in the test1.tpl file, one which has a text node among its children, and the other which is just a plain <test/> element. Thus lines 9 and 10 show us that testImpl is called twice while rendering this template. The result of the rendering is lines 11 to 16, in which all traces of the <test> tag and its children have been removed. You will notice that in line 4 test1.tpl was parsed, and the TextNodes containing the "before", "after", and "end" words all have linefeeds in them, hence the double spacing.

Finally we render notTest, which produces the output in lines 17 to 23. Nothing too surprising here, there is no <test> tag so the output is the same as the input.
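The behaviour just described (bound tags are replaced by whatever their splice returns, unbound tags pass through) can be modelled in a few lines. This is only a toy sketch under my own assumptions: Node, Splice, removeSplice, runNodes and render are made-up names, and Heist's real rendering does considerably more.

```haskell
-- Toy model (assumption: not Heist's real types or rendering algorithm).
data Node
  = TextNode String
  | Element String [Node]
  deriving (Eq, Show)

-- A splice computes the nodes that replace an element.
type Splice = Node -> IO [Node]

-- Like testImpl: announce ourselves and return no nodes at all.
removeSplice :: Splice
removeSplice _ = do
  putStrLn "in removeSplice"
  return []

-- Walk the tree; elements with a bound tag are replaced by their splice's
-- result, everything else is kept (after processing its children).
runNodes :: [(String, Splice)] -> [Node] -> IO [Node]
runNodes splices = fmap concat . mapM go
  where
    go n@(TextNode _) = return [n]
    go n@(Element tag kids) =
      case lookup tag splices of
        Just splice -> splice n
        Nothing     -> do
          kids' <- runNodes splices kids
          return [Element tag kids']

-- Very small renderer, enough to see the effect.
render :: [Node] -> String
render = concatMap r
  where
    r (TextNode s)       = s
    r (Element tag kids) = "<" ++ tag ++ ">" ++ render kids ++ "</" ++ tag ++ ">"

main :: IO ()
main = do
  let splices = [("test", removeSplice)]
      test1   = [ TextNode "before 1\n", Element "test" [TextNode "inside 1"]
                , TextNode "after 1\n", Element "test" [], TextNode "end 1\n" ]
      notTest = [TextNode "before\n", Element "notTest" [TextNode "inside"], TextNode "\nafter\n"]
  runNodes splices test1   >>= putStr . render
  runNodes splices notTest >>= putStr . render
```

In this model the splice runs once per matching element, which mirrors the two "in testImpl" lines seen when test1 was rendered.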

At this point I hope you have a pretty good understanding of what Heist does, and how to set up hooks, bindings, and splices. You'll notice from the type signatures, that hooks and splices can run in the IO monad, so the world is your oyster. Next we'll take a look at some of the things this lets us do.

Playing with MVars

Even though I've been using Haskell for a few years now, I've never had the occasion to use MVars. I have to admit, I was a little scared of them. While browsing through the Heist code for the splice Static.hs, I noticed MVars and IORefs all over the place. I realized it was time to figure them out. This would be a good time to run :load mvarFile1.hs in your ghci session. Here is the new code that you loaded. The runIt code is the same as before.

mvStatus prints out the status of the supplied MVar.

«mvarFile mvStatus»
    mvStatus :: MVar a -> IO ()
    mvStatus mv = do
      empty <- liftIO $ isEmptyMVar mv
      print $ "isEmpty is " ++ (show empty)

The thing to notice here is that we create an empty MVar which we then pass to testImpl as part of the splice. This means our splices can suddenly have access to their own private data.

«mvarFile bindTestTag»
    bindTestTag :: HeistState IO -> IO (HeistState IO)
    bindTestTag ts = do
      mv <- liftIO $ newEmptyMVar
      return . bindSplice "test" (testImpl mv) $ ts

Here is where we take advantage of the MVar created above. node contains the entire parsed input of the <test> tag. path contains the full file path of the template currently being processed. Next we check to see if the <test> tag has a "file" attribute (assumed to be relative). If it does, we read the file and stick it in the MVar. We return [] so the <test> tag goes away in the output. The other case is that the <test> tag does not contain a "file" attribute. In that case we read the value of the MVar, and return it as a TextNode.

«mvarFile testImpl»
    testImpl :: MVar Text -> Splice IO
    testImpl mv = do
      liftIO $ B8.putStrLn "in testImpl"
      lift $ mvStatus mv
      node <- getParamNode
      path <- fmap (maybe "" id) getTemplateFilePath
      case getAttribute "file" node of
        Just f -> do
          liftIO $ print $ "Got Just " ++ (show f)
          let fileName = (FP.directory . FP.decodeString $ path) </> (FP.fromText f)
          contents <- liftIO $ readTextFile fileName
          liftIO $ putMVar mv contents
          return []
        Nothing -> do
          liftIO $ print "Got Nothing"
          value <- liftIO $ readMVar mv
          return [X.TextNode value]

main just renders the templates.

«mvarFile main noThreads»
    main :: IO ()
    main = do
      hSetBuffering stdout NoBuffering
      originalTS <- bindTestTag defaultHeistState
      ets <- loadTemplates "templates" originalTS
      let ts1 = either error id ets
      runIt ts1 "test1"
      runIt ts1 "notTest"
      runIt ts1 "test2"
      return ()

Now run main in your ghci session, and have a look at the output. It should look like this:

«mvarFile1 output»
    Running template test1
    ----------------------------------------------
    in testImpl
    "isEmpty is True"
    "Got Nothing"

At this point, you are hung, and need to Control C out. Let's see what happened. I'll reprint the "test1.tpl" data here to make it easy.

«test1.tpl again»
    before 1
    <test>
    inside 1
    </test>
    after 1
    <test/>
    end 1

What happened is that we used the <test> tag without a file attribute before coming across a <test> tag with a file attribute. Looking at testImpl we see that it dutifully printed out that we entered it, then told us the MVar was empty, then looked for a file attribute and found Nothing. At this point the "readMVar mv" hangs, waiting for the MVar to become non-empty.
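The blocking behaviour is easy to reproduce outside of Heist. The following sketch uses only base; readWithTimeout is a hypothetical helper I introduce here so that the demonstration gives up after a fraction of a second instead of hanging the way the tutorial program does.

```haskell
import Control.Concurrent.MVar
import System.Timeout (timeout)

-- Try to read the MVar, but give up after 0.2 seconds instead of hanging.
-- (Hypothetical helper, not part of the tutorial code.)
readWithTimeout :: MVar a -> IO (Maybe a)
readWithTimeout mv = timeout 200000 (readMVar mv)

main :: IO ()
main = do
  mv <- newEmptyMVar :: IO (MVar String)

  -- Nothing has been put yet: readMVar blocks, so the timeout fires.
  r1 <- readWithTimeout mv
  putStrLn $ "empty MVar:  " ++ show r1

  -- Once a value is there, readMVar returns it and leaves it in place.
  putMVar mv "contents of data.txt"
  r2 <- readWithTimeout mv
  putStrLn $ "filled MVar: " ++ show r2
  stillEmpty <- isEmptyMVar mv
  putStrLn $ "isEmpty afterwards: " ++ show stillEmpty
```

The first read comes back as Nothing because nobody ever filled the MVar, which is exactly the situation test1.tpl creates when it is rendered on its own.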

We can easily fix this by letting the three runIt calls run in parallel, with a forkIO. Go ahead and type :load mvarFile2.hs in your ghci session, which will replace main above with the following:

«mvarFile main threads»
    main :: IO ()
    main = do
      hSetBuffering stdout NoBuffering
      originalTS <- bindTestTag defaultHeistState
      ets <- loadTemplates "templates" originalTS
      let ts1 = either error id ets
      forkIO $ runIt ts1 "test1"
      forkIO $ runIt ts1 "notTest"
      forkIO $ runIt ts1 "test2"
      return ()

Now running main results in (a probably jumbled version of) the following (line numbers added):

«mvarFile2 output»
     1 Running template test1
     2 ----------------------------------------------
     3 in testImpl
     4 "isEmpty is True"
     5 Running template notTest
     6 ----------------------------------------------
     7 before
     8 <notTest>
     9 inside
    10 </notTest>
    11 after
    12
    13 "Got Nothing"
    14 Running template test2
    15 ----------------------------------------------
    16 in testImpl
    17 "isEmpty is True"
    18 "Got Just \"data.txt\""
    19 in testImpl
    20 in testImpl
    21 "isEmpty is False"
    22 "Got Nothing"
    23 " before 1
    24
    25
    26 This is the contents of data.txt
    27
    28 after 1
    29
    30
    31 This is the contents of data.txt
    32
    33 end 1
    34
    35 "isEmpty is True"
    36 "Got Nothing"
    37 before 2
    38
    39 after 2
    40
    41
    42 This is the contents of data.txt
    43
    44 end 2

Let's see if we can wade through this output. Template test1 is run first, and produces lines 1 to 4. Since the MVar is empty, it hangs, just as before. Next, notTest runs in parallel with test1. It doesn't contain any <test> tags, so it produces the output seen in lines 5 to 12. Now test2 also runs in parallel with test1 and notTest. Just before it starts running, line 13 is printed out by test1. test2 continues with lines 14 to 18. Since the <test> tag in test2 contains a file attribute, line 18 is printed out instead of "Got Nothing". From this point on, your mileage (and output) may vary. Here it looks like test1 continues to run and writes out line 19, while test2 renders its second <test> tag and writes out line 20. Next test1 is running, and prints out lines 21 and 22, before displaying its output in lines 23 to 34. Lines 35 and 36 are now output by test2, along with the rendering in lines 37 to 44. The mystery is why line 35 says the MVar is empty. I think it is because readMVar is not atomic, and test2 ran while test1 was taking the MVar and before it had a chance to put it back. (In the base library of 2011, readMVar was implemented as a takeMVar followed by a putMVar; since base 4.7 it is atomic.)

So, what is the moral of the story? You can use MVars with Heist to keep private data inside your splices. Also, look at how trivial it is to run things in parallel. While the debugging info is all jumbled up, all of the output generated by runIt is in the right order. By the way, if you remove the hSetBuffering stdout NoBuffering line from main in mvarFile2, the output will be so horribly jumbled that it is almost impossible to make sense of it. Go ahead and give it a try.
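One caveat if you move from ghci to a compiled program: the main above returns right after forking, and a compiled Haskell program exits as soon as main returns, taking any forked threads with it. Here is a small sketch, using only base, of one way to wait for the threads. forkWithDone is a hypothetical helper of my own, and the reader and writer threads merely stand in for rendering test1 and test2. If the forked action throws an exception, the result MVar is never filled, so treat this as an illustration rather than production code.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar

-- Fork an action and return an MVar that is filled when it finishes.
-- (Hypothetical helper; it does not handle exceptions in the action.)
forkWithDone :: IO a -> IO (MVar a)
forkWithDone action = do
  done <- newEmptyMVar
  _ <- forkIO (action >>= putMVar done)
  return done

main :: IO ()
main = do
  shared <- newEmptyMVar :: IO (MVar String)

  -- Like test1: needs the shared value and blocks until someone provides it.
  reader <- forkWithDone $ do
    v <- readMVar shared
    return ("reader saw: " ++ v)

  -- Like test2: provides the shared value.
  writer <- forkWithDone $ do
    putMVar shared "This is the contents of data.txt"
    return "writer done"

  -- Wait for both threads before main returns.
  takeMVar writer >>= putStrLn
  takeMVar reader >>= putStrLn
```

The reader is forked first and blocks, just as test1 did, and is released as soon as the writer fills the shared MVar.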

Doing Something Useful

So can we do something interesting with all this machinery? One of the things I like to have at the bottom of each web page I serve is a random pithy quote. Here is the file I would like to process:

«wiseQuote Template»
    <xml>
    <quote file="quotes.xml"/>
    before
    <quote>
    Author: <wiseQuoteAuthor/>
    <br/><wiseQuoteSaying/>
    </quote>
    after
    Now an indexed quote
    before
    <quote index="1">
    Author: <wiseQuoteAuthor/>
    <br/><wiseQuoteSaying/>
    </quote>
    after
    </xml>

What I would like to happen is that when this file is rendered, the "quotes.xml" file is read and parsed. Then whenever we are in a <quote> tag, we replace the <wiseQuoteAuthor/> tag with the author of the quote, and the <wiseQuoteSaying/> tag with the actual saying. If the <quote> tag has an index attribute, we use that specific quote from our list of quotes, otherwise we use a random index. In case you are wondering, here is a sampling of what the quotes.xml looks like.

«quotes.xml»
    <doc>
    <quote author="Anonymous">The reason a dog has so many friends is that he wags his tail instead of his tongue.</quote>
    <quote author="Ann Landers">Don't accept your dog's admiration as conclusive evidence that you are wonderful.</quote>
    <quote author="Will Rogers">If there are no dogs in Heaven, then when I die I want to go where they went.</quote>
    <quote author="Ben Williams">There is no psychiatrist in the world like a puppy licking your face.</quote>
    <quote author="Josh Billings">A dog is the only thing on earth that loves you more than he loves himself.</quote>
    <quote author="Andy Rooney">The average dog is a nicer person than the average person.</quote>
    </doc>

The author is an attribute of the <quote> tag, and the children of the tag are the saying. Okay then, let's proceed. First we set up a data type that will be our container for our quotes.

«wiseQuotes data»
    data WiseQuote = WiseQuote
      { wiseQuoteAuthor :: Template
      , wiseQuoteSaying :: Template
      } deriving (Eq, Show)

Next, xmlToQuote takes a Node and converts it into a WiseQuote. It checks to see that we are processing a quote element, and then grabs the author attribute of the tag, and wraps it in a [TextNode]. Similarly, it grabs the children of the quote element, and puts them inside a WiseQuote. It checks for errors along the way.

«wiseQuotes xmlToQuote»
    xmlToQuote :: Node -> WiseQuote
    xmlToQuote el = case elementTag el of
      "quote" -> case getAttribute "author" el of
        Just t -> WiseQuote [(X.TextNode t)] (childNodes el)
        _ -> error $ "Quote " ++ show el ++ " is missing an author"
      _ -> error $ "Element " ++ show el ++ " is not a WiseQuote"

getWiseQuotes reads the xml file that contains the quotes, and filters out just the <quote> tags and their children. It then calls xmlToQuote for each <quote> tag, returning a list of WiseQuotes.

«wiseQuotes getWiseQuotes»
    getWiseQuotes :: MonadIO m => FP.FilePath -> m [WiseQuote]
    getWiseQuotes fileName = do
      contents <- liftIO $ Filesystem.readFile fileName
      let doc = either error justQuotes $ parseXML "quotes" contents
          quotes = map xmlToQuote doc
      return quotes
      where
        justQuotes s = concat [descendantElementsTag "quote" x | x <- docContent s]

wiseQuoteImpl is very similar to testImpl in mvarFile above. If the <quote> tag has a file attribute, it reads the file and puts the resulting list of WiseQuotes into an MVar, returning nothing, which removes the <quote> tag from the output. If the file attribute is not present, it checks to see if there is an index attribute. If so, it reads the value of the "index" attribute as an Int, indexes into the list of WiseQuotes which should be present in the MVar, and runs the children of this <quote> node with "wiseQuoteAuthor" bound to the author's name, and "wiseQuoteSaying" bound to the actual saying. If the "index" attribute is missing, a random number is generated between 0 and one less than the number of available quotes, and that is used as the index.

«wiseQuotes wiseQuoteImpl»
    wiseQuoteImpl :: MVar [WiseQuote] -> Splice IO
    wiseQuoteImpl mv = do
      pnode <- getParamNode
      path <- fmap (maybe "" id) getTemplateFilePath
      case getAttribute "file" pnode of
        Just f -> do
          let fileName = (FP.directory . FP.decodeString $ path) </> (FP.fromText f)
          quotes <- liftIO $ getWiseQuotes fileName
          liftIO $ putMVar mv quotes
          return []
        Nothing -> do
          quotes <- liftIO $ readMVar mv
          quote <- case getAttribute "index" pnode of
            Just x -> return (quotes !! (read . unpack $ x :: Int))
            Nothing -> do
              i <- liftIO $ getStdRandom (randomR (0, length quotes - 1))
              return (quotes !! i)
          runChildrenWithTemplates
            [ ("wiseQuoteAuthor", wiseQuoteAuthor quote)
            , ("wiseQuoteSaying", wiseQuoteSaying quote)
            ]

Here we create a new empty MVar so that we can pass it to wiseQuoteImpl when a <quote> tag is encountered.

«wiseQuotes bindWiseQuotes»
    bindWiseQuotes :: HeistState IO -> IO (HeistState IO)
    bindWiseQuotes ts = do
      mv <- liftIO $ newEmptyMVar
      return . bindSplice "quote" (wiseQuoteImpl mv) $ ts

main just runs the template "testQuotes" shown above.

«wiseQuotes main»
    main :: IO ()
    main = do
      originalTS <- bindWiseQuotes defaultHeistState
      ets <- loadTemplates "quotes" originalTS
      let ts1 = either error id ets
      runIt ts1 "testQuotes"

At this point you should load this program in your ghci session (with :load, as before), and then run the main function. You should see something like the following as output:

«wiseQuotes output»
     1 Running template testQuotes
     2 ----------------------------------------------
     3 <xml>
     4
     5 before
     6 Author: Michelangelo
     7 <br/>The greatest danger for most of us
     8 is not that our aim is too high and we miss it, but that it is
     9 too low and we reach it.
    10
    11 after
    12 Now an indexed quote
    13 before
    14
    15 Author: Ann Landers
    16 <br/>Don't accept your dog's admiration as
    17 conclusive evidence that you are wonderful.
    18
    19 after
    20 </xml>

The quote in lines 6 to 10 might be different, but the one in lines 15 to 17 should always be the same, since index="1" always selects the second quote in the list (indexes start at 0).

About the tool used to write this

At this point I'd like to say a few words about how this tutorial was written. All of the source code used in the examples, as well as the test data and this html page, are the result of using a literate programming tool called anansi. It allows you to write your code as though you were telling a story, and the code generated magically appears in all the right places. Perhaps you noticed, but probably not, that no import declarations appear anywhere above. Yet in the examples you loaded with ghci, they were there. I left them out of the story because I felt they distracted from the tale I was trying to tell. Take a look at the file heistAnansiMvars.anansi in your favorite editor, and you'll see the top level of how this tutorial was generated. Notice near the bottom of the file, enclosed in html comments, are a couple of include statements. I've segregated the imports and the actual code layout to the imports.anansi and codeLayout.anansi files, which are incidental to the story.

Have a look at wiseQuotes.anansi to see what the actual source to a typical story like this looks like. You'll notice every once in a while a line starting with a :d followed by some text. The stuff between the :d and the line containing just a : is called a macro. You can define as many macros as you want, and in whatever order you want, paying attention only to the flow of your story. Later down in the wiseQuotes.anansi document, you'll see some actual Haskell code enclosed in these macros. When this file is processed by anansi, it passes through the text outside of the macros, and then displays the text inside of the macros in your favorite output format, say html or latex. Recently syntax highlighting has been added, making the code even easier to read. In anansi terms, your files are woven together into a coherent whole, suitable for reading and understanding.

Now take a look at the file codeLayout.anansi. This file describes how to put together the macros you defined while telling the story into actual Haskell code and data. The :f in column 1 tells anansi into which file the data that follows it is supposed to go. For example, you'll see that the file onLoad.hs is composed of the following macros:

    |onLoad imports|
    |onLoad testHook|
    |onLoad bindTestTag|
    |onLoad testImpl|
    |onLoad runIt|
    |onLoad main|

Thus no matter in which order you tell your story, you can break out and reorder the code portions to make compilable Haskell code. Anansi calls this process the tangle. Your story is untangled, broken into compilable pieces with the :f command, and re-tangled to generate your program. Furthermore, this allows you to replicate code that happens to be identical between different modules. In my case, I reused the |onLoad runIt| macro in each of the other Haskell programs.
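If it helps to see the tangle idea as code, here is a toy model in plain Haskell. It is emphatically not how anansi is implemented: the macros table, the layout list, the tangle function and the "onLoad greeting" macro are all made up for this sketch. It only shows that the order in which macros are defined is independent of the order in which a file layout assembles them.

```haskell
import Data.Maybe (fromMaybe)

type MacroName = String

-- Macros in the order the story tells them (deliberately scrambled).
-- All names and bodies here are hypothetical examples.
macros :: [(MacroName, String)]
macros =
  [ ("onLoad main",     "main :: IO ()\nmain = putStrLn greeting\n")
  , ("onLoad imports",  "module Main where\n")
  , ("onLoad greeting", "greeting :: String\ngreeting = \"hello\"\n")
  ]

-- The layout of one output file: which macros go in, and in what order.
layout :: [MacroName]
layout = ["onLoad imports", "onLoad greeting", "onLoad main"]

-- Assemble a file by looking up each macro named in the layout.
tangle :: [(MacroName, String)] -> [MacroName] -> String
tangle defs = concatMap expand
  where
    expand name =
      fromMaybe (error ("undefined macro: " ++ name)) (lookup name defs)

main :: IO ()
main = putStr (tangle macros layout)
```

Reusing a macro in several files, as was done with the runIt macro, would simply mean naming it in more than one layout list.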

So please, consider using anansi for your next programming project, and we can start to turn Haskell from one of the worst documented platforms into one of the best documented platforms. Thank you for your attention.

Quote of the day:

What to do in case of an emergency: 1. Pick up your hat. 2. Grab your coat. 3. Leave your worries on the doorstep. 4. Direct your feet to the sunny side of the street.

Unknown

Best wishes,
Henry Laxen
