This is part 2 of a multipart series in which we look at getting a website / blog set up with hakyll and customizing it a fair bit.

Adding a Sitemap Template

A sitemap.xml template, just like the templates in the last post, receives context fields to work with (variables, essentially), and outputs the result of applying said context to the template. Here is what our sitemap template will look like today in our project’s templates/sitemap.xml :

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset
      xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
      xmlns:news="http://www.google.com/schemas/sitemap-news/0.9"
      xmlns:xhtml="http://www.w3.org/1999/xhtml"
      xmlns:mobile="http://www.google.com/schemas/sitemap-mobile/1.0"
      xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
      xmlns:video="http://www.google.com/schemas/sitemap-video/1.1"
    >
      <url>
        <loc>$root$</loc>
        <changefreq>daily</changefreq>
        <priority>1.0</priority>
      </url>
    $for(pages)$
      <url>
        <loc>$root$$url$</loc>
        <lastmod>$if(updated)$$updated$$else$$if(date)$$date$$endif$$endif$</lastmod>
        <changefreq>weekly</changefreq>
        <priority>0.8</priority>
      </url>
    $endfor$
    </urlset>

Apart from the normal sitemap boilerplate, you can see root , pages , url , date and updated context fields. While date and updated would come from your metadata fields defined for a post, and the url is built from hakyll’s defaultContext , the root and pages fields are custom defined in what will be our very own sitemapCtx context. In the next section, we’ll use this template to generate our sitemap.xml file.
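The nested $if$ / $else$ inside the lastmod element is a fallback chain: use updated when the page has it, otherwise date, otherwise nothing. Here is a minimal sketch of that same logic in plain Haskell. The PageMeta record and the sample dates are hypothetical and only model the two optional metadata fields; hakyll stores metadata differently.

```haskell
import Control.Applicative ((<|>))
import Data.Maybe (fromMaybe)

-- Hypothetical stand-in for a page's metadata (not a hakyll type).
data PageMeta = PageMeta
  { updated :: Maybe String
  , date    :: Maybe String
  }

-- Mirrors $if(updated)$$updated$$else$$if(date)$$date$$endif$$endif$:
-- prefer `updated`, fall back to `date`, otherwise produce an empty string.
lastmod :: PageMeta -> String
lastmod meta = fromMaybe "" (updated meta <|> date meta)

main :: IO ()
main = do
  -- Sample dates are made up for illustration.
  putStrLn (lastmod (PageMeta (Just "2019-03-01") (Just "2019-01-15")))
  putStrLn (lastmod (PageMeta Nothing (Just "2019-01-15")))
  putStrLn (lastmod (PageMeta Nothing Nothing))
```

Note the last case: a page with neither field still gets a lastmod element in the template above, just an empty one.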

Generating the Sitemap XML File

If you create a hakyll project from scratch, you will start out with a few files that we can add to our sitemap:

index.html

about.rst

contact.markdown

posts/2015-08-12-spqr.html

posts/2015-10-07-rosa-rosa-rosam.html

posts/2015-11-28-carpe-diem.html

posts/2015-12-07-tu-quoque.html

You should note that your site.hs file also has the following:

    main :: IO ()
    main = hakyllWith config $ do
      -- ...
      match (fromList ["about.rst", "contact.markdown"]) $ do
        route   $ setExtension "html"
        compile $ pandocCompiler
          >>= loadAndApplyTemplate "templates/default.html" defaultContext

      match "posts/*" $ do
        route   $ setExtension "html"
        compile $ pandocCompiler
          >>= loadAndApplyTemplate "templates/post.html" postCtx
          >>= loadAndApplyTemplate "templates/default.html" postCtx

It’s important to understand that any files you want loaded and sent to templates/sitemap.xml must first be matched and compiled before the sitemap can be built. If you don’t do this, you’ll pull your hair out wondering why the file (or folder) you’re trying to include in the sitemap never shows up.

Now, there is something that we are going to emulate to make this sitemap a reality (this should already be in site.hs ):

    main :: IO ()
    main = hakyllWith config $ do
      -- ...
      create ["archive.html"] $ do
        route idRoute
        compile $ do
          posts <- recentFirst =<< loadAll "posts/*"
          let archiveCtx =
                listField "posts" postCtx (return posts) `mappend`
                constField "title" "Archives"            `mappend`
                defaultContext

          makeItem ""
            >>= loadAndApplyTemplate "templates/archive.html" archiveCtx
            >>= loadAndApplyTemplate "templates/default.html" archiveCtx

Reading the code above, this essentially says:

1. Here’s a file we want to create that does not yet exist (how create differs from match).

2. When you create the route, keep the filename (what idRoute does).

3. When you compile, load all the posts, specify what the context to send to each template will be, then make the item (the "" is the item’s initial body, which starts out empty… see the source for more), then pass the context to the archive template and pass that on to the default template, ultimately building up a full webpage from the inside-out.

Let’s change this 3-step rule to suit our needs before we wrangle the code. We want our rules to say:

1. Here’s a file we want to create that does not yet exist (sitemap.xml).

2. When you create the route, keep the filename (what idRoute does).

3. When you compile, load all the posts, load all the other pages, specify what the context to send to each template will be, then make the item, then pass the context to the sitemap template, ultimately building up an XML file.

This is almost the same! Let’s write it:

    main :: IO ()
    main = hakyllWith config $ do
      -- ...
      create ["sitemap.xml"] $ do
        route idRoute
        compile $ do
          -- load and sort the posts
          posts <- recentFirst =<< loadAll "posts/*"

          -- load individual pages from a list (globs DO NOT work here)
          singlePages <- loadAll (fromList ["about.rst", "contact.markdown"])

          -- mappend the posts and singlePages together
          let pages = posts <> singlePages

              -- create the `pages` field with the postCtx
              -- and return the `pages` value for it
              sitemapCtx = listField "pages" postCtx (return pages)

          -- make the item and apply our sitemap template
          makeItem ""
            >>= loadAndApplyTemplate "templates/sitemap.xml" sitemapCtx

This is starting to look good! But what’s wrong here? Remember the root context bits? We’re going to need to define what that is, and the best way that I’ve found right now is simply as a String ; if you want to do something fancy with configuration or reading it in dynamically, then go nuts.

    root :: String
    root = "https://ourblog.com"
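If you do want to go the dynamic route, one possibility is reading the root from an environment variable and falling back to the constant. This is only a sketch under assumptions: SITE_ROOT is a made-up variable name (hakyll does not read it for you), and resolveRoot and defaultRoot are hypothetical helper names.

```haskell
import Data.Maybe (fromMaybe)
import System.Environment (lookupEnv)

-- The fallback is the same value used in this post.
defaultRoot :: String
defaultRoot = "https://ourblog.com"

-- Pure helper: use the override when one is present, else the default.
resolveRoot :: Maybe String -> String
resolveRoot = fromMaybe defaultRoot

main :: IO ()
main = do
  -- SITE_ROOT is a hypothetical variable name chosen for this example.
  siteRoot <- resolveRoot <$> lookupEnv "SITE_ROOT"
  putStrLn siteRoot
```

In a real site.hs the lookup would have to happen in main before hakyllWith runs, and the value would need to be passed down to whatever builds your contexts. The rest of this post sticks with the plain constant.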

With that defined, we can add it to our contexts:

    main :: IO ()
    main = hakyllWith config $ do
      -- ...
      create ["sitemap.xml"] $ do
        route idRoute
        compile $ do
          posts <- recentFirst =<< loadAll "posts/*"
          singlePages <- loadAll (fromList ["about.rst", "contact.markdown"])
          let pages = posts <> singlePages
              sitemapCtx =
                constField "root" root <> -- here
                listField "pages" postCtx (return pages)
          makeItem ""
            >>= loadAndApplyTemplate "templates/sitemap.xml" sitemapCtx

    -- ...

    postCtx :: Context String
    postCtx =
      constField "root" root      <> -- here
      dateField "date" "%Y-%m-%d" <>
      defaultContext

Hint: if the <> is throwing you for a loop, it’s defined as the same thing as mappend.
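To see that for yourself outside of hakyll, here is a tiny standalone check using ordinary lists and strings instead of contexts. The value names are made up for the example.

```haskell
import Data.Monoid ((<>))

-- The same two lists combined with the operator...
combinedWithOperator :: [Int]
combinedWithOperator = [1, 2] <> [3]

-- ...and with the named function.
combinedWithMappend :: [Int]
combinedWithMappend = [1, 2] `mappend` [3]

main :: IO ()
main = do
  print (combinedWithOperator == combinedWithMappend)
  -- Strings are lists of characters, so they combine the same way.
  putStrLn ("posts" <> "/" <> "index.html")
```

Contexts combine the same way as these lists do, which is why the code in this post can use either spelling.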

See how we defined constField "root" root in two places? We’re talking about two different contexts here: the sitemap context and the post context. While you could have the postCtx be combined with the sitemapCtx , thus giving the pages field access to the root field, you probably want to use root (and perhaps other constants) wherever you work with posts, so adding them to postCtx for use everywhere seems like the right thing to do.

Once you’ve got all this, run the following to build (or rebuild) your docs/sitemap.xml file:

    λ stack build
    λ stack exec site clean
    λ stack exec site build

Your docs/sitemap.xml should now have all your pages defined in it!

Adding Other Pages and Directories

We’ve done some epic traveling in New Zealand and now want to include a bunch of pages we’ve written in the sitemap. Those pages are:

new-zealand/index.md

new-zealand/otago/index.md

new-zealand/otago/dunedin-area.md

new-zealand/otago/queenstown-area.md

new-zealand/otago/wanaka-area.md

First, we make sure that our pages get compiled (we’ll use postCtx for them):

    main :: IO ()
    main = hakyllWith config $ do
      -- ...
      match "new-zealand/**" $ do
        route   $ setExtension "html"
        compile $ pandocCompiler
          >>= loadAndApplyTemplate "templates/post.html" postCtx
          >>= loadAndApplyTemplate "templates/default.html" postCtx

And then we want to make sure we add them to our create function:

    main :: IO ()
    main = hakyllWith config $ do
      -- ... match code up here
      create ["sitemap.xml"] $ do
        route idRoute
        compile $ do
          posts <- recentFirst =<< loadAll "posts/*"
          singlePages <- loadAll (fromList ["about.rst", "contact.markdown"])
          nzPages <- loadAll "new-zealand/**" -- here
          let pages = posts <> singlePages <> nzPages -- here
              sitemapCtx =
                constField "root" root <>
                listField "pages" postCtx (return pages)
          makeItem ""
            >>= loadAndApplyTemplate "templates/sitemap.xml" sitemapCtx

I could not figure out how to mix globs ( new-zealand/** ) in with individual file paths (included in fromList ), so I had to load them separately; if you figure out how, let me know!
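One approach that may be worth trying: hakyll’s pattern module exports a (.||.) combinator that joins two patterns into one matching either side, so a glob and a fromList pattern can in principle be combined into a single value. The sketch below assumes the hakyll package is installed; the names sitemapPagesPattern and loadSitemapPages are made up for the example, and it has not been tried against this particular site.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Hakyll

-- One pattern covering both the glob and the individually listed files.
sitemapPagesPattern :: Pattern
sitemapPagesPattern =
  "new-zealand/**" .||. fromList ["about.rst", "contact.markdown"]

-- Hypothetical helper: load everything the pattern matches in one call.
loadSitemapPages :: Compiler [Item String]
loadSitemapPages = loadAll sitemapPagesPattern

main :: IO ()
main = do
  -- Check which identifiers the combined pattern accepts.
  print (matches sitemapPagesPattern "about.rst")
  print (matches sitemapPagesPattern "new-zealand/otago/index.md")
  print (matches sitemapPagesPattern "posts/2015-08-12-spqr.html")
```

If that holds up, the two separate loadAll calls for singlePages and nzPages could be collapsed into one, though loading them separately as shown earlier works just as well.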

Once you’ve got all this, run the following to rebuild your docs/sitemap.xml file:

    λ stack build
    λ stack exec site rebuild

Wrapping Up

In this lesson we learned how to dynamically generate a sitemap.xml file using hakyll. Next time, we’ll use these same skills to generate our own RSS and Atom XML feeds.

Thank you for reading!

Robert