Blogging in Haskell

Yesterday, I finally decided to figure out syntax highlighting for Haskell in my blog. I ended up finding two different ways to do it, so I’ll describe both of them, along with links to the (small bits of) code that I ended up writing in the process to help out.

Attempt 1: The Star-Light Approach

The first approach is to use Dean Edwards’ star-light JavaScript/CSS library. This is a nice piece of client-side JavaScript and CSS that syntax highlights your source code for you, in a variety of languages. All you have to do is enclose it in a tag like <pre class="javascript"> or <pre class="php">. The CSS file (which you also need to reference from your page) takes care of invoking some JavaScript code that scours your page and does the syntax highlighting on the client.

Unfortunately, Dean Edwards’ code doesn’t do Haskell by default. So I wrote the code to do it, and you can download a modified version of star-light here. It wasn’t terribly difficult, but neither is my code very elegant. It mostly works, though. Limitations that I know about are listed in the comments at the beginning of star-hs.js. They include poor handling of nested comments, “gaps” in string literals, and non-ASCII characters, plus some overzealous highlighting of Prelude symbols and of the pseudo-keywords as, qualified, and hiding when they should be left alone.
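To make the nested-comment limitation concrete, here is a minimal Haskell sketch of what a correct scanner has to do: keep a depth counter while walking the text. The function name spanBlockComment is made up for this illustration and has nothing to do with star-hs.js or HsColour. A non-greedy pattern along the lines of \{-.*?-\} (again hypothetical, not quoted from my code) would stop at the first -} it sees, which is exactly where nesting goes wrong.

```haskell
-- | If the input starts with a block comment, return the whole comment
-- (including any comments nested inside it) and the text that follows.
-- Returns Nothing if the input does not start with "{-" or if the
-- comment is never closed.
spanBlockComment :: String -> Maybe (String, String)
spanBlockComment ('{':'-':rest) = go (1 :: Int) "-{" rest
  where
    -- acc holds the comment text seen so far, in reverse order.
    go 0 acc s           = Just (reverse acc, s)        -- outermost comment closed
    go _ _   []          = Nothing                      -- ran out of input
    go n acc ('{':'-':s) = go (n + 1) ("-{" ++ acc) s   -- one level deeper
    go n acc ('-':'}':s) = go (n - 1) ("}-" ++ acc) s   -- one level back out
    go n acc (c:s)       = go n (c : acc) s
spanBlockComment _ = Nothing
```

Given the input "{- a {- b -} c -} x", this returns the full outer comment "{- a {- b -} c -}" and the remainder " x", rather than cutting the comment off after "b -}".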

Star-Light: Advantages

Very little effort on your part.
Support for several different languages.
Client-side programming is all the rage!

Star-Light: Disadvantages

Uses non-standard CSS extensions, so limited portability.
You have to trust my code. :)
The parser is based on regular expressions, so it’s ultimately hopeless for Haskell.

In the end, though, none of these is the reason that I didn’t end up using star-light for syntax highlighting. I didn’t use it because the blogging host I’m using, WordPress, strips any “dangerous” stuff out of CSS and bans JavaScript from their blogs. They claim it’s a security risk for them. I’m a little worried, and you should be too, that a very popular blog platform believes that their security validly depends on client-side behavior. I’d be even more worried if I thought they might be right. Ah well; given the chance to choose again, I’d have avoided WordPress for a blog, but it’s a little late for that now.

Off to my second attempt…

Attempt 2: HsColour + Plugin

Having given up on client-side syntax highlighting, I turned to the obvious choice for server-side implementation: HsColour. This nifty little project specializes in syntax highlighting Haskell, and can output HTML. If only it were a little easier to use when writing a blog entry…

I write my blog entries using Windows Live Writer. While I’m not at all addicted to it, it’s a nice alternative to the web-based editors in conventional blogs, and it doesn’t have the tendency to freeze up randomly that characterizes WordPress’s built-in editor. (Did I mention that I’d never start a blog with WordPress again? I thought so.) It also has a plugin interface, so I built a simple plugin that asks for a block of code, runs it through HsColour, and sticks the result into a <pre> block in the blog. Wanting to do the same for inline code fragments, I added a second plugin that asks for a line of code and puts it into HTML <code> tags.

The plugin can be found here as a DLL file, which you can just drop into the plugins directory underneath Windows Live Writer’s installation directory to install it. You’ll need HsColour installed and in your path. If you want the source code, look here. This plugin calls HsColour with the CSS option, so you’ll need to add a CSS stylesheet to define your syntax highlighting styles. Alternatively, you could edit the plugin to use -html instead. (WordPress, for example, charges extra for writing a CSS file even with their “security” limitations; did I mention I’d never start another blog with them?) If you choose CSS, you’ll need styles for the selectors .keyword, .keyglyph, .layout, .comment, .conid, .varid, .conop, .varop, .str, .chr, and .num. The only confusing one is .layout, which I initially assumed had something to do with omitting squiggly braces. It turns out it’s just more reserved symbols and should probably be set to the same thing as .keyglyph.
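If you would rather generate that stylesheet than type it out, here is a small Haskell sketch that prints one CSS rule per selector. Only the selector names come from the list above; every colour and font choice is a placeholder I made up for the example, and the names styles, renderRule, and stylesheet are hypothetical too. Following the note above, .layout gets the same declarations as .keyglyph.

```haskell
-- Class names are the ones listed in the post. The declarations are
-- placeholder values chosen purely for illustration.
styles :: [(String, String)]
styles =
  [ ("keyword",  "color: #0000cc; font-weight: bold")
  , ("keyglyph", "color: #aa3300")
  , ("layout",   "color: #aa3300")   -- same as keyglyph
  , ("comment",  "color: #008800; font-style: italic")
  , ("conid",    "color: #006666")
  , ("varid",    "color: #000000")
  , ("conop",    "color: #006666")
  , ("varop",    "color: #000000")
  , ("str",      "color: #aa0000")
  , ("chr",      "color: #aa0000")
  , ("num",      "color: #880088")
  ]

-- | Render one class name and its declarations as a single CSS rule.
renderRule :: (String, String) -> String
renderRule (cls, decls) = "." ++ cls ++ " { " ++ decls ++ "; }"

-- | The whole stylesheet, one rule per line.
stylesheet :: String
stylesheet = unlines (map renderRule styles)
```

Running putStr stylesheet in GHCi prints the rules, ready to paste into whatever CSS file your blog host lets you edit.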

HsColour + Plugin: Advantages

Result works on all browsers with basic CSS support.
The planet.haskell.org server has (boring) HsColour CSS entries, so it works there!
HsColour is probably better at correctly parsing the language.

HsColour + Plugin: Disadvantages

Only Haskell works; other techniques needed for other languages.
My plugin only works on this one piece of blog writing software.

Attempt 2.5: Another Loose End

Another annoying thing about WordPress is that their blog software converts normal old everyday quote characters to “smart quotes”. This is merely annoying for regular text, but it’s absolutely fatal for source code. (Did I mention I’d never start another blog with WordPress?) One more quick change produces a half-fix for this. You should only need this if you are using WordPress as well. The idea is to hide quotes from the WordPress code-mangler by writing unnecessary HTML entities. Adding the appropriate HTML escape codes (which WordPress won’t let me write, but are an ampersand, followed by #, followed by either 39 or 34, followed by a semicolon) at line 79 of HsColour’s HTML.hs does the trick.
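For illustration, the escaping idea looks roughly like this in Haskell. This is a standalone sketch, not the actual change to HsColour’s HTML.hs, and the name escapeQuotes is one I picked for the example. It rewrites only the two quote characters and passes everything else through untouched.

```haskell
-- | Replace straight quote characters with numeric HTML entities so that
-- a quote-"smartening" filter has no literal quotes left to rewrite.
escapeQuotes :: String -> String
escapeQuotes = concatMap esc
  where
    esc '\'' = "&#39;"   -- apostrophe / single quote
    esc '"'  = "&#34;"   -- double quote
    esc c    = [c]       -- everything else is left alone
```

For example, applying escapeQuotes to the source text putStrLn "it's" replaces both double quotes with &#34; and the apostrophe with &#39;, and a browser renders the result as the original characters.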

This sort of fixes the problem. Unfortunately, Windows Live Writer helpfully notices that these entity tags are not needed, and converts them back to plain quotes for me every once in a while. For the time being, my strategy for solving this is to be vigilant. If you notice a problem with smart quotes in source code in my blog, just say so and I’ll try to fix it.

That’s all I’ve got. Hope it was helpful.