I’m about to embark on a documentation project, and I’ve been thinking quite a bit about the tools I’d ideally like to use for the job.

Previously, when I’ve done a lot of writing, it’s been with LaTeX or, on rare and painful occasions, with a word processor or a dedicated tech pubs tool like FrameMaker.

What I like about a text-based markup language is that I can use normal revision control tools to see how my work has evolved. Leaving aside whether word processors are more or less appropriate than feeding text markup into a compiler, using revision control tools with opaque word processor files is pointless, as all I get back is a blob. The internal change-tracking features of word processors are worse still.

I have had something of a problem with my choice of markup language. My normal default would be to use LaTeX, but it’s long been difficult to produce good HTML from LaTeX; people have a reasonable expectation that what they’re reading will look attractive on the web. The standard tool for producing HTML from LaTeX was long latex2html, which produces atrocious, unnavigable HTML.

Believing that this was still the case, my fallback plan was to use DocBook, but I am finding the markup to be painfully obtrusive. Worse, the pipeline for producing HTML and PDF files is opaque to me; I crank a pile of stuff into Apache FOP, bits of formatted sausage come out the other end, and I don’t really know what happens in the middle. The typeset PDF files produced by FOP, at least using the style sheets I’ve borrowed from the Subversion Book, look clunky, which makes me unhappy. It had been looking like I could get either attractive PDF or HTML output with a particular choice of tool, but not both. At least, not unless I wanted to go down the tortured path of converting DocBook to LaTeX for PDFs.

Some determined Googling this evening yielded the TeX4ht package, which produces XHTML, uses CSS, and seems well integrated into the normal TeX environment. I also found some likely knobs that I can try twiddling with pdfTeX, to get it to use hyperlinks more liberally while still typesetting my document beautifully.
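
As a rough sketch of what that knob-twiddling might look like, here is a minimal preamble. It assumes the hyperref package is the route taken, which is a guess on my part rather than a settled choice, and the colours, the label name, and the example URL are purely illustrative.

```latex
% Minimal sketch: hyperlinked PDF output from pdfLaTeX.
% Assumption: hyperref is the package in use; option values are illustrative.
\documentclass{article}

\usepackage[
  colorlinks=true,  % colour the link text instead of drawing boxes around it
  linkcolor=blue,   % internal links: table of contents, \ref, and so on
  urlcolor=blue,    % external links made with \url or \href
  bookmarks=true    % build a PDF outline from the sectioning commands
]{hyperref}

\begin{document}

\tableofcontents

\section{Introduction}\label{sec:intro}

% \ref becomes a clickable internal link; \url becomes an external one.
See Section~\ref{sec:intro}, or visit \url{https://example.org/}.

\end{document}
```

Running pdflatex over this twice should resolve the cross-references and the table of contents, both of which then become clickable in the PDF.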

So the main advantage that DocBook held for me, the decent XHTML and CSS output I could produce for negligible effort, seems gone. I think I can return to the familiar free tools that I’ve been using for over a decade. What a relief!