For my parents’ 26th anniversary, I decided to convert an online religious text they read into a beautiful, well-typeset book.

The online text was built by volunteers using an archaic version of Microsoft Word and looks like this:

Anyone who has read science or math literature has seen the high-quality output LaTeX produces.

Fortunately, LaTeX’s abilities extend far beyond the domain of mathematical symbols.

I was able to combine Clojure’s excellent HTML processing infrastructure (Enlive) and LaTeX to produce a nice-looking document.

The entire process took a few hours.

Here are two pages from the final output:

This blog post contains the LaTeX and Clojure snippets used to produce that output. I am not good at designing books or combining typefaces and would appreciate advice.

The LaTeX Pieces

The inspiration for this book came from this TeX StackExchange thread.

A user was working on replicating a 16th-century Bible (image from LaTeX Stack Exchange):

Using that piece as inspiration, I converged on the following theme:

A Garamond typeface: I think it fits the theme of religious texts quite well. Fortunately, the ebgaramond package makes it easy to typeset your entire document in this beautiful font.

```latex
\usepackage{ebgaramond}
```

is all you need to put in your LaTeX document.

Liberal use of ornaments on page borders, special pages, and so on.

The pgfornament package comes with very beautiful ornaments. When combined with TikZ, a seasoned user can create very sophisticated and professional documents.

I am not a seasoned user, so I was perfectly satisfied with using something out of the box. Each page in the book was going to have these ornaments in the page corners:

The pgfornament package combined with eso-pic allows you to achieve exactly that.

```latex
% Note: the length arithmetic inside \setlength below relies on the calc package.
\makeatletter
\AddToShipoutPicture{%
  \begingroup
  \setlength{\@tempdima}{2mm}%
  \setlength{\@tempdimb}{\paperwidth-\@tempdima-2cm}%
  \setlength{\@tempdimc}{\paperheight-\@tempdima}%
  \put(\LenToUnit{\@tempdima},\LenToUnit{\@tempdimc}){%
    \pgfornament[anchor=north west,width=2cm]{63}}
  \put(\LenToUnit{\@tempdima},\LenToUnit{\@tempdima}){%
    \pgfornament[anchor=south west,width=2cm,symmetry=h]{63}}
  \put(\LenToUnit{\@tempdimb},\LenToUnit{\@tempdimc}){%
    \pgfornament[anchor=north east,width=2cm,symmetry=v]{63}}
  \put(\LenToUnit{\@tempdimb},\LenToUnit{\@tempdima}){%
    \pgfornament[anchor=south east,width=2cm,symmetry=c]{63}}
  \endgroup
}
\makeatother
```

Next, I decided that each chapter would begin on a new page.

Chapter numbers and subtitles (if any) would be adorned above and below with ornaments. Essentially I was going for:

Note that the ornaments in the corner are the result of eso-pic.

The borders in the north, south, east and west, and the styling around the chapter title are accomplished by:

```latex
\newpage
\newgeometry{left=0cm,bottom=0cm,top=0cm,right=0cm}
\begin{tikzpicture}[remember picture, overlay]
  \node [anchor=north] at (current page.north) {\pgfornament[width=6cm,symmetry=h]{46}};
  \node [anchor=south] at (current page.south) {\pgfornament[width=6cm]{46}};
  \node [anchor=north,rotate=90] at (current page.west) {\pgfornament[width=6cm,symmetry=h]{46}};
  \node [anchor=north,rotate=-90] at (current page.east) {\pgfornament[width=6cm,symmetry=h]{46}};
  \node [inner sep=6pt] (chapter) at (current page.center) {\Huge Chapter I};
  \node [inner sep=12pt, below of=chapter, text width=10cm, align=center, outer sep=12pt] (title1) {};
  \node [inner sep=12pt, below of=title1, text width=10cm, align=center, outer sep=12pt] (title) {Salutations -- The Story of Grinding Wheat and Its Philosophical Significance};
  \node [anchor=north] at (title.south) {\pgfornament[width=5cm]{60}};
  \node [anchor=south] at (chapter.north) {\pgfornament[width=5cm,symmetry=h]{49}};
\end{tikzpicture}
\newpage
\restoregeometry
```

This forms the template for the book. Next, we populate the contents.

The Clojure Pieces

Enlive is a fantastic HTML parsing library for Clojure. The hierarchical structure of HTML is captured in a Clojure map:
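As an illustration, here is roughly what such a map looks like for a tiny, made-up HTML fragment. Each element is a map with `:tag`, `:attrs` and `:content` keys, and text nodes are plain strings. The fragment and the name `sample-body` are hypothetical and are not taken from the book.

```clojure
;; Hypothetical parse of:
;;   <body><p><b>Chapter I</b></p><p>Some text.</p></body>
;; Each element is a map of :tag, :attrs and :content; text nodes are strings.
(def sample-body
  {:tag :body
   :attrs nil
   :content [{:tag :p
              :attrs nil
              :content [{:tag :b :attrs nil :content ["Chapter I"]}]}
             {:tag :p
              :attrs nil
              :content ["Some text."]}]})

;; Walking down :content reaches the leaf text of the first paragraph.
(get-in sample-body [:content 0 :content 0 :content 0])
;; => "Chapter I"
```

Because the tree is ordinary Clojure data, it can be walked with plain functions such as `map` and `get-in`, with no special DOM API.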

To transform a single chapter, we traverse this map (tree) and transform the text as is appropriate. This is governed by where in the document the text occurs.

After manually inspecting a few chapters, I made a small table that mapped root-to-leaf paths in the DOM to handlers that would transform the text.

In Clojure this can be succinctly described as follows:

```clojure
;; Assumes the namespace requires clojure.string as `string` and imports
;; java.util.regex.Matcher and a StringEscapeUtils class (e.g. Apache Commons Lang).
(defn convert-chapter-parse
  ;; Entry point: walk the whole tree and join the pieces into one string.
  ([a-map]
   (->> (convert-chapter-parse a-map [])
        flatten
        (apply str)))
  ;; Recursive step: extend the path with this node, then handle its children.
  ([a-map parent-path]
   (let [current-node-path (conj parent-path
                                 (if (= (:tag a-map) :body)
                                   [(:tag a-map)]
                                   [(:tag a-map) (:attrs a-map)]))
         node-contents     (:content a-map)]
     (map (fn [an-item]
            (if (map? an-item)
              ;; Child element: recurse with the extended path.
              (convert-chapter-parse an-item current-node-path)
              ;; Text node: unescape HTML entities, escape & for LaTeX, format.
              (let [fixed-item (-> an-item
                                   (StringEscapeUtils/unescapeHtml3)
                                   (string/replace #"\&" (Matcher/quoteReplacement "\\&")))]
                (format-content fixed-item current-node-path))))
          node-contents))))
```

Essentially you keep track of where you are in the tree (relative to the root element) and then fetch a function from a table that transforms your text appropriately.

The table itself looks like this:

```clojure
[[:body] [:p nil]]
identity

[[:body] [:p nil] [:font {:size 5}]]
identity

[[:body] [:p nil] [:b nil]]
(fn [text] (str "\\section*{" text "}"))
```

Simple.
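The `format-content` function called by `convert-chapter-parse` is not shown in this post. A minimal sketch of what it could look like follows, assuming the table is an ordinary map from path to handler. The name `handlers` and the choice to pass unknown paths through unchanged are assumptions made for illustration.

```clojure
;; Hypothetical handler table: root-to-leaf path -> function from text to LaTeX.
(def handlers
  {[[:body] [:p nil]]          identity
   [[:body] [:p nil] [:b nil]] (fn [text] (str "\\section*{" text "}"))})

;; Look up the handler for a path and apply it to the text.
;; Assumption: paths missing from the table leave the text unchanged.
(defn format-content
  [text path]
  ((get handlers path identity) text))

;; Bold text directly inside a paragraph becomes an unnumbered section.
(format-content "Chapter I" [[:body] [:p nil] [:b nil]])
```

Vectors compare by value in Clojure, so the path built during the traversal can be used directly as a map key.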

Running this on the entire book gave me a neatly typeset book, hosted here.

Remarks

While this was a seemingly simple exercise (under 100 lines of code), HTML lets you produce the same rendered output with different markup, so a table built from a few chapters does not cover every path in the book. I found that converting 10 chapters at a time and inspecting each batch for quirks was a better approach for measuring coverage.

LaTeX isn’t particularly fond of how HTML, MS Word, and similar tools handle double quotes, apostrophes, and so on. I have a couple of string/replace calls, but they clearly weren’t enough to deal with the entire book. This is a problem that can only be solved by actually reading the book.
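One way to make that batch inspection less manual is to list the paths that appear in a batch but have no handler yet. The sketch below is not the code used for the book. The names `leaf-paths`, `unhandled-paths` and `batch-report` are hypothetical, and it assumes each chapter is a parsed `:body` map with the same path shape as the handler table.

```clojure
(require '[clojure.set :as set])

;; Collect the path to every text node, using the same path shape as the table.
(defn leaf-paths
  ([node] (leaf-paths node []))
  ([node parent-path]
   (let [path (conj parent-path
                    (if (= (:tag node) :body)
                      [(:tag node)]
                      [(:tag node) (:attrs node)]))]
     (mapcat (fn [child]
               (if (map? child)
                 (leaf-paths child path)
                 [path]))
             (:content node)))))

;; Paths seen in a batch of chapters that the table does not handle yet.
(defn unhandled-paths
  [known-paths chapters]
  (set/difference (set (mapcat leaf-paths chapters)) (set known-paths)))

;; One set of unhandled paths per batch of ten chapters.
(defn batch-report
  [known-paths all-chapters]
  (map #(unhandled-paths known-paths %) (partition-all 10 all-chapters)))
```

An empty set for a batch means every text node in those chapters already has a handler. Any path that does show up is a candidate for a new table entry.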
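For illustration, a quote-fixing pass might look like the sketch below. This is not the set of replacements used for the book. The function name `latex-quotes` is hypothetical, and the rule for straight quotes (one at the start of the text or after whitespace opens a quotation) is a heuristic that will still miss cases, such as a quote right after an opening parenthesis.

```clojure
(require '[clojure.string :as string])

;; Convert typographic and straight quotes to LaTeX-style quotes.
(defn latex-quotes
  [text]
  (-> text
      (string/replace "\u201C" "``")       ; left curly double quote
      (string/replace "\u201D" "''")       ; right curly double quote
      (string/replace "\u2018" "`")        ; left curly single quote
      (string/replace "\u2019" "'")        ; right curly single quote or apostrophe
      ;; Heuristic: a straight double quote at the start of the text or after
      ;; whitespace is treated as an opening quote.
      (string/replace #"(^|\s)\"" "$1``")
      ;; Any straight double quote still left is treated as a closing quote.
      (string/replace "\"" "''")))

(latex-quotes "He said \"hi\" and left.")
;; => "He said ``hi'' and left."
```

A pass like this could run on each text node before the handler is applied, but the output still needs proofreading.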

Overall, this turned out to be a really appreciated gift.