Making the Ugly Elegant: Templating With DOM

Update v16: Much better handling of HTML/XML-input and output

Templating is easy to do in any particular way, but doing it right is hard. I can’t count how many hip new template engines have popped up in just the last few years alone. I’m about to add one to the pile, but it is certainly not ‘hip’. It is, however, the closest I have ever gotten to the fabled golden fleece of “100% separation”. Unlike most other forms of templating, this really doesn’t mix logic and HTML, nor does it try to mask the blatant logic (“if this, then this”) by renaming ‘logic’ or using a {{special syntax}}.

What we’re going to do is this: take a static (and I mean static) HTML page, load it into the DOM as an XML tree and then use the PHP as your logic, removing bits of the template not needed and changing the text about.

I got this idea from this blog post: Your templating engine sucks and everything you have ever written is spaghetti code (yes, you). The article itself is long, aggressive, rambling and fails to demonstrate the principle concretely. I simply ignored all the text and focused on the core principle that was being noted: instead of embedding some form of code in the HTML (even if it’s just evolved search/replace syntax), just load the HTML into the DOM and manipulate it there, so that the HTML itself is ignorant of the templating.

The reason why this is not just the same as a {{special-syntax}} is that we are not mixing two different languages, syntaxes or programming models in one HTML file. If you change your templating engine, it’s still HTML. If you change your logic, it’s still HTML. Special syntaxes invent another language to intermix with HTML and thus add programmatic concepts to a declarative syntax, which is not clean separation no matter what you name it.

Ever since the ’Web was invented there has been a translucent, yet intransient divisor between those developers who understand the fundamental difference between a declarative markup syntax and a programming language, and those who don’t. Some learn to see this difference, others simply ignore it and believe that it is a swell idea to tie structured data to a structured program that will bit rot one thousand times quicker than the data will. If you are trying to replace HTML or CSS with JavaScript, you are doing it wrong and have just signed a maintenance contract from hell, with yourself, for yours and your data’s life.

Kroc Camen, I Don’t Want to Do This Any More

By doing it this way, the HTML file itself can be designed independently of the software, and whoever writes the HTML doesn’t have to know PHP. You could change the whole server language and it wouldn’t change the template one bit. More importantly, you can actually view the whole look of the template in the browser without running the software. The reason I’m adopting this templating approach for NoNonsense Forum is to make it easier for anybody to modify the look of their forum without having to learn PHP, and hopefully encourage more contribution from all skill levels.

It took a few revisions, two weeks and a lot of head-wracking to beat the DOM into something elegant, but here it is, NoNonsense Templating:

How It Works

The first thing to wrap your head around is that DOM templating works on the principle of mostly taking away rather than adding. Logic-wise this is more difficult to get used to than you would think; you will be used to adding data according to logic rather than “if this, then remove the thing that it is not”.

Firstly, your template should be a static HTML page that contains all of the content and ‘possibilities’ of your output, from which we will remove whatever is not relevant to the page. For example:

<p id="login" class="logged-out">
	You are not logged in.
</p>
<p id="login" class="logged-in">
	You are logged in as <b class="username">Bob</b>
</p>

In the PHP we can modify the HTML this way:

(Please note that templates you load must be valid XML and have a single root node, e.g. “ <html> ”, in order to work; the examples in this article omit this for simplicity. See XML caveats for more details.)

//load the template and provide an interface
$template = new DOMTemplate (file_get_contents ('test.html'));

//let's imagine the user is logged in, remove the logged-out section and set the username
$template->remove ('.logged-out');
$template->setValue ('.username', 'Alice');

The command “ remove ('.logged-out') ” finds all elements that have a class of “ logged-out ” and deletes them (You can also refer to IDs using ‘ #id ’).

The setValue method sets the text-content of an element, removing anything that was within. By replacing element content it means that you can provide dummy text to test the look and feel of your template, and it will be replaced with the real data. No more staring at {{NAME_GOES_HERE}} !
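For readers curious what these two calls amount to, here is a minimal sketch that uses PHP's built-in DOMDocument and DOMXPath directly instead of DOMTemplate. It is not DOMTemplate's actual implementation; the wrapping <div> root element and the variable names are assumptions made for the example.

```php
<?php
// Sketch only: plain DOM calls that mimic remove ('.logged-out') and
// setValue ('.username', 'Alice'). The <div> root is an assumption, added
// because XML requires a single root node.
$source = '<div>'
        . '<p class="logged-out">You are not logged in.</p>'
        . '<p class="logged-in">You are logged in as <b class="username">Bob</b></p>'
        . '</div>';

$dom = new DOMDocument ();
$dom->loadXML ($source);
$xpath = new DOMXPath ($dom);

// "remove": delete every element whose class attribute contains "logged-out".
// iterator_to_array takes a snapshot so removal does not disturb the iteration
$matches = $xpath->query ('.//*[contains(@class,"logged-out")]');
foreach (iterator_to_array ($matches) as $node) {
    $node->parentNode->removeChild ($node);
}

// "setValue": empty the element, then add the new value as a plain text node
foreach ($xpath->query ('.//*[contains(@class,"username")]') as $node) {
    while ($node->firstChild) {
        $node->removeChild ($node->firstChild);
    }
    $node->appendChild ($dom->createTextNode ('Alice'));
}

$result = $dom->saveXML ($dom->documentElement);
echo $result, "\n";
```

The dummy name “Bob” is only there so the static page looks right in a browser; it disappears as soon as the real value is set.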

Behind the scenes “ .logged-out ” becomes the full XPath “ .//*[contains(@class,"logged-out")] ”. The shorthand syntax also supports specifying a required element type and/or an attribute to target, e.g:

$template->setValue ('a.my-button@href', '/some_url');
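To make the shorthand less mysterious, the sketch below translates a small subset of it into XPath. The function name shorthandToXPath, the regular expression and the exact XPath produced for IDs and attributes are assumptions for illustration; only the “ .class ” expansion is taken from the text above, and the real DOMTemplate supports more forms than this.

```php
<?php
// Hypothetical helper, not part of DOMTemplate: converts "element", ".class",
// "#id" and "@attribute" shorthand into an XPath query. Anything that does
// not look like shorthand is assumed to be XPath already and passed through.
function shorthandToXPath (string $query): string {
    $pattern = '/^([a-z][a-z0-9]*)?(?:([.#])([a-z0-9_-]+))?(?:@([a-z0-9_-]+))?$/i';
    if (!preg_match ($pattern, $query, $m)) {
        return $query;
    }

    // no element type given means "any element"
    $element = ($m[1] ?? '') !== '' ? $m[1] : '*';
    $xpath   = './/' . $element;

    if (($m[2] ?? '') === '.') {
        $xpath .= '[contains(@class,"' . $m[3] . '")]';
    } elseif (($m[2] ?? '') === '#') {
        $xpath .= '[@id="' . $m[3] . '"]';
    }

    // a trailing "@name" targets the attribute rather than the element
    if (($m[4] ?? '') !== '') {
        $xpath .= '/@' . $m[4];
    }
    return $xpath;
}

echo shorthandToXPath ('.logged-out'), "\n";
echo shorthandToXPath ('a.my-button@href'), "\n";
```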

You can also use full XPath syntax:

//if using HTTPS, change the Google search box to use HTTPS too
if (@$_SERVER['HTTPS'] == 'on') $template->setValue (
	'//form[@action="http://google.com/search"]/@action',
	'https://encrypted.google.com/search'
);

Looping is always a sore point in templating. How do you take a chunk and repeat it down the page without having to define a ton of logic in your templates?

Looping with the DOM is shockingly elegant!

$item = $template->repeat ('.list-item');
foreach ($data as $value) {
	$item->setValue ('.item-name', $value);
	$item->next ();
}

The repeat method takes an element (via shorthand/XPath) to be used as the repeating template and copies it, then you just set and remove elements from the repeating template as if it were its own template. Once you’ve templated that iteration you call the next method and the HTML is added after the previous element, then the template repeater resets itself back to the original HTML so you can template it again!
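The copy, template, insert and reset cycle can be sketched with the raw DOM as follows. This is an illustration of the idea under my own assumptions (the sample markup, the data and removing the dummy original at the end), not the code DOMTemplate actually runs.

```php
<?php
// Sketch only: repeat an element once per data item using cloneNode.
$dom = new DOMDocument ();
$dom->loadXML ('<ul><li class="list-item"><b class="item-name">dummy</b></li></ul>');
$xpath = new DOMXPath ($dom);

// the element in the page that serves as the repeating template
$original = $xpath->query ('.//*[contains(@class,"list-item")]')->item (0);
$parent   = $original->parentNode;
$previous = $original;   // each new copy is inserted after this node

foreach (array ('Apple', 'Banana') as $value) {
    // start every iteration from a pristine copy of the original HTML
    $copy = $original->cloneNode (true);

    // insert the copy after the previous one (a null reference node appends)
    $parent->insertBefore ($copy, $previous->nextSibling);

    // template the copy as if it were its own little document
    $name = $xpath->query ('.//*[contains(@class,"item-name")]', $copy)->item (0);
    $name->nodeValue = $value;

    $previous = $copy;
}

// the untouched original only held dummy content, so drop it
$parent->removeChild ($original);

$result = $dom->saveXML ($dom->documentElement);
echo $result, "\n";
```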

Once you’ve made all your changes to the template, just retrieve the final HTML and output.

die ($template);

See the API for details of all the functions.

The Code

If you would like to see a real-world use of this templating system, with plenty of practical examples to draw from, examine the source code of my forum system, NoNonsense Forum.

If you don’t like the idea of targeting classes or IDs in your HTML, have a look at v4 of DOMTemplate that finds elements according to data-template attributes.

Caveats

Whitespace handling is good, but not perfect

In the case of repeating an element the whitespace within is kept, but the whitespace outside the element is not. This is not a major problem; it just means that the closing and opening tags of your lists will be paired (e.g. “ …</li><li>… ”).

The biggest issue is that when elements are removed, the whitespace around them remains, meaning that you get a number of blank lines in the output HTML where the elements used to be. There’s no direct way of handling this other than perhaps using a search/replace to remove blank lines in the HTML after it’s been templated.

One benefit of using the DOM, however, is that if you want to minify the HTML a little, you can just add “ $this->DOMDocument->preserveWhiteSpace = false; ” to the constructor function of DOMTemplate and the markup will be returned as a big blob with few line-breaks. If you add “ $this->DOMDocument->formatOutput = true; ” instead, the markup will be ‘tidied’ for you, re-nesting the elements neatly in an easy to read fashion.

XML woes

DOMTemplate stores and manipulates the template internally as strict XML. Thankfully, since v16, DOMTemplate automatically converts your source HTML to XML on loading and converts from XML to HTML on output, thus alleviating most of the input-strictness problems with earlier versions. There are, however, still a few caveats to remember:

HTML must be valid.

The automatic conversion of HTML named-entities (invalid in XML) into Unicode is still not comprehensive. 248 of the most common are covered, but a total of over 2100 exist. DOMTemplate may in a future version cover all 2100+ named entities, but until then ensure that your HTML source does not use any named-entities outside of the 248 recognised by DOMTemplate.

HTML that you load either through DOMTemplate or apply to the template using setValue must have only one root node, i.e. a list of elements cannot be used unless wrapped by an element.
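The strictness behind these caveats comes from the XML parser itself, which can be seen by feeding PHP's DOMDocument directly. The sketch below bypasses DOMTemplate entirely; the helper name parsesAsXML is hypothetical.

```php
<?php
// Sketch only: show which inputs a strict XML parser accepts.
// Collect parser errors internally instead of emitting PHP warnings.
libxml_use_internal_errors (true);

// Hypothetical helper: true if the source parses as well-formed XML
function parsesAsXML (string $source): bool {
    $dom = new DOMDocument ();
    $ok  = $dom->loadXML ($source);
    libxml_clear_errors ();
    return $ok;
}

$singleRoot   = parsesAsXML ('<p>one</p>');              // one root element
$twoRoots     = parsesAsXML ('<p>one</p><p>two</p>');    // a list of elements, no wrapper
$namedEntity  = parsesAsXML ('<p>caf&eacute;</p>');      // HTML named entity, unknown to XML
$numericChar  = parsesAsXML ('<p>caf&#233;</p>');        // numeric character reference

var_dump ($singleRoot, $twoRoots, $namedEntity, $numericChar);
```

Only the first and last inputs parse, which is why named entities have to be converted to Unicode before loading and why a list of sibling elements needs a wrapping element.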

The API

Instantiation

Provide the HTML to load as a string when instantiating the template class. It must be valid and have only one root element (e.g. <html> ).

$template = new DOMTemplate (file_get_contents ('index.html'));

If you are loading an XHTML document, or any XML file with a default namespace (e.g. <html xmlns="http://www.w3.org/1999/xhtml"> ), you must specify a prefix (any will do) and the namespace URL like so:

$template = new DOMTemplate (file_get_contents ('index.html'), 'html', 'http://www.w3.org/1999/xhtml');

All XPath queries you make with this template must prefix element names with the namespace, including for the shorthand:

$template->setValue ('//html:title', 'Hello World');                    //XPath
$template->setValue ('html:a#my-button@href', 'http://google.co.uk');   //shorthand

This bizarre requirement is a limitation in the design of XPath itself.
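The limitation is easy to reproduce with PHP's own DOMXPath, independent of DOMTemplate. In this small sketch the sample document is an assumption; the prefix “html” is arbitrary, as noted above.

```php
<?php
// Sketch only: XPath names without a prefix never match namespaced elements.
$dom = new DOMDocument ();
$dom->loadXML (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Dummy</title></head></html>'
);
$xpath = new DOMXPath ($dom);

// an unprefixed name only matches elements in no namespace: nothing is found
$plain = $xpath->query ('//title')->length;

// bind a prefix of our choosing to the namespace URL, then use it in the query
$xpath->registerNamespace ('html', 'http://www.w3.org/1999/xhtml');
$prefixed = $xpath->query ('//html:title')->length;

echo "without prefix: $plain, with prefix: $prefixed\n";
```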

Shorthand XPath Syntax

All of the methods that accept a query ( setValue , set , addClass , remove & repeat ) use a shorthand-syntax where you only need to provide the class (“ .class ”) or ID (“ #id ”) you want to target and the full XPath query is built for you. E.g. `.my-button`

An element type can be provided: `a#my-button`

An attribute name can be provided which will be the target of the setValue , set and remove methods: `a#my-button@href`

You can test attributes for values (the element will be selected, not the attribute):

`label@for="submit"`

You can specify the index of an element to select: `li[1]`

You can select child elements: `#list/li/a`

You can also just use full XPath query, as-is: `/html/head/title`

You can provide multiple targets by separating the queries with commas, e.g: `.header, .body, .footer`

You can intermix shorthand and full XPath like this.

(string) Output

To get the HTML out of the template, cast the template class object to a string, e.g.:

$template = new DOMTemplate ('<span>test</span>');
echo $template;

In instances where the intended type is ambiguous, use PHP’s casting syntax to force a string conversion:

$html = (string) $template;
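String casting of this kind relies on PHP's __toString magic method. The class below is a deliberately tiny stand-in, assumed for illustration only (TinyTemplate is not a DOMTemplate class and outputs XML rather than converting back to HTML), showing how a DOM wrapper can be echoed or cast.

```php
<?php
// Sketch only: a minimal DOM wrapper that can be used wherever a string is expected.
final class TinyTemplate {
    private DOMDocument $dom;

    public function __construct (string $xml) {
        $this->dom = new DOMDocument ();
        $this->dom->loadXML ($xml);
    }

    // called automatically by echo, string concatenation and (string) casts
    public function __toString (): string {
        return $this->dom->saveXML ($this->dom->documentElement);
    }
}

$template = new TinyTemplate ('<span>test</span>');
echo $template, "\n";          // implicit conversion
$html = (string) $template;    // explicit conversion
```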

repeat

repeat (string $query)

Takes a shorthand XPath query and returns a DOMTemplateRepeaterArray object instantiated with the element(s) selected in the query. This object supports the set , setValue , addClass & remove methods, in addition to the following method:

next

Takes the current HTML content of the elements within the DOMTemplateRepeaterArray object and appends it as a sibling to the previously repeated template (i.e. either the element(s) you instantiated the repeater with, or the element(s) that were added by the previous call to the next method), then resets its HTML content back to the original HTML it had when it was created.

In simple terms, it adds the templated HTML to the end of a list and then resets it back to the original HTML, to be used again. In practical terms, like this:

$item = $template->repeat ('.list-item');
foreach ($data as $value) {
	$item->setValue ('.item-name', $value);
	$item->next ();
}

setValue

setValue (string $query, string $value, [bool $asHTML=false])

Replaces the content of all elements matched with the shorthand XPath query with the given value. The string value is HTML-encoded (unless you give `asHTML` as true), so any HTML in the value will appear as-is, rather than be rendered as HTML. This method intelligently sets the value to elements, attributes and classes according to the XPath used. See addClass for details on HTML class behaviour.

$template->setValue ('#name', 'Kroc');
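The difference the asHTML flag makes can be sketched with the raw DOM: a text node is escaped when the document is serialised, whereas a parsed fragment becomes real elements. The helper name setContent and the sample markup are assumptions, and DOMTemplate's own handling may differ in detail.

```php
<?php
// Sketch only: set an element's content either as text or as parsed markup.
function setContent (DOMElement $element, string $value, bool $asHTML = false): void {
    // remove whatever the element contained before
    while ($element->firstChild) {
        $element->removeChild ($element->firstChild);
    }

    if ($asHTML) {
        // parse the value as markup; it must be well-formed XML
        $fragment = $element->ownerDocument->createDocumentFragment ();
        $fragment->appendXML ($value);
        $element->appendChild ($fragment);
    } else {
        // a text node is escaped on output, so tags show up literally
        $element->appendChild ($element->ownerDocument->createTextNode ($value));
    }
}

$dom = new DOMDocument ();
$dom->loadXML ('<p><span id="a">x</span><span id="b">x</span></p>');
$spans = $dom->getElementsByTagName ('span');

setContent ($spans->item (0), '<b>Kroc</b>');          // encoded
setContent ($spans->item (1), '<b>Kroc</b>', true);    // inserted as markup

$result = $dom->saveXML ($dom->documentElement);
echo $result, "\n";
```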

set

set (array $queries, [bool $asHTML=false])

Allows you to write code in a more compact way by specifying an array of shorthand XPath queries and their associated values to set.

$template->set (array (
	'#name' => 'Kroc',
	'#site' => 'http://camendesign.com'
));

addClass

addClass (string $query, string $class)

Adds the specified HTML class name to every element matched with the shorthand XPath query. If an element already has a class attribute, multiple class names will be separated by spaces when the new class is added.

$template->addClass ('#section', 'open');
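The class handling described here is plain string work on the class attribute. A minimal sketch, with the hypothetical helper name addClassTo and sample markup of my own:

```php
<?php
// Sketch only: append a class name, separating it from existing names with a space.
function addClassTo (DOMElement $element, string $class): void {
    // getAttribute returns an empty string when the attribute is absent
    $existing = trim ($element->getAttribute ('class'));
    $element->setAttribute ('class', $existing === '' ? $class : $existing . ' ' . $class);
}

$dom = new DOMDocument ();
$dom->loadXML ('<div><section id="one" class="panel"/><section id="two"/></div>');
$sections = $dom->getElementsByTagName ('section');

addClassTo ($sections->item (0), 'open');   // already has a class attribute
addClassTo ($sections->item (1), 'open');   // has none yet

$first  = $sections->item (0)->getAttribute ('class');
$second = $sections->item (1)->getAttribute ('class');
echo "$first / $second\n";
```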

remove

remove (string $query | array $queries)

Deletes all the elements (and their children) matched with the shorthand XPath query.

$template->remove ('.secret-stuff');

Also accepts an array in the format of “ 'xpath' => true|false ”.

If the value is false, the XPath will be skipped. This allows you to write compact removal code by not having to write “ if (x) $template->remove ('y'); ” several times in a row, e.g:

$template->remove (array (
	'.section-1' => $section == 1,
	'.section-2' => $section == 2,
	⋮
));
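The array form boils down to a loop that skips every query whose condition is false. The sketch below shows that pattern with the raw DOM and full XPath strings; the markup and the $loggedIn and $isAdmin flags are assumptions for the example.

```php
<?php
// Sketch only: conditional removal driven by an array of 'xpath' => bool.
$dom = new DOMDocument ();
$dom->loadXML (
    '<div>'
  . '<p class="logged-out">Please sign in</p>'
  . '<p class="logged-in">Welcome back</p>'
  . '<p class="admin">Admin tools</p>'
  . '</div>'
);
$xpath = new DOMXPath ($dom);

$loggedIn = true;
$isAdmin  = false;

$removals = array (
    './/*[contains(@class,"logged-out")]' => $loggedIn,    // true: removed
    './/*[contains(@class,"logged-in")]'  => !$loggedIn,   // false: skipped
    './/*[contains(@class,"admin")]'      => !$isAdmin,    // true: removed
);

foreach ($removals as $query => $condition) {
    if (!$condition) {
        continue;
    }
    // snapshot the matches before deleting them
    foreach (iterator_to_array ($xpath->query ($query)) as $node) {
        $node->parentNode->removeChild ($node);
    }
}

$result = $dom->saveXML ($dom->documentElement);
echo $result, "\n";
```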

For a good example of this style of writing, see the code for NoNonsense Forum.

In addition to this behaviour, you can also remove a class name from a class attribute whilst retaining any other class names present: target the class attribute with the XPath and give the class name to remove as the value, like so:

$template->remove (array ('a@class' => 'undesired'));
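Removing one class name while keeping the rest can be sketched as a split, filter and join on the attribute value. The helper name removeClassFrom is hypothetical, and dropping the attribute entirely once it is empty is my own choice rather than documented DOMTemplate behaviour.

```php
<?php
// Sketch only: remove a single class name from an element's class attribute.
function removeClassFrom (DOMElement $element, string $class): void {
    // split the attribute on whitespace, ignoring empty pieces
    $names = preg_split ('/\s+/', $element->getAttribute ('class'), -1, PREG_SPLIT_NO_EMPTY);
    $kept  = array_values (array_diff ($names, array ($class)));

    if ($kept) {
        $element->setAttribute ('class', implode (' ', $kept));
    } else {
        // assumption: an empty class attribute is not worth keeping
        $element->removeAttribute ('class');
    }
}

$dom = new DOMDocument ();
$dom->loadXML ('<p><a class="button undesired primary">one</a><a class="undesired">two</a></p>');
$links = $dom->getElementsByTagName ('a');

removeClassFrom ($links->item (0), 'undesired');
removeClassFrom ($links->item (1), 'undesired');

$result = $dom->saveXML ($dom->documentElement);
echo $result, "\n";
```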

History