I mentioned in my previous post that I “had come away with my head reeling from the massive length and depth of the often-changing specification”, which is entirely true. Printouts of the current draft of the HTML5 spec can reach, depending on your operating system and installed fonts, somewhere north of 900 pages. Yes: nine hundred. There are unabridged Stephen King novels that run shorter.

You might well say to yourself: “Self, is it just me, or are the people doing this completely off their everlovin’ rockers? Because the specification for something as fundamentally simple as HTML should reach maybe 200 pages, max.” You might even despair that the entire enterprise is doomed to failure precisely because nobody sane will ever sit down to read that entire doorstop.

But there’s no real reason to panic, because here’s the thing about the HTML5 specification that might not be obvious right away: it’s not for you. It’s for implementors. And that’s a good thing.

If you do start reading the HTML5 draft, you’ll run into really lengthy, excruciatingly detailed algorithms for, say, parsing a time component. Or moving through the browser’s history. Or submitting a form. There’s an entire (long) chapter on how to process the HTML syntax.

Those are all good things, actually. They greatly increase the chances of interoperability happening within our lifetimes. There’s no guessing about, well, much of anything. It’s all been exactingly defined, to the extent that one can exactingly define anything using a human language. A browser team doesn’t have to wonder, or even guess, what to do when the document has been completely parsed. It’s all spelled out. And the people on those browser teams will, in the end, be the people who read that entire doorstop. (Their sanity is another matter, and not discussed here.)

How is all that stuff relevant to you, the author? In the sense that when browser teams follow the spec, their products will be interoperable, which is to say consistent. (Just imagine that for a moment.)

Beyond that, though, the detailed implementation stuff isn’t relevant to you. You are not expected to know all those algorithms in order to write HTML documents. Pretty much all you need to know is the markup. That’s the part that should be no more than 200 pages, yeah?

Turns out it is, and by a comfortable margin. Michael(tm) Smith’s HTML5: The Markup Language is a version of the HTML5 draft with all of those eye-wateringly pedantic implementor sections stripped out, and when I generated a PDF it came in at 147 pages. That’s what you really need in order to get up to speed on what’s in HTML5. It’s for you.