
You might have read that, on October 28th, the W3C officially recommended HTML5. And you might know that this has something to do with apps and the Web. The question is: Does this concern you?

The answer, at least for citizens of the Internet, is yes: it is worth understanding both what HTML5 is and who controls the W3C. And it is worth knowing a little bit about the mysterious, conflict-driven cultural process whereby HTML5 became a “recommendation.” Billions of humans will use the Web over the next decade, yet not many of those people are in a position to define what is “the Web” and what isn’t. The W3C is in that position. So who is in this cabal? What is it up to? Who writes the checks?

The Web is a Millennial. It was first proposed twenty-five years ago, in 1989. Six years later, Netscape’s I.P.O. kicked off the Silicon Valley circus. When the Web was brand new, many computer-savvy people despised it—compared to other hypertext-publishing systems, it was a primitive technology. For example, you could link from your Web page to any other page, but you couldn’t know when someone linked to your Web page. Nor did the Web allow you to edit pages in your browser. To élite hypertext thinkers and programmers, these were serious flaws.

The Web was, however, very easy to set up and learn. It contained the seeds of its own transmission—anyone could learn HyperText Markup Language by reading a Web page and then viewing the raw HTML beneath it. The Web was made up of simple documents and images that linked to other simple documents and images.

The religion of technology is featurism, however, and so people began adding everything they could to the Web. How about displaying things in 3-D? How about text that blinks or text that scrolls across the page as a marquee? What about turning every single Web page into software? Different browsers—with names like Mosaic, Netscape, Internet Explorer, Cyberdog, Spyglass, Lynx, and Amaya—appeared, each carving out its own cultural and market niches.

With that complexity came Balkanization. Imagine that your Web browser only renders photographs in one format, and mine in another, and I send you a link to an image—you wouldn’t be able to see it. Instead of one Web, you would have many. Anarchy would ensue and photographers would complain and complain.

As this Balkanization was beginning to happen, people realized that there was a need for a group to decide on a common language that would include all the necessary features. Then that group would need to write a document that captured every aspect of the evolving HyperText Markup Language. This is the standardization process—technical diplomacy in the interest of commerce—and it is essential to the progress of the Internet. It is also not original to computing.

Consider the Buffalo Convention, of 1908, when player-piano manufacturers met at the Iroquois Hotel in Buffalo. At issue was the number of perforations per inch that would be punched into the rolls used to map out songs for the pianos; some people favored nine, some favored eight, and the difference meant increased costs, manufacturer distress, and customer confusion. In “Gathering of the Player Men at Buffalo,” the Music Trade Review described a heady scene in which Mr. P. B. Klugh, speaking for the Cable Company, said that it had adopted “the nine-to-the-inch scale” and that “they were not open to argument on the subject, as such a scale had given entire satisfaction.” Swayed, the manufacturers resolved the issue in favor of Klugh. As a result, we now live in a world where nine-holes-per-inch piano rolls are the standard. You would be a fool to build a player piano to any other metric.

Of course, the Web page is far more complex. It requires dozens of standards, governing words, sounds, pictures, interactions, protocols, code, and more. The role of Web parliament is played by the W3C, the World Wide Web Consortium. This is a standards body; it organizes meetings that allow competing groups to define standards, shepherding them from a “working draft” to “candidate recommendation” and “proposed recommendation,” and finally, if a standard has been sufficiently poked and prodded, granting the ultimate imprimatur, “W3C recommendation.”

The W3C has been meeting for twenty years, led by its director, Tim Berners-Lee, the principal creator of the Web. Its membership is drawn from close to four hundred academic, not-for-profit, and corporate organizations. Among its most engaged participants are large companies that build Web software and host enormous Web sites—ones like Google, Microsoft, and Facebook. They all pay dues for spots at the table—sixty-eight thousand five hundred dollars a year for the biggest U.S. firms, although not-for-profits and smaller firms pay far less, and less-prosperous nations adhere to a sliding scale.

The cultural mission of the W3C is to make the Web “available to all people, whatever their hardware, software, network infrastructure, native language, culture, geographical location, or physical or mental ability.” The way it accomplishes this is by committee, via standards documents.

If you want news about the development of the Web, you can visit the W3C home page and scan the most recent news. Reading through the standards, which are dry as can be, you might imagine that standardization is a polite, almost academic process, where wonks calmly debate topics like semicolon placement. This is not the case. Important standards are sometimes forged in polite discourse, and sometimes in a crucible of tribal rage, leaving behind a trail of open letters, back-channel sniping, and high-dudgeon blog posts.

This is not some secret shame; it is an expected part of a healthy process. “Technology standardization is commercial diplomacy,” wrote Stephen R. Walli, a business-strategy director at Hewlett-Packard and a veteran of many such efforts, in a paper on the subject, “and the purpose of individual players (as with all diplomats) is to expand one’s area of economic influence while defending sovereign territory.” Or, as Charles F. Goldfarb—who co-created a forerunner to HTML called Standard Generalized Markup Language, in 1974—once delicately put it, on an e-mail list: “Multi-year projects in a highly political arena with changing personnel contributes to a loss of focus.” Which is to say: standards, like laws, emerge from fundamental conflict.

Since its first iteration, HTML has defined a set of rules for adding markup to textual content. If you wanted something to be a headline, you’d add <h1> tags around it: <h1>Your Headline</h1>. The <h1> is the markup. The “Your Headline” is just character data. Your browser, programmed to interpret the rules of HTML, would show it in an appropriately large format.
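For the curious, here is a minimal sketch of a complete HTML5 page built around that headline. The page title and the comment are illustrative additions, not anything the article specifies, but the surrounding skeleton of doctype, head, and body is the standard scaffolding a browser or validator expects:

    <!DOCTYPE html>
    <html lang="en">
      <head>
        <meta charset="utf-8">
        <title>Your Headline</title>
      </head>
      <body>
        <!-- The <h1> is the markup; "Your Headline" is the character data. -->
        <h1>Your Headline</h1>
      </body>
    </html>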

That’s HTML at its essence: just a bunch of tags. But, with HTML5, the markup language has become a connective tissue that holds together a host of other technologies. Audio, video, pictures, words, headlines, citations, open-ended canvases, 3-D graphics, e-mail addresses: HTML5 lets you say that these things exist and gives you the means to pull them into a single page. You can even “validate” a page. At this writing, for example, Apple.com has one HTML5 error. That’s pretty good: the New York Times has a hundred and forty-one.
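As a rough sketch of that connective tissue (the file names, the drawing surface, and the address field below are hypothetical placeholders, not anything a real page requires), an HTML5 document can declare all of those things side by side:

    <!DOCTYPE html>
    <html lang="en">
      <head>
        <meta charset="utf-8">
        <title>One page, many kinds of things</title>
      </head>
      <body>
        <article>
          <h1>One page, many kinds of things</h1>
          <!-- Media elements; the source files named here are placeholders. -->
          <video src="talk.webm" controls></video>
          <audio src="theme.ogg" controls></audio>
          <!-- An open-ended canvas that scripts can draw on, in 2-D or 3-D. -->
          <canvas id="scratchpad" width="300" height="150"></canvas>
          <!-- A citation and an e-mail address, each marked for what it is. -->
          <p>See <cite>Gathering of the Player Men at Buffalo</cite> for an earlier standards fight.</p>
          <form>
            <label>Write to us: <input type="email" name="address"></label>
          </form>
        </article>
      </body>
    </html>

Run through the W3C’s automatic validator, a page like this either passes cleanly or produces a list of errors like the hundred and forty-one mentioned above.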

Validity, in this scenario, is an ideological construct. The promise is that, by hewing to the rules put forth by the W3C, your site will be accessible to more people than a less valid page would be. In truth, both pages work fine for most people; browsers are tolerant of all sorts of folderol. The ultimate function of any standards body is epistemological; given an enormous range of opinions, it must identify some of them as beliefs. The automatic validator is an encoded belief system. Not every Web site offers valid HTML, just as not every Catholic eschews pre-marital sex. The percentage of pure and valid HTML on the Web is probably the same as the percentage of Catholics who marry as virgins.