For those who like (to argue about) semantics, HTML 5 is fantastic. Old presentational elements now have new semantic meanings, there’s a slew of new semantic elements for us to argue about, and we’ve even in<cite>d a riot or two. But that’s not all! Also in HTML 5 is microdata, a new lightweight semantic meta-syntax. Using attributes, we can define nestable groups of name-value pairs of data, called microdata, which are generally based on the page’s content. It gives us a whole new way to add extra semantic information and extend HTML 5.

Sometimes, it is desirable to annotate content with specific machine-readable labels, e.g. to allow generic scripts to provide services that are customised to the page, or to enable content from a variety of cooperating authors to be processed by a single script in a consistent manner.

For this purpose, authors can use the microdata features described in this section. Microdata allows nested groups of name-value pairs to be added to documents, in parallel with the existing content.

Instead of elements, these name-value pairs are defined via five attributes: itemscope, itemprop, itemref, itemtype and itemid.

Let’s go through these new attributes and see how to use them in practice with everyone’s favourite example band, Salter Cane.

This is a theoretical example from the spec, as the schema.org Book vocabulary currently only defines ISBN as an itemprop, although they’ve mentioned plans to add itemid global identifiers to their vocabularies in the future. Global identifiers are defined for the WHATWG vocabularies for vCard and vEvent, with values like UID:19950401-080045-40000F192713-0052 and UID:19970901T130000Z-123401@host.com respectively.

This defines an item containing information about a book identified by the ISBN number 0321687299, as long as the http://vocab.example.com/book vocabulary defines global identifiers like this.
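As a rough sketch, the markup for such an item might look like the following. The urn:isbn: form of the itemid value, the title property and its placeholder text are all assumptions for illustration, since http://vocab.example.com/book is only a stand-in vocabulary.

```html
<!-- Hypothetical vocabulary: vocab.example.com is an example domain,
     so the "title" property name is assumed, not defined anywhere. -->
<p itemscope itemtype="http://vocab.example.com/book" itemid="urn:isbn:0321687299">
  <cite itemprop="title">Book title goes here</cite>
</p>
```

Note that itemid sits on the same element as itemscope and itemtype, which is the only place it is allowed.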

Sometimes an item may be identified by a unique identifier, such as a book by its ISBN. This can be done in microdata using a global identifier via the attribute itemid="", if specified by the vocabulary. itemid can only appear on an element with both itemscope and itemtype="", and must be a valid URL.

This allows you to use multiple vocabularies in the same code snippet, even if they use the same property names.
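Here’s a small sketch of what mixing vocabularies through URL-based itemprop names could look like. Both property URLs are made up for this illustration (they live on example domains), so treat them as placeholders rather than real vocabularies.

```html
<!-- Two hypothetical vocabularies that both define a "name" property.
     Because each itemprop is a full URL, the names can't clash,
     and no itemtype is needed on the item. -->
<p itemscope>
  <span itemprop="http://vocab.example.com/band/name">Salter Cane</span> features
  <span itemprop="http://vocab.example.org/person/name">Chris Askew</span>
  on guitar and vocals.
</p>
```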

Alternatively, you can use URLs for itemprop names. In this case, there’s no need to use itemtype as the vocabulary information is already contained in the name. These are referred to as globally unique names. While vocabulary-based names must be used inside a typed item to have the vocabulary-defined meaning, you can use a URL itemprop name anywhere.

This example defines the property url with the value http://saltercane.com/ and the property name with the value Salter Cane according to the http://schema.org/MusicGroup vocabulary (MusicGroup is a specialised kind of Organization vocabulary on schema.org).
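A sketch of markup matching this description is below. The surrounding sentence is invented for illustration. Only the itemtype, the two property names and their values come from the description above.

```html
<!-- A typed item: url and name take their meaning from the MusicGroup vocabulary -->
<p itemscope itemtype="http://schema.org/MusicGroup">
  I went to hear
  <a itemprop="url" href="http://saltercane.com/"><span itemprop="name">Salter Cane</span></a>
  last night.
</p>
```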

We can tie an item to a microdata vocabulary by giving it a type, specified via the attribute itemtype="" on an element with itemscope. The itemtype="" value is a URL representing the microdata vocabulary. Note that this URL is only a text string that acts as a unique vocabulary identifier; it doesn’t actually need to link to an actual webpage (although it’s nice when it does). After doing this, we can use names in the vocabulary as itemprop names to apply vocabulary-defined semantics.

This defines the properties url, name, and date. Additionally, it references the ID band-members, which contains the item members with four name properties, each of which has a different value.
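Markup along these lines would produce that result. The wording of both paragraphs is an assumption. The property names, the band-members ID and the values follow the description above.

```html
<!-- The item itself: itemref pulls in the element with id="band-members" -->
<p itemscope itemref="band-members">
  I’m going to the
  <a itemprop="url" href="http://www.saltercane.com/"><span itemprop="name">Salter Cane</span></a>
  gig <time itemprop="date" datetime="2010-07-18">next week</time>.
</p>

<!-- Elsewhere on the page, not a descendant of the item above -->
<p id="band-members" itemprop="members" itemscope>
  Salter Cane are
  <span itemprop="name">Chris Askew</span>,
  <span itemprop="name">Jeremy Keith</span>,
  <span itemprop="name">Jessica Spengler</span> and
  <span itemprop="name">Jamie Freeman</span>.
</p>
```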

Items can use non-descendant properties (name-value pairs that aren’t children of the itemscope element) via the attribute itemref="". This attribute is a list of the IDs of properties or nested items elsewhere on the page.

This defines the properties guitar and vocals, both of which have the value Chris Askew.
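For instance, markup like this would give that result (the sentence around the span is invented for illustration):

```html
<!-- One element, two space-separated property names sharing one value -->
<p itemscope>
  <span itemprop="guitar vocals">Chris Askew</span> plays guitar and sings.
</p>
```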

One element can also have multiple properties (multiple itemprop="" names separated by spaces) with the same value.

This defines the property name with four values: Chris Askew, Jeremy Keith, Jessica Spengler and Jamie Freeman.
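A sketch of markup that yields these four values, with the surrounding sentence assumed for illustration:

```html
<!-- The same property name repeated, each occurrence adding another value -->
<p itemscope>
  Salter Cane are
  <span itemprop="name">Chris Askew</span>,
  <span itemprop="name">Jeremy Keith</span>,
  <span itemprop="name">Jessica Spengler</span> and
  <span itemprop="name">Jamie Freeman</span>.
</p>
```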

Items can have multiple properties with the same name and different values.

Items that aren’t part of other items (i.e., anything with itemscope but not itemprop, or the child of an element with itemprop) are called top-level microdata items. The microdata API returns top-level microdata items and their properties, which includes nested items.

This defines an item with two properties, name and members. The name is Salter Cane, and the members is a nested item, containing the property name with the value Jamie Freeman. Note that members doesn’t have a text value.
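In markup, that could look something like the following. The sentence wording is an assumption. The property names and values are the ones just described.

```html
<p itemscope>
  The <span itemprop="name">Salter Cane</span> drummer is
  <!-- itemprop + itemscope on one element: the value of "members" is a nested item -->
  <span itemprop="members" itemscope>
    <span itemprop="name">Jamie Freeman</span>
  </span>.
</p>
```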

We can make a property into a nested item by adding itemscope to an element with itemprop .

This defines an item with three properties: the url is http://www.saltercane.com/, the name is Salter Cane, and the date is 2010-07-18.
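A sketch of markup that produces these three properties is below. The inner span is the additional itemprop that captures the link text as name, and the sentence itself is assumed for illustration.

```html
<p itemscope>
  I’m going to the
  <!-- url comes from href; the nested span supplies the link text as "name" -->
  <a itemprop="url" href="http://www.saltercane.com/"><span itemprop="name">Salter Cane</span></a>
  gig <time itemprop="date" datetime="2010-07-18">next week</time>.
</p>
```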

It’s still possible to use the text of one of these elements as its value, e.g., <a href="">desired value</a>. We just need to add an additional itemprop on an element wrapped around that text.

Conversely, the URL-containing attributes of some other HTML 5 elements, such as <base href="">, <script src=""> and <input src="">, are not used as property values.

Similarly, the <time> element’s value is 2010-07-18 and not its text content (i.e., “next week”).

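Note that the link’s itemprop="url" value is http://www.saltercane.com/ and not the element’s “Salter Cane” text content. In microdata, the following elements contribute their URLs as values: <a>, <area> and <link> (via href); <audio>, <embed>, <iframe>, <img>, <source>, <track> and <video> (via src); and <object> (via data).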

This defines an item with the properties url and date containing the values http://www.saltercane.com/ and 2010-07-18, respectively.
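A sketch of markup matching this description, using the “Salter Cane” link text and the “next week” time text mentioned in this section (the rest of the sentence is assumed):

```html
<p itemscope>
  I’m going to the
  <!-- value of "url" is the href, not the text "Salter Cane" -->
  <a itemprop="url" href="http://www.saltercane.com/">Salter Cane</a>
  gig
  <!-- value of "date" is the datetime attribute, not the text "next week" -->
  <time itemprop="date" datetime="2010-07-18">next week</time>.
</p>
```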

For some elements, an itemprop’s value comes from an attribute of the element, not the element’s text. This applies to values from attributes containing URLs, the datetime attribute, and the content attribute.

itemprop names can be words or URL strings. Using URLs makes the name globally unique. If you use words, it’s best to use a vocabulary and the names defined in the vocabulary, which also makes the names unique. We cover this in the section “Typed items and globally unique names”.

The presence of itemscope on the <p> element makes it into a microdata item. The attribute itemprop on a descendant element defines a property of this item (in this case, name) and associates it with the value Salter Cane (the <span>’s content). An item must have at least one itemprop to be valid.
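The simplest possible case just described could be sketched like this. The sentence wording is an assumption. The <p>, <span>, name property and Salter Cane value are as described.

```html
<!-- itemscope creates the item; itemprop on a descendant adds a name-value pair -->
<p itemscope>
  I’m going to the <span itemprop="name">Salter Cane</span> gig next week. Excited!
</p>
```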

Microdata in action

So now that we know how, why would we want to use microdata?

One use is adding extra semantics or data that we can manipulate via JavaScript in a similar way to custom data attributes (data-*). But if we use a vocabulary via itemtype or URL-based itemprop names, microdata becomes considerably more powerful.

While microdata is machine-readable without needing to know the vocabulary, using a vocabulary means others can know what our properties mean. This allows the data to take on a life of its own. Say what? Well, in effect, using a vocabulary makes microdata a lightweight API for your content.

If you visited someone’s homepage, wouldn’t it be great if you could add their contact information to your address book automatically? The same is true for adding an event you’re attending to your calendar. As the syntax examples were a bit example-y, let’s see how to do that using a real-world example — an upcoming event I’m organising (well, it was upcoming!):

<section>
  <h3><a href="http://atnd.org/events/5181" title="WDE-ex Vol11『iPad のウェブデザイン：私たちがみつけたこと 』 : ATND">WDE-ex Vol.11 — Designing for iPad: Our experience so far</a></h3>
  <p>On <time datetime="2010-07-21T19:00:00+09:00">July 21st 19:00 </time>-<time datetime="2010-07-21T20:00:00+09:00">20:00</time>
    at <a href="http://www.apple.com/jp/retail/ginza/map/">Apple Ginza</a>,
    <a href="http://informationarchitects.jp/" title="iA">Oliver Reichenstein, CEO of iA</a>,
    will share the lessons they’ve learned while creating three iPad apps and one iPad website.</p>
</section>

WDE-ex Vol.11 — Designing for iPad: Our experience so far

On July 21st 19:00 - 20:00 at Apple Ginza, Oliver Reichenstein, CEO of iA, will share the lessons they’ve learned while creating three iPad apps and one iPad website.

A Web Directions East event, in code and displayed

Now we could start making up our own itemprop names on an ad-hoc basis, but this effectively prevents anyone else from using our data. By using a vocabulary and following its rules, others can also use our data. It’s a good idea to use a vocabulary, so where do we find one?

Using Google’s Rich Snippets vocabularies

Google also has some basic vocabularies (precursors of schema.org vocabularies) for the following kinds of data, under the moniker of Rich Snippets:

people

businesses and organizations

events

products

reviews

recipes

These vocabularies support microformats and RDFa, two other ways to add extra semantics to our content, in addition to microdata. Apart from this difference, they’re basically identical to the matching schema.org vocabularies, except they use www.data-vocabulary.org instead of schema.org in the itemtype. While Google still supports them, the newer schema.org offers more vocabularies that are also supported by Bing and Yahoo, so choose schema.org vocabularies as long as you’re happy with microdata. You might still want to check out the Rich Snippets documentation, as it includes code samples and is generally better than schema.org’s at the time of writing.
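Since the advice here is to prefer schema.org, here’s a hedged sketch of the earlier event marked up with the schema.org Event vocabulary. The property choices (name, url, startDate, endDate, location and performer, plus the nested Place and Person types) are my reading of schema.org rather than markup taken from this article, so check them against the schema.org documentation before relying on them.

```html
<!-- Assumed mapping onto schema.org Event, Place and Person; not from the original article -->
<section itemscope itemtype="http://schema.org/Event">
  <h3><a itemprop="url" href="http://atnd.org/events/5181"><span itemprop="name">WDE-ex Vol.11 — Designing for iPad: Our experience so far</span></a></h3>
  <p>On <time itemprop="startDate" datetime="2010-07-21T19:00:00+09:00">July 21st 19:00</time>-<time itemprop="endDate" datetime="2010-07-21T20:00:00+09:00">20:00</time>
    at
    <!-- location is a nested item typed as a Place -->
    <span itemprop="location" itemscope itemtype="http://schema.org/Place"><a itemprop="url" href="http://www.apple.com/jp/retail/ginza/map/"><span itemprop="name">Apple Ginza</span></a></span>,
    <!-- the speaker is a nested item typed as a Person -->
    <span itemprop="performer" itemscope itemtype="http://schema.org/Person"><a itemprop="url" href="http://informationarchitects.jp/"><span itemprop="name">Oliver Reichenstein</span>, CEO of iA</a></span>,
    will share the lessons they’ve learned while creating three iPad apps and one iPad website.</p>
</section>
```

As with the untyped examples, the start and end values come from the datetime attributes and the url values from the href attributes, not from the visible text.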

Using WHATWG/microformats.org vocabularies

If you’re familiar with microformats or want more properties than Google’s vocabularies offer, the WHATWG HTML 5 specification actually contains microdata vocabularies for both the vCard and vEvent specifications that hCard and hCalendar are based on, plus a licensing vocabulary. Let’s take our earlier example and rewrite it using these vocabularies instead:

<section itemscope itemtype="http://microformats.org/profile/hcalendar#vevent">
  <h3><a itemprop="url" href="http://atnd.org/events/5181" title="WDE-ex Vol11『iPad のウェブデザイン：私たちがみつけたこと 』 : ATND"><span itemprop="summary">WDE-ex Vol.11 — Designing for iPad: Our experience so far</span></a></h3>
  <p itemprop="description">On <time itemprop="dtstart" datetime="2010-07-21T19:00:00+09:00">July 21st 19:00</time>-
    <time itemprop="dtend" datetime="2010-07-21T20:00:00+09:00">20:00</time> at
    <span itemprop="location" itemscope itemtype="http://microformats.org/profile/hcard"><a itemprop="url" href="http://www.apple.com/jp/retail/ginza/map/"><span itemprop="fn org"> Apple Ginza</span></a></span>,
    <span itemscope itemtype="http://microformats.org/profile/hcard"><a itemprop="url" href="http://informationarchitects.jp/" title="iA"><span itemprop="fn"> Oliver Reichenstein</span>, CEO of iA</a></span>,
    will share the lessons they’ve learned while creating three iPad apps and one iPad website.</p>
</section>

An HTML code sample with microdata describing an event, using vCard- and vEvent-based vocabularies

Currently, search engines don’t map these vocabularies to schema.org ones. It’s possible they will at some stage, so decide which vocabularies to use based on what information you want to mark up, as the data is accessible regardless.

Criticism on microformats.org

Despite these vocabularies being based on vCard and vEvent and using microformats.org as their URLs, the microformats wiki actually warns against using the vCard and vEvent microdata vocabularies, stating:

“For common semantics on the web … microformats are still simpler and easier than microdata, and are already well implemented across numerous services and tools.” (Source: the “Microdata” page on the microformats wiki)

Personally, I think the difference is marginal. If you use the recommended microformat profile links, I’d say it’s a wash. (But of course no one does ;) Microdata is actually simpler to use for date/time data than the microformat equivalents (although it is less permissive for fuzzy or ancient dates), and it’s more explicit; for example, it avoids the internationalisation issues of the “implied fn optimisation”. Tool support is a valid concern, but again I expect this to change over time, as microdata is relatively new after all.