The English version of this specification is the only normative version. Non-normative translations may also be available.

This document is also available in this non-normative format: diff to previous version

Please check the errata for any errors or issues reported since publication.

This specification defines rules and guidelines for adapting the RDFa Core 1.1 and RDFa Lite 1.1 specifications for use in HTML5 and XHTML5. The rules defined in this specification not only apply to HTML5 documents in non-XML and XML mode, but also to HTML4 and XHTML documents interpreted through the HTML5 parsing rules.

This document has been reviewed by W3C Members, by software developers, and by other W3C groups and interested parties, and is endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.

This document was published by the RDFa Working Group as a Recommendation. If you wish to make comments regarding this document, please send them to public-rdfa-wg@w3.org (subscribe, archives). All comments are welcome.

A sample test harness is available for software developers. This set of tests is not intended to be exhaustive. A community-maintained website contains more information on further reading, developer tools, and software libraries that can be used to extract and process RDFa data from web documents.

The specification makes use of the rdf:HTML datatype. This feature is non-normative, because the equality of the literal values depends on DOM4 [dom4], a specification that has not yet reached W3C Recommendation status. See the relevant RDF 1.1 specification [rdf11-concepts] for further details.

This specification is an extension to the HTML5 language. All normative content in the HTML5 specification, unless specifically overridden by this specification, is intended to be the basis for this specification.

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

Today's web is built predominantly for human readers. Even as machine-readable data begins to permeate the web, it is typically distributed in a separate file, with a separate format, and very limited correspondence between the human and machine versions. As a result, web browsers can provide only minimal assistance to humans in parsing and processing web pages: browsers only see presentation information. RDFa is intended to solve the problem of marking up machine-readable data in HTML documents. RDFa provides a set of HTML attributes to augment visual data with machine-readable hints. Using RDFa, authors may turn their existing human-visible text and links into machine-readable data without repeating content.

The user agent conformance criteria are listed below, all of which are mandatory:

A user agent is considered to be a type of RDFa processor when the user agent stores or processes RDFa attributes and their values. The RDFa Processor Conformance and User Agent Conformance sections are separate because an implementation can be a valid HTML5 RDFa processor without being a valid HTML5 user agent (for example, by providing only a very small subset of rendering functionality).

The RDFa processor conformance criteria are listed below, all of which are mandatory:

XML mode XHTML5+RDFa 1.1 documents SHOULD be labeled with the Internet Media Type application/xhtml+xml as defined in section 12.3 of the HTML5 specification [html5], MUST NOT use a DOCTYPE declaration for XHTML+RDFa 1.0 or XHTML+RDFa 1.1, and SHOULD NOT use the @version attribute.

Non-XML mode HTML+RDFa 1.1 documents SHOULD be labeled with the Internet Media Type text/html as defined in section 12.1 of the HTML5 specification [html5].
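The labeling rules above for XML mode and non-XML mode documents can be sketched as a small checker. This is an illustrative sketch only, not part of the specification: the function name `check_labeling`, its parameters, and the decision to detect the XHTML+RDFa DOCTYPEs by searching for their public identifiers are all assumptions made for the example.

```python
# Public identifiers of the XHTML+RDFa 1.0 and 1.1 DOCTYPE declarations.
# Matching on these strings is an assumption of this sketch.
RDFA_DOCTYPE_IDS = (
    "-//W3C//DTD XHTML+RDFa 1.0//EN",
    "-//W3C//DTD XHTML+RDFa 1.1//EN",
)


def check_labeling(xml_mode, media_type, doctype="", has_version_attr=False):
    """Return a list of (level, message) pairs; level is "MUST" or "SHOULD".

    xml_mode         -- True for XHTML5+RDFa, False for HTML+RDFa (hypothetical flag)
    media_type       -- value of the Content-Type header, parameters allowed
    doctype          -- raw DOCTYPE declaration text, if any
    has_version_attr -- whether the root element carries @version
    """
    issues = []
    expected = "application/xhtml+xml" if xml_mode else "text/html"

    # Ignore parameters such as "; charset=utf-8" when comparing.
    if media_type.split(";")[0].strip().lower() != expected:
        issues.append(("SHOULD", f"label the document as {expected}"))

    if xml_mode:
        if any(public_id in doctype for public_id in RDFA_DOCTYPE_IDS):
            issues.append(("MUST", "remove the XHTML+RDFa DOCTYPE declaration"))
        if has_version_attr:
            issues.append(("SHOULD", "avoid the @version attribute"))
    return issues
```

An empty list means none of the three labeling rules was triggered; it does not establish full document conformance.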

An example of a conforming HTML+RDFa document, with the RDFa portions highlighted in green:

There are two types of document conformance criteria for HTML documents containing RDFa semantics: HTML+RDFa and HTML+RDFa Lite.

The key words MAY, MUST, MUST NOT, RECOMMENDED, SHOULD, and SHOULD NOT are to be interpreted as described in [RFC2119].

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

3. Extensions to RDFa Core 1.1

The RDFa Core 1.1 [rdfa-core] specification is the base document on which this specification builds. RDFa Core 1.1 specifies the attributes and syntax, in Section 5: Attributes and Syntax, and processing model, in Section 7: Processing Model, for extracting RDF from a web document. This section specifies changes to the attributes and processing model defined in RDFa Core 1.1 in order to support extracting RDF from HTML documents.

The requirements and rules, as specified in RDFa Core and further extended in this document, apply to all HTML5 documents. An RDFa processor operating on both HTML and XHTML documents, specifically on their resulting DOMs or infosets, MUST apply these processing rules for HTML4, HTML5 and XHTML5 serializations, DOMs and/or infosets.

3.2 Modifying the Input Document

RDFa's tree-based processing rules, outlined in Section 7.5: Sequence of the RDFa Core 1.1 specification [rdfa-core], allow an input document to be automatically corrected, cleaned-up, re-arranged, or modified in any way that is approved by the host language prior to processing. Element nesting issues in HTML documents SHOULD be corrected before the input document is translated into the DOM, a valid tree-based model, on which the RDFa processing rules will operate. Any mechanism that generates a data structure equivalent to the HTML5 or XHTML5 DOM, such as the html5lib library, MAY be used as the mechanism to construct the tree-based model provided as input to the RDFa processing rules.

3.3 Specifying the Language for a Literal

According to RDFa Core 1.1 the current language MAY be specified by the host language. In order to conform to this specification, RDFa processors MUST use the mechanism described in The lang and xml:lang attributes section of the [html5] specification to determine the language of a node. If the final encapsulating MIME type for an HTML fragment is not decided on while editing, it is RECOMMENDED that the author specify both @lang and @xml:lang where the value in both attributes is exactly the same.

Note: The HTML5 specification takes the Content-Language HTTP header into account when determining the language of an element. Some RDFa processor implementations, like those written in JavaScript, may not have access to this header and will be non-conforming in the edge case where the language is only specified in the Content-Language HTTP header. In these instances, RDFa document authors are urged to set the language in the document via the @lang attribute on the html element in order to ensure that the document is interpreted correctly across all RDFa processors.
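The language lookup described in this section can be sketched as a walk up the element tree. This is a simplified illustration, not the normative HTML5 algorithm: the function name `element_language` and the `content_language` parameter are hypothetical, the input is assumed to be well-formed XHTML parsed with Python's standard library, and the `meta` pragma step of the HTML5 rules is omitted.

```python
import xml.etree.ElementTree as ET

# ElementTree key for the lang attribute in the XML namespace (xml:lang).
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def element_language(root, target, content_language=None):
    """Return the language of `target`, or None if it cannot be determined.

    root             -- root element of the parsed document
    target           -- element whose language is wanted
    content_language -- value of the Content-Language HTTP header, if the
                        processor has access to it (hypothetical parameter)
    """
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}

    node = target
    while node is not None:
        # When both attributes are present, xml:lang takes precedence.
        if XML_LANG in node.attrib:
            return node.attrib[XML_LANG]
        if "lang" in node.attrib:
            return node.attrib["lang"]
        node = parents.get(node)

    # Fall back to the header only when it names exactly one language.
    if content_language:
        tags = [t.strip() for t in content_language.split(",") if t.strip()]
        if len(tags) == 1:
            return tags[0]
    return None
```

A processor without access to HTTP headers would call this with `content_language=None`, which is exactly the edge case the note above warns about: a document that declares its language only in the header yields no language at all.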

3.4 Invalid XMLLiteral Values

When generating literals of type XMLLiteral, the processor MUST ensure that the output XMLLiteral is a namespace well-formed XML fragment. A namespace well-formed XML fragment has the following properties:

1. The XML fragment, when placed inside of a single root element, MUST validate as well-formed XML. The normative language that describes a well-formed XML document is specified in Section 2.1 "Well-Formed XML Documents" of the XML specification.

2. The XML fragment, when placed inside of a single root element, MUST retain all active namespace information. The currently active attributes declared using @xmlns and @xmlns: that are stored in the RDFa processor's current evaluation context in the IRI mappings MUST be preserved in the generated XMLLiteral. The PREFIX value for @xmlns:PREFIX MUST be entirely transformed into lower-case characters when preserving the value in the XMLLiteral. All active namespaces declared via @xmlns, @xmlns:, and @prefix MUST be placed in each top-level element in the generated XMLLiteral, taking care to not overwrite pre-existing namespace values.

An RDFa processor that transforms the XML fragment MUST use the Coercing an HTML DOM into an infoset algorithm, as specified in the HTML5 specification, followed by the algorithm defined in the Serializing XHTML Fragments section of the HTML5 specification. If an error or exception occurs at any point during the transformation, the triple containing the XMLLiteral MUST NOT be generated.

Transformation to a namespace well-formed XML fragment is required because an application that consumes XMLLiteral data expects that data to be a namespace well-formed XML fragment. The transformation requirement does not apply to text-only input data, such as literals that contain a @datatype attribute with an empty value (""), or input data that contain only text nodes.

An example transformation demonstrating the preservation of namespace values is provided below.
The → symbol is used to denote that the line is a continuation of the previous line and is included purely for the purposes of readability:

Example 3: Namespace preservation markup

    <p xmlns:ex="http://example.org/vocab#"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
       Two rectangles (the example markup for them are stored in a triple):
       <svg xmlns="http://www.w3.org/2000/svg"
            property="ex:markup" datatype="rdf:XMLLiteral">
    →<rect width="300" height="100" style="fill:rgb(0,0,255);stroke-width:1; stroke:rgb(0,0,0)"/>
    →<rect width="50" height="50" style="fill:rgb(255,0,0);stroke-width:2;stroke:rgb(0,0,0)"/></svg>
    </p>

The markup above SHOULD produce the following triple, which preserves the xmlns declaration in the markup by injecting the @xmlns attribute in the rect elements:

Example 4: Namespace preservation triple

    <>
       <http://example.org/vocab#markup>
          """<rect xmlns="http://www.w3.org/2000/svg" width="300"
    →height="100" style="fill:rgb(0,0,255);stroke-width:1; stroke:rgb(0,0,0)"/>
    →<rect xmlns="http://www.w3.org/2000/svg" width="50"
    →height="50" style="fill:rgb(255,0,0);stroke-width:2;
    →stroke:rgb(0,0,0)"/>"""^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral> .

Since the ex and rdf namespaces are not used in either rect element, they are not preserved in the XMLLiteral.
Similarly, compound document elements that reside in different namespaces must have their namespace declarations preserved:

Example 5: Namespace preservation for compound document elements

    <p xmlns:ex="http://example.org/vocab#"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns:fb="http://www.facebook.com/2008/fbml" >
       This is how you markup a user in FBML:
       <span property="ex:markup" datatype="rdf:XMLLiteral">
    →<span><fb:user uid="12345">The User</fb:user></span>
    →</span>
    </p>

The markup above SHOULD produce the following triple, which preserves the fb namespace in the corresponding triple:

Example 6: Namespace element preservation triple

    <>
       <http://example.org/vocab#markup>
          """<span xmlns:fb="http://www.facebook.com/2008/fbml" >
    →<fb:user uid="12345"></fb:user>
    →</span>"""^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral> .
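The namespace-preservation rules of Section 3.4 can be sketched with Python's standard library. This is a rough illustration under stated assumptions, not a conforming implementation: it does not run the HTML5 "Coercing an HTML DOM into an infoset" or "Serializing XHTML Fragments" algorithms, the function names and the `mappings` dictionary (prefix to IRI, with "" for the default namespace) are hypothetical, and the fragment is assumed to already use lower-case prefixes. It follows the normative wording and injects every active namespace, whereas Examples 4 and 6 show only the namespaces actually used.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError
from xml.sax.saxutils import quoteattr


def preserve_namespaces(fragment, mappings):
    """Copy active namespace declarations onto each top-level element.

    Raises ExpatError when the fragment, placed inside a single root
    element, is not well-formed XML.
    """
    # PREFIX values are transformed to lower case before being preserved.
    mappings = {prefix.lower(): iri for prefix, iri in mappings.items()}

    # Declare every active namespace on a synthetic root so that prefixed
    # names inside the fragment can be parsed.
    decls = "".join(
        f" {'xmlns:' + prefix if prefix else 'xmlns'}={quoteattr(iri)}"
        for prefix, iri in mappings.items()
    )
    doc = minidom.parseString(f"<root{decls}>{fragment}</root>")

    parts = []
    for node in doc.documentElement.childNodes:
        if node.nodeType == node.ELEMENT_NODE:
            for prefix, iri in mappings.items():
                name = f"xmlns:{prefix}" if prefix else "xmlns"
                # Never overwrite a declaration the element already has.
                if not node.hasAttribute(name):
                    node.setAttribute(name, iri)
        parts.append(node.toxml())
    return "".join(parts)


def xml_literal_or_none(fragment, mappings):
    """Return the literal text, or None when no triple may be generated."""
    try:
        return preserve_namespaces(fragment, mappings)
    except ExpatError:
        return None
```

Returning `None` from `xml_literal_or_none` models the rule that the triple MUST NOT be generated when the transformation fails; a real processor would skip the triple at that point rather than emit a partial literal.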