I have already written about the concept of the RDFa 1.1 core default profile in a blog post published a few months ago, when the second Last Call draft of RDFa 1.1 came out. This default profile is automatically included in any RDFa 1.1 content by an RDFa 1.1 processor (conceptually, that is; a processor would probably cache the content of this profile). The profile itself defines prefixes for a number of RDF vocabularies. This means that, for example, the following HTML+RDFa file:

<html>
  <body>
    <p about="xsd:maxExclusive" rel="rdf:type" resource="owl:DatatypeProperty">
      An OWL Axiom: "xsd:maxExclusive" is a Datatype Property in OWL.
    </p>
  </body>
</html>

(note the missing prefix declarations!) will produce the RDF triple that one might expect, i.e.,

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

xsd:maxExclusive rdf:type owl:DatatypeProperty .
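To make the mechanism concrete, here is a minimal Python sketch of how a processor might resolve prefixed names against a default profile. It is not taken from any actual RDFa processor: the `DEFAULT_PREFIXES` table and the `expand_curie` helper are hypothetical names, the table lists only the three prefixes used in the example above, and the handling of unknown prefixes is simplified.

```python
# Hypothetical excerpt of a default profile: prefix -> namespace URI.
# Only the three prefixes used in the example above are listed.
DEFAULT_PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand_curie(curie, local_prefixes=None):
    """Expand 'prefix:reference' to a full URI.

    Assumption: prefixes declared in the document (local_prefixes) take
    precedence over the defaults. As a simplification, a value with an
    unknown prefix is returned unchanged.
    """
    prefixes = dict(DEFAULT_PREFIXES)
    prefixes.update(local_prefixes or {})
    prefix, sep, reference = curie.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + reference
    return curie


# The subject, predicate and object of the example, with no local declarations.
triple = tuple(
    expand_curie(term)
    for term in ("xsd:maxExclusive", "rdf:type", "owl:DatatypeProperty")
)
print(triple)
```

Calling `expand_curie("ex:thing", {"ex": "http://example.org/"})` shows how a locally declared prefix would be combined with the defaults in this sketch.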

The major questions are, of course, (1) which vocabularies are included in such a default profile and (2) how often that list would change.

The answer to the second question is easier: a profile should not change often. Once every 6 months, maybe, or even less frequently, so that caching by the processors would be effective (some processors may even choose to copy, verbatim, the prefix definitions into their code). There is also a rule in the conformance section of the RDFa 1.1 document stating that the content of a profile can only grow; once a vocabulary is included, it should stay there (otherwise the meaning of existing content would change, and that is not acceptable).
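The "can only grow" rule lends itself to a mechanical check. The following Python sketch is only an illustration of the idea, not part of the RDFa 1.1 document: the function names are hypothetical, and a profile is modelled simply as a dictionary from prefix to namespace URI.

```python
def removed_or_changed(old_profile, new_profile):
    """Return the prefixes that a new profile version drops or redefines."""
    return sorted(
        prefix
        for prefix, uri in old_profile.items()
        if new_profile.get(prefix) != uri
    )


def may_replace(old_profile, new_profile):
    """A new version is acceptable only if it keeps every old mapping."""
    return not removed_or_changed(old_profile, new_profile)


old = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}
# Adding a prefix is fine; dropping one is not.
grown = dict(old, owl="http://www.w3.org/2002/07/owl#")
shrunk = {"rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#"}

print(may_replace(old, grown))
print(may_replace(old, shrunk), removed_or_changed(old, shrunk))
```

Redefining an existing prefix to point at a different namespace is treated like a removal here, since it would equally change the triples generated from existing content.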

The first question, on content, is more complicated. Here, too, there is a set of vocabularies whose inclusion is fairly straightforward: all vocabularies published as part of a W3C Recommendation or Note should be part of the list. After all, these vocabularies have undergone a rigorous and general review by the community, as secured by the W3C process. (I.e., the example above should be fine.) However, that cannot be all: the real advantage of having a default profile is to include at least some of the widely used vocabularies. (After all, the really interesting examples are not like the one above but, rather, the general RDFa snippets like the ones I described in another, separate blog.) So here is the real question: what are the, say, 8-10 vocabularies that are widely used on the Semantic Web, general in their topic, and used in RDFa applications?

As already described in my earlier blog, there are several approaches that one could take. One is to go through some sort of a registration mechanism like, for example, the W3C XPointer registry. However, that would not necessarily reflect the widespread usage of the vocabularies; after all, vocabulary owners may not want to go through the extra step of the registration process. After some discussions, the Working Group decided to try a different route, namely to use information that search engines can provide on vocabularies. And I am happy to report that this has proven to be possible thanks to Péter Mika, from Yahoo!, and Giovanni Tummarello and his friends, from the Sindice team. Both search engines performed a crawl over several billion triples, collected the vocabulary URIs, and sorted them; finally, the top results were merged into a set of default profile prefixes. We deliberately chose to be very restrictive in the numbers, yielding 11 default prefixes beyond the W3C ones. Indeed, one has to take into account that new vocabularies will come up in the future and, if they appear at the top of the lists for new crawls in, say, a year from now, they will be added to the list. In other words, the list may grow; it is better to stay small at the beginning. Of course, there are a number of technical details on how this list has been generated, how the crawl results were processed, etc.; these are all documented on the W3C site in case you are interested in the details.
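The actual procedure is the one documented on the W3C site; purely as an illustration of the general idea (rank vocabularies per crawl, then merge the top of each list), here is a small Python sketch. Everything in it is an assumption: the function names are hypothetical, the merge is a simple union of each crawl's top entries, and the vocabulary URIs and usage counts are invented placeholders rather than real crawl data.

```python
def top_vocabularies(counts, n):
    """Return the n most used vocabulary URIs, most frequent first."""
    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [uri for uri, _ in ranked[:n]]


def merge_top(crawls, n):
    """Union of each crawl's top-n list, keeping first-seen order."""
    merged = []
    for counts in crawls:
        for uri in top_vocabularies(counts, n):
            if uri not in merged:
                merged.append(uri)
    return merged


# Invented usage counts for two hypothetical crawls.
crawl_a = {
    "http://example.org/a#": 900,
    "http://example.org/b#": 500,
    "http://example.org/c#": 20,
}
crawl_b = {
    "http://example.org/b#": 700,
    "http://example.org/d#": 650,
    "http://example.org/a#": 10,
}

print(merge_top([crawl_a, crawl_b], n=2))
```

Keeping `n` small mirrors the deliberately restrictive choice described above: a vocabulary only makes the merged list if it is near the top of at least one crawl.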

So, if the list becomes final (we still anticipate comments and feedback before freezing it), it is possible to write something like:

<div typeof="v:Review">
  <span property="v:itemreviewed">L’Amourita Pizza</span>
  Reviewed by <span property="v:reviewer">Ulysses Grant</span> on
  <span property="v:dtreviewed" content="2009-01-06">Jan 6</span>.
  <span property="v:summary">Delicious, tasty pizza on Eastlake!</span>
  <span property="v:description">L'Amourita serves up traditional wood-fired Neapolitan-style pizza, brought to your table promptly and without fuss. An ideal neighborhood pizza joint.</span>
  Rating: <span property="v:rating">4.5</span>.
  Address: <span property="vcard:street-address">111 Lake Drive</span>,
  <span property="vcard:locality">WonderCity</span>,
  <span property="vcard:postal-code">5555</span>,
  <span property="vcard:country-name">Australia</span>.
</div>



without the need to declare prefixes for the Google snippet or the vCard vocabularies; they are just there!

Of course, further thoughts, comments, etc., are very welcome!