A guest blog entry from Jeff Young, who does most of the RDF work for VIAF:

VIAF RDF has evolved over time and is about to change again to streamline the model. Using Barbara Tillett as an example, these before and after diagrams might give a sense of the changes in the next generation.

Should VIAF Treat Headings as Identifiable 1st Class Objects?

In the old model, authority headings were treated as 1st class entities (viaf:Heading). The HTTP URI for each heading was coined by URL-encoding the heading and appending it as a hash fragment to the viaf:NameAuthorityCluster URI. For example:

<http://viaf.org/viaf/77390479> a viaf:NameAuthorityCluster

<http://viaf.org/viaf/77390479/#Tillett,+Barbara+B.> a viaf:Heading

<http://viaf.org/viaf/77390479/#Tillett,+Barbara+B.+1946-> a viaf:Heading

This treatment aligns with both FRSAD Nomen and skosxl:Label classes, which made this solution too tempting to pass up at the time. It also aligned with naïve notions of search engine optimization where semantically transparent information is URL-encoded in the resource identifier. In practice, though, VIAF doesn’t do anything useful with these entities, so they are being deleted for now in favor of simpler skos:prefLabel and skos:altLabel forms.
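As a rough illustration of how the old heading URIs were coined, the following Python sketch URL-encodes a heading and appends it as a hash fragment to the cluster URI. The function name heading_uri is hypothetical, and the choice to leave the comma unencoded is an assumption inferred from the examples above rather than documented VIAF behavior.

```python
from urllib.parse import quote_plus

CLUSTER_URI = "http://viaf.org/viaf/77390479"


def heading_uri(cluster_uri: str, heading: str) -> str:
    """Build an old-style hash URI for a heading (hypothetical helper)."""
    # Assumption: commas stay literal and spaces become "+",
    # which is what the example URIs in this post look like.
    fragment = quote_plus(heading, safe=",")
    return f"{cluster_uri}/#{fragment}"


for heading in ("Tillett, Barbara B.", "Tillett, Barbara B. 1946-"):
    print(heading_uri(CLUSTER_URI, heading))
```

In the new model this step goes away: the same strings would simply be attached to the VIAF resource as skos:prefLabel or skos:altLabel literals.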

The Primary Entity

VIAF’s clustering of authority records tries to see past regional and variant naming differences to opaquely identify what UNIMARC Authority Format refers to as a “Primary Entity – The entity, named in the 2-- block of the record, for which the record was created. Data in the 1-- block generally pertain to characteristics of the primary entity.”

In this spirit, the most important change in the new RDF model involves the treatment of opaque URIs of the form http://viaf.org/viaf/77390479. (This resource is marked by a yellow box in both diagrams.) In the old model, this resource was specified as a viaf:NameAuthorityCluster and was characterized by the accumulation of labels. It also acted as a skos:exactMatch hub to authority records (skos:Concepts) from various sources (skos:ConceptSchemes). In effect, this entity acted as a vocabulary hub. It wasn’t very effective in that role, because VIAF doesn’t choose or construct “prefLabel” winners from the set of contributed records, although this could change in the future.

Regardless, VIAF URIs of the form http://viaf.org/viaf/77390479 are making a subtle but important shift from identifying clustered name(s) to identifying the primary entity itself. As a consequence, the relationship linking VIAF to contributed authorities is also being changed from skos:exactMatch to foaf:focus.
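To make the shift concrete, here is a minimal Python sketch that prints the linking triples in N-Triples form for both models. The authority record URIs are hypothetical placeholders, not real contributed records. The direction of the new link (authority concept to VIAF entity) is an assumption based on FOAF defining foaf:focus with a skos:Concept as its subject.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
FOAF = "http://xmlns.com/foaf/0.1/"

VIAF_URI = "http://viaf.org/viaf/77390479"

# Hypothetical placeholder URIs standing in for contributed authority records.
AUTHORITY_URIS = [
    "http://example.org/authority/source-a/123",
    "http://example.org/authority/source-b/456",
]


def triple(subject: str, predicate: str, obj: str) -> str:
    """Format one N-Triples statement whose three terms are all IRIs."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def old_model_links(viaf_uri: str, authorities: list[str]) -> list[str]:
    # Old model: the VIAF cluster is a hub matched to each authority concept.
    return [triple(viaf_uri, SKOS + "exactMatch", a) for a in authorities]


def new_model_links(viaf_uri: str, authorities: list[str]) -> list[str]:
    # New model (assumed direction): each authority concept has the
    # VIAF primary entity as its focus.
    return [triple(a, FOAF + "focus", viaf_uri) for a in authorities]


print("\n".join(old_model_links(VIAF_URI, AUTHORITY_URIS)))
print("\n".join(new_model_links(VIAF_URI, AUTHORITY_URIS)))
```

The point of the change is visible in the triples: skos:exactMatch relates two concepts, whereas foaf:focus relates a concept to the real-world thing it is about.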

Competing Models of Reality?

The old model also had the notion of “primary entity”, but its experimental description was split between two URI identifiers based on overlapping ontologies: FOAF and RDA. For example:

<http://viaf.org/viaf/77390479/#foaf:Person> a foaf:Person

<http://viaf.org/viaf/77390479/#rdaEnt:Person> a rdaEnt:Person

This separation was motivated by the desire to support the development and use of both Semantic Web (e.g. FOAF) and library (e.g. RDA) ontological models in the absence of any formal mapping between the two. For example, would FOAF and RDA agree that Nicolas Bourbaki is a “Person”? For reference, DBpedia doesn’t currently think so, but is there enough systematic information in authority data to make these kinds of subtle distinctions? VIAF’s mining and update mechanisms can be improved, but it will require time and the development of more subtle models.

As described above, the new VIAF model helps avoid dependence on specific ontologies by using the opaque VIAF URI to identify the primary entity. (For now, at least, the old hash URI forms are still supported for OWL inferencing purposes via owl:sameAs assertions.) Model bias is further avoided in the RDF/XML by using rdf:Description rather than striped RDF to specify the rdf:type(s). These changes should make links to VIAF resources more reliable and changes less disruptive as the model continues to evolve.
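The backward-compatibility bridge mentioned above can be sketched in a few lines of Python. This only illustrates the idea of tying the old hash URIs to the opaque URI with owl:sameAs. The direction of the assertions and the function name are assumptions, and the actual VIAF output may be serialized differently.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

PRIMARY_ENTITY = "http://viaf.org/viaf/77390479"

# The two old-model hash URIs shown earlier in this post.
OLD_HASH_URIS = [
    PRIMARY_ENTITY + "/#foaf:Person",
    PRIMARY_ENTITY + "/#rdaEnt:Person",
]


def same_as_triples(old_uris: list[str], primary: str) -> list[str]:
    """Assert that each old hash URI identifies the same thing as the opaque URI."""
    # owl:sameAs is symmetric, so the direction chosen here is arbitrary.
    return [f"<{old}> <{OWL_SAME_AS}> <{primary}> ." for old in old_uris]


for line in same_as_triples(OLD_HASH_URIS, PRIMARY_ENTITY):
    print(line)
```

An OWL-aware consumer that still holds links to the hash URIs could use statements like these to carry its data over to the opaque URI.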

Jeffrey A. Young

Software Architect

OCLC Research