DBpedia Schema Queries

In this notebook, I begin analyzing the schema of the DBpedia Ontology. This is a local notebook: I load data from the filesystem into an in-memory graph, so it can run as part of the unit tests for Gastrodon. This is feasible because the schema is much smaller than DBpedia as a whole.

The following diagram illustrates the relationships between the DBpedia Ontology, its parts, DBpedia, and the world it describes:

The numbers above are very rough (off by as much as 30 orders of magnitude), but some key points are:

The DBpedia Ontology has its own ontology, which is a subset of RDFS, OWL, Dublin Core, PROV, and similar vocabularies.

The DBpedia Ontology is thousands of times smaller than DBpedia itself.

DBpedia does not directly describe the external universe, but instead describes Wikipedia, which itself describes the universe.

It's important to keep these layers straight, because in this notebook we are looking at a description of the vocabulary used in DBpedia, and that description is itself written in the RDFS, OWL, and related vocabularies. RDF is unusual among data representations in that schemas in RDF are themselves written in RDF, and can be joined together with the data they describe. In this case, however, I've separated out a small amount of schema data that I intend to use to control operations against a larger database, much like the program of a numerically controlled machine tool or the punched cards that control a Jacquard loom.
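To make the point about schemas and data sharing one representation concrete, here is a minimal sketch in plain Python. It is a toy model, not how Gastrodon or any RDF library stores a graph: triples are ordinary tuples, and every `ex:` name is invented for illustration rather than taken from the DBpedia Ontology. The comprehension plays the role of the SPARQL pattern `?s rdf:type ?c . ?c rdfs:subClassOf ?super`.

```python
# Toy model: an RDF graph as a set of (subject, predicate, object) tuples.
# Assumption: all ex: names below are made up; they are not DBpedia terms.
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"

# Schema triples describe the vocabulary.
schema = {
    ("ex:Cat", RDF_TYPE, "owl:Class"),
    ("ex:Animal", RDF_TYPE, "owl:Class"),
    ("ex:Cat", RDFS_SUBCLASS_OF, "ex:Animal"),
}

# Data triples use that vocabulary.
data = {
    ("ex:Tom", RDF_TYPE, "ex:Cat"),
    ("ex:Rex", RDF_TYPE, "ex:Dog"),  # ex:Dog has no schema entry
}

# Schema and data have the same shape, so merging them is a set union.
graph = schema | data


def instances_with_superclass(triples):
    """Join (?s rdf:type ?c) with (?c rdfs:subClassOf ?super)."""
    return sorted(
        (s, c, sup)
        for (s, p1, c) in triples
        if p1 == RDF_TYPE
        for (c2, p2, sup) in triples
        if p2 == RDFS_SUBCLASS_OF and c2 == c
    )


rows = instances_with_superclass(graph)
for row in rows:
    print(row)
```

Keeping `schema` as its own small set, as this notebook does with the DBpedia Ontology, means it can be inspected on its own and only combined with the larger data when a join like this one is needed.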

This notebook is part of the test suite for the Gastrodon framework, and a number of bugs were squashed and API improvements made in the process of creating it. It will be of use to anyone who wants to better understand RDF, SPARQL, DBpedia, Pandas, and how to put it all together with Gastrodon.

As always, I import names from the Python packages I use: