If you haven’t been living under a rock for the last few weeks, you might have noticed the new release of the LOD cloud diagram, now with some 200 datasets and some 25 billion triples. Very impressive, one might think, but let’s not forget that publishing Linked Data is not an end in itself.

So I thought about how I could do something useful with the data, and I ended up with a demo app that utilizes LOD data in an enterprise setup: the DERI guide. Essentially, what it does is tell you where in the DERI building you can find an expert on a certain topic. So, if you have some 5min to spare, have a look at the screen-cast:

Behind the curtain

Now, let’s take a deeper look at how the app works. The objective was clear: create a Linked Data app using LOD data and a bunch of shell scripts. Here is what the DERI guide conceptually looks like:

I’m using three datasets in this demo:

All that is needed, then, is an RDF store to manage the data locally (I chose 4store; easy to set up and use, at least on Mac OS X) and a bunch of shell scripts to query the data and format the results. After loading it from the datasets, the data in the local RDF store typically looks like this:

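Getting the datasets into the store takes only a few 4store commands. A sketch of the setup I mean here — the knowledge-base name and file names are made up for illustration; only the port, 8021, matches the SPARQL endpoint that dg-find.sh queries:

```shell
# Illustrative 4store setup; kb name and dump file names are assumptions,
# the port matches the endpoint used by dg-find.sh.
4s-backend-setup deri_guide                  # create an empty knowledge base
4s-backend deri_guide                        # start the storage backend
4s-import deri_guide foaf-profiles.rdf org-structure.rdf   # load the dumps
4s-httpd -p 8021 deri_guide                  # SPARQL endpoint on port 8021
```

After this, http://localhost:8021/sparql/ accepts queries via HTTP POST, which is all the scripts below rely on.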
The main script (dg-find.sh) takes a term (such as “Linked Data”) as input, queries the store for units that are tagged with the corresponding topic (http://dbpedia.org/resource/Linked_Data), then pulls in information from the FOAF profiles of the matching members, and eventually runs it through an XSLT stylesheet to produce an HTML page that opens in the default browser:

#!/bin/bash
clear
echo "=== DERI guide v0.1"
echo "Trying to find people for topic: "$1
topicURI=$( echo "http://dbpedia.org/resource/"$1 | sed 's/ /_/g' )
curl -s --data-urlencode query="SELECT DISTINCT ?person WHERE { ?idperson <http://www.w3.org/2002/07/owl#sameAs> ?person ; <http://www.w3.org/ns/org#hasMembership> ?membership . ?membership <http://www.w3.org/ns/org#organization> ?org . ?org <http://www.w3.org/ns/org#purpose> <$topicURI> . }" http://localhost:8021/sparql/ > tmp/found-people.xml
webids=$( xsltproc get-person-webid.xsl tmp/found-people.xml )
echo "<h2>Results for: $1</h2>" >> result.html
echo "<div style='padding: 20px; width: 500px'>" >> result.html
for webid in $webids
do
  foaffile=$( util/getfoaflink.sh $webid )
  echo "Checking <"$foaffile"> and found WebID <"$webid">"
  ./dg-initdata-person.sh $foaffile $webid
  ./dg-render-person.sh $webid $topicURI
done
echo "</div><div style='border-top: 1px solid #3e3e3e; padding: 5px'>Linked Data Research Centre, (c) 2010</div>" >> result.html
rm tmp/found-people.xml
util/room2roomsec.sh result.html result-final.html
rm result.html
open result-final.html
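To see what actually goes over the wire, here is the query construction from the script in isolation; the topic is an example of my own choosing, and note the /g flag on sed, which makes sure every space in a multi-word topic gets replaced, not just the first one:

```shell
# Build the DBpedia topic URI the way dg-find.sh does; the /g flag makes
# sed replace every space, not only the first.
topic="Semantic Web Services"
topicURI="http://dbpedia.org/resource/$(echo "$topic" | sed 's/ /_/g')"
echo "$topicURI"   # http://dbpedia.org/resource/Semantic_Web_Services

# The SPARQL query that gets POSTed to the local endpoint then reads:
query="SELECT DISTINCT ?person WHERE {
  ?idperson <http://www.w3.org/2002/07/owl#sameAs> ?person ;
            <http://www.w3.org/ns/org#hasMembership> ?membership .
  ?membership <http://www.w3.org/ns/org#organization> ?org .
  ?org <http://www.w3.org/ns/org#purpose> <$topicURI> . }"
echo "$query"
```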

The result of the example query ./dg-find.sh "Linked Data" is an HTML page such as this:
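The xsltproc step above boils down to extracting the bound ?person URIs (the WebIDs) from the SPARQL XML results. The same extraction can be sketched with plain grep and sed — the result file below is a hand-made example in the SPARQL Query Results XML Format with made-up WebIDs, not actual DERI data, and the grep/sed pipeline is a crude stand-in for get-person-webid.xsl:

```shell
# Fake SPARQL XML results in the shape 4store returns; the WebIDs are
# made-up examples, not real profiles.
mkdir -p tmp
cat > tmp/found-people.xml <<'EOF'
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="person"/></head>
  <results>
    <result><binding name="person"><uri>http://example.org/alice#me</uri></binding></result>
    <result><binding name="person"><uri>http://example.org/bob#me</uri></binding></result>
  </results>
</sparql>
EOF

# Pull out the bound URIs -- one WebID per line
webids=$(grep -o '<uri>[^<]*</uri>' tmp/found-people.xml | sed 's/<\/*uri>//g')
echo "$webids"
```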

Lessons learned

I was amazed by how easy and quick it was to use the data from different sources to build a shell-based app. Most of the time I spent writing the scripts (hey, I’m not a shell guru, and reading the sed manual is not exactly fun) and tuning the XSLT to output some nice HTML. The actual data integration part, that is, loading the data into the store and querying it, was straightforward (apart from overcoming some inconsistencies in the data).

Of the approximately eight hours I worked on the demo, some 70% went into the former (shell scripts and XSLT), some 20% into the latter (4store handling via curl and creating the SPARQL queries), and the remaining 10% were needed to create the shiny figures and the screen-cast above. To conclude: the only thing you really need to create a useful LOD app is a good idea of which sources to use; the rest is pretty straightforward and, in fact, fun 😉