I recently had to take XML in a predefined format and create RDF representing the semantics of the data. I began with XSLT, but handling the edge cases caused by inconsistencies in the input XML gradually made the XSLT verbose and incomprehensible, a mix of syntax handling and business logic. Errors were hard to diagnose, and the conversion did not recover well from failures. I decided to write a library to help with this problem, called Tripliser…

>> Homepage | >> GitHub

Tripliser is a Java library and command-line tool for creating triple graphs, and RDF serialisations, from XML source data. It is particularly suitable for data exhibiting any of the following characteristics:

Messy – missing data, badly formatted data, changeable structure

Bulky – large volumes of data

Volatile – ongoing changes to data and structure, e.g. feeds

Other non-RDF source data, such as CSV and SQL databases, may be supported in future.

It is designed as an alternative to XSLT conversion, providing the following advantages:

Easy-to-read mapping format – concisely describing each mapping

Robust – error or partial failure tolerant

Detailed reporting – comprehensive feedback on the successes and failures of the conversion process

Extensible – custom functions, flexible API

Efficient – facilities for processing data in large volumes with minimal memory usage

XML files are read in, and XPath is used to extract values which can be inserted into a triple graph. The graph can be serialised in various RDF formats and is accompanied by meta-data and a property-by-property report to indicate how successful or unsuccessful the mapping process was.

Here’s what a typical mapping format looks like…

<?xml version="1.0" encoding="UTF-8"?>
<rdf-mapping xmlns="http://www.daverog.org/rdf-mapping" strict="false">
  <constants>
    <constant name="objectsUri" value="http://objects.theuniverse.org/" />
  </constants>
  <namespaces>
    <namespace prefix="xsd" url="http://www.w3.org/2001/XMLSchema#" />
    <namespace prefix="rdfs" url="http://www.w3.org/2000/01/rdf-schema#" />
    <namespace prefix="dc" url="http://purl.org/dc/elements/1.1/" />
    <namespace prefix="universe" url="http://theuniverse.org/" />
  </namespaces>
  <graph query="//universe-objects" name="universe-objects" comment="A graph for objects in the universe">
    <resource query="stars/star">
      <about prepend="${objectsUri}" append="#star" query="@id" />
      <properties>
        <property name="rdf:type" resource="true" value="universe:Star"/>
        <property name="dc:title" query="name" />
        <property name="universe:id" query="@id" />
        <property name="universe:spectralClass" query="spectralClass" />
      </properties>
    </resource>
    <resource query="planets/planet">
      <about prepend="${objectsUri}" append="#planet" query="@id" />
      <properties>
        <property name="rdf:type" resource="true" value="universe:Planet"/>
        <property name="dc:title" query="name" />
        <property name="universe:id" query="@id" />
        <property name="universe:adjective" query="adjective" />
        <property name="universe:numberOfSatellites" dataType="xsd:int" query="satellites" />
      </properties>
    </resource>
  </graph>
</rdf-mapping>
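To make the queries in the mapping above concrete, here is a sketch of the kind of source XML it appears to expect. The element and attribute names are inferred from the XPath queries in the mapping (universe-objects, stars/star, planets/planet, @id, name, spectralClass, adjective, satellites). The document itself, its ids and its values are hypothetical sample data and are not taken from the Tripliser distribution.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input: structure inferred from the mapping's queries -->
<universe-objects>
  <stars>
    <!-- Matched by the resource query "stars/star" -->
    <star id="sun">
      <name>Sun</name>
      <spectralClass>G2V</spectralClass>
    </star>
  </stars>
  <planets>
    <!-- Matched by the resource query "planets/planet" -->
    <planet id="earth">
      <name>Earth</name>
      <adjective>Terrestrial</adjective>
      <satellites>1</satellites>
    </planet>
    <planet id="mars">
      <name>Mars</name>
      <adjective>Martian</adjective>
      <satellites>2</satellites>
    </planet>
  </planets>
</universe-objects>
```

Assuming the about element simply joins the prepend value, the query result and the append value, the star in this sample would be identified by http://objects.theuniverse.org/sun#star, and each planet would get a universe:numberOfSatellites value typed as xsd:int.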

Go to the Homepage or to GitHub to find out more.