The Disco - Hyperdata Browser is a simple browser for navigating the Semantic Web as an unbounded set of data sources. The browser renders all information that it can find on the Semantic Web about a specific resource as an HTML page. This resource description contains hyperlinks that allow you to navigate between resources. While you move from resource to resource, the browser dynamically retrieves information by dereferencing HTTP URIs and by following rdfs:seeAlso links.

News

04.03.2007: SemanticWebCentral provides another Linked Data browser called Objectviewer.

03.10.2007: OpenLink has published a new Data Web Browser which, like Disco, also enables you to browse Linked Data on the Web.

01.16.2007: Ivan Herman has written a Disco Bookmarklet. You can add it to the bookmark bar of your Firefox browser. Whenever you visit a webpage that contains a link header element that refers to application/rdf+xml and select Ivan's bookmarklet in your bookmark bar, your browser will display the content of the current page with Disco. This makes Disco work a bit like Piggy Bank ;-)

01.15.2007: Initial release of the Disco browser.

1. Features

The browser is a server-side application that can be used without installing anything on your machine. You can start the browser by clicking on this link.

The screenshot below shows the browser user interface:

You start browsing the Semantic Web by entering a URI into the navigation box. After you press the "Go!" button, the browser retrieves information about this resource from the Semantic Web. Retrieved information is displayed as a property-value table. The third column of the table lists all sources that contain a specific piece of information. The abbreviations G1, G2, ... refer to the list of all sources that is shown below the table. If a piece of information occurs in multiple source graphs, the third table column contains more than one entry.
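
The way the table attributes each piece of information to its sources can be illustrated with a small sketch. It is not taken from the Disco source code: the function name describe, the representation of graphs as sets of (subject, property, value) tuples, and the example URIs are all assumptions made for illustration.

```python
from collections import defaultdict


def describe(resource, graphs):
    """Build table rows (property, value, source labels) for one resource.

    graphs is a hypothetical session cache: a dict mapping each graph URI
    to a set of (subject, property, value) triples.
    """
    labels = {}                    # graph URI -> "G1", "G2", ...
    sources = defaultdict(list)    # (property, value) -> labels of graphs stating it
    for graph_uri in sorted(graphs):
        for s, p, o in sorted(graphs[graph_uri]):
            if s != resource:
                continue
            # Assign the next free abbreviation the first time a graph contributes.
            label = labels.setdefault(graph_uri, f"G{len(labels) + 1}")
            sources[(p, o)].append(label)
    rows = [(p, o, labs) for (p, o), labs in sorted(sources.items())]
    return rows, labels


example_graphs = {
    "http://example.org/a.rdf": {
        ("ex:wendy", "foaf:family_name", "Hall"),
        ("ex:wendy", "foaf:name", "Wendy Hall"),
    },
    "http://example.org/b.rdf": {
        ("ex:wendy", "foaf:family_name", "Hall"),
        ("ex:other", "foaf:name", "Someone Else"),
    },
}
rows, labels = describe("ex:wendy", example_graphs)
for prop, value, found_in in rows:
    print(prop, value, ", ".join(found_in))
```

In this example the family name is stated by both graphs, so its entry lists two sources, while the full name lists only one.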

While you move from resource to resource by clicking on hyperlinks in the resource descriptions, the browser stores all retrieved RDF graphs in a session cache. Clicking on the "Display all RDF graphs" link opens a new browser window with a list of all retrieved RDF graphs as well as a list of all URIs that could not be dereferenced successfully.

2. Example Semantic Web Ride

The browser allows you to navigate an unbounded set of data sources. These data sources can be static RDF files somewhere on the Web as well as RDF graphs that are generated on request from relational databases or Web 2.0 APIs. In the following, we describe an example Semantic Web ride starting with information about Tim Berners-Lee.

Click on this link or enter Tim's URI http://www.w3.org/People/Berners-Lee/card#i into the navigation box. The browser dereferences the URI and other related URIs that are found in Tim's FOAF profile and displays information about Tim from all retrieved RDF graphs.

Click on "Wendy Hall" in the list of people that Tim knows. This brings you to a Semantic Web server running at the University of Southampton. The server provides information about people working at the university and their projects. Note that the fact that Wendy has the family name "Hall" appears in multiple RDF graphs.

Move back to the page about Tim and click on "Tim Berners-Lee" in the row "sameAs". This brings you to a D2R Server at the Freie Universität Berlin. The server provides information about Tim's publications from the DBLP bibliographic database. Click on the links to further explore Tim's publications.

Move back to Tim's page and click on "http://www4.wiwiss.fu-berlin.de/booksMeshup/books/006251587X" in the row "is Autor of". This brings you to information about the book "Weaving the Web", which is generated by the RDF Book Mashup by querying the Google Base and Amazon Web 2.0 APIs. By following the links in the book description, you can navigate to reviews of the book and to eshops offering the book. Clicking on the small arrow in the row "soldAt" within an offer brings you directly to the HTML interface of the eshop.

Click on the "Display all RDF graphs" link in the section Session Cache. If you have followed the ride without taking any detours, your session cache should contain around 470 successfully retrieved graphs, along with 40 URIs that failed to be dereferenced.

3. How does the Browser work?

The Semantic Web is a global information space consisting of linked data (sometimes also called hyperdata). To be part of the Semantic Web, data should fulfill the following requirements:

All entities of interest, such as information resources, real-world objects, and vocabulary terms, should be identified by URI references.

URI references should be dereferenceable, meaning that an application can look up a URI over the HTTP protocol and retrieve RDF data about the identified resource.

Data should be provided using the RDF/XML syntax. If data is embedded inside other Web documents, for instance using Microformats inside an HTML page, then these documents should include hints on how to automatically extract RDF data from them, for instance using GRDDL.

Data should be interlinked with other data. Thus, resource descriptions should contain links to related information in the form of dereferenceable URIs within RDF statements and rdfs:seeAlso links.
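
To make the notion of a dereferenceable URI more concrete, the following sketch builds (but does not send) an HTTP request for RDF data about a resource. It is only an illustration: the function name is hypothetical, and asking for RDF/XML through an Accept header is an assumption about how a client might perform the lookup, not a description of what Disco does internally.

```python
from urllib.parse import urldefrag
from urllib.request import Request


def build_dereference_request(uri):
    """Build (but do not send) an HTTP GET request asking for RDF/XML."""
    # A fragment such as "#i" is never sent over HTTP, so the document URL
    # is the URI with the fragment removed.
    document_url, _fragment = urldefrag(uri)
    # Assumption: the client signals its preferred syntax via the Accept header.
    return Request(document_url, headers={"Accept": "application/rdf+xml"})


request = build_dereference_request("http://www.w3.org/People/Berners-Lee/card#i")
print(request.get_method(), request.full_url)
```

The request could then be sent with urllib.request.urlopen(request, timeout=2), and the response body parsed with an RDF/XML parser of your choice.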

The Disco browser is implemented as a thin presentation layer on top of the Semantic Web Client Library. The Semantic Web Client Library regards all data that is published according to the rules above as a single, global set of Named Graphs. Whenever the browser asks the library for information about a specific resource, the library dynamically retrieves information from the Semantic Web using the following directed-browsing algorithm:

Dereference the URI x of the resource. Add the retrieved graph to the session cache.

Look up any URI y where the graph set includes the triple { x rdfs:seeAlso y }. Add the retrieved graphs to the session cache.

Match the triple patterns (x any any) and (any any x) against all graphs in the session cache.

For each triple that matches one of the triple patterns, look up all new URIs that appear in the triple. Add the retrieved graphs to the session cache.

Look up any new URI y where the new graphs include the triple { x rdfs:seeAlso y }. Add the retrieved graphs to the session cache.

Match the triple patterns (x any any) and (any any x) against all newly retrieved graphs.
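
The steps above can be sketched in a few lines of Python. This is a simplified illustration and not the code of the Semantic Web Client Library: the function names, the caller-supplied fetch function, the representation of triples as tuples of strings, and the example.org URIs are all assumptions. For brevity the sketch keys the cache by the URI that was looked up and ignores fragment handling, multithreading and timeouts.

```python
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"
FOAF = "http://xmlns.com/foaf/0.1/"


def is_uri(term):
    return isinstance(term, str) and term.startswith(("http://", "https://"))


def directed_browse(x, fetch, cache=None, failed=None):
    """Collect triples about x.

    fetch(uri) is a caller-supplied function that returns a set of
    (subject, predicate, object) triples, or None if the lookup fails.
    """
    cache = {} if cache is None else cache
    failed = set() if failed is None else failed

    def look_up(uris):
        """Dereference every URI not tried before; return the newly cached graphs."""
        new = {}
        for uri in sorted(uris):
            if uri in cache or uri in failed:
                continue
            graph = fetch(uri)
            if graph is None:
                failed.add(uri)
            else:
                cache[uri] = new[uri] = graph
        return new

    def see_also(graphs):
        return {o for g in graphs.values() for s, p, o in g
                if s == x and p == RDFS_SEE_ALSO and is_uri(o)}

    def matches(graphs):
        # Triple patterns (x any any) and (any any x).
        return {t for g in graphs.values() for t in g if t[0] == x or t[2] == x}

    look_up({x})                                   # dereference x
    look_up(see_also(cache))                       # follow rdfs:seeAlso links
    result = matches(cache)                        # match against the cache
    new = look_up({term for t in result for term in t if is_uri(term)})
    new.update(look_up(see_also(new)))             # seeAlso links in new graphs
    result |= matches(new)                         # match against new graphs
    return result, cache, failed


# A tiny in-memory stand-in for the Web (hypothetical data).
TIM, WENDY, MORE = ("http://example.org/tim", "http://example.org/wendy",
                    "http://example.org/more")
web = {
    TIM: {(TIM, FOAF + "knows", WENDY), (TIM, RDFS_SEE_ALSO, MORE)},
    MORE: {(TIM, FOAF + "name", "Tim")},
    WENDY: {(WENDY, FOAF + "knows", TIM)},
}
result, cache, failed = directed_browse(TIM, web.get)
```

In this run the three documents end up in the cache, and the property URIs, which the in-memory stand-in cannot resolve, are recorded as failed lookups.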

In order to generate a human-readable presentation, the browser dereferences all property URIs and searches for rdfs:labels in the resulting RDF graphs. This works for all RDF vocabularies that are published on the Web according to the W3C Best Practice Recipes for Publishing RDF Vocabularies.
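
The label lookup might look roughly like the following sketch. The function name and the tuple representation of triples are assumptions, and the fallback to the last segment of the URI when no rdfs:label is found is an illustrative choice that the text above does not describe.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def display_name(property_uri, vocabulary_graph):
    """Return the rdfs:label of a property, or a fallback derived from its URI."""
    for s, p, o in sorted(vocabulary_graph):
        if s == property_uri and p == RDFS_LABEL:
            return o
    # Fallback (an assumption): use the part of the URI after the last "#" or "/".
    return property_uri.rstrip("/#").replace("#", "/").rsplit("/", 1)[-1]


vocabulary = {("http://xmlns.com/foaf/0.1/knows", RDFS_LABEL, "knows")}
print(display_name("http://xmlns.com/foaf/0.1/knows", vocabulary))
print(display_name("http://www.w3.org/2002/07/owl#sameAs", set()))
```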

The Semantic Web Client Library is multithreaded to allow faster retrieval. The library is configured as follows:

Only graphs containing fewer than 50 000 triples are retrieved.

The retrieval timeout is set to 2 seconds.

The local cache is restricted to at most 1500 graphs.

You can change these settings when you install the Disco browser on your own server.
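
As an illustration of how such limits could be applied, the sketch below collects the three values in one settings object and checks a retrieved graph against them. The class, field and function names are hypothetical and do not correspond to the library's actual configuration options. What the library does when the cache is full is not described here, so the sketch simply rejects further graphs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalSettings:
    """Hypothetical container for the limits listed above."""
    max_triples_per_graph: int = 50_000
    timeout_seconds: float = 2.0
    max_cached_graphs: int = 1500


def accept_graph(graph, cache, settings=RetrievalSettings()):
    """Decide whether a retrieved graph may be added to the cache."""
    if len(graph) >= settings.max_triples_per_graph:
        return False  # only graphs with fewer triples than the limit are kept
    if len(cache) >= settings.max_cached_graphs:
        return False  # assumption: a full cache accepts no further graphs
    return True
```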

4. Navigating Your Data

You can use the Disco browser to enable web surfers to navigate RDF data that you have published on the Web. Just set a link from your HTML website to the browser and pass a URL-encoded URI identifying one of your resources as the browse_uri parameter in order to provide the browser with a starting point.

http://yourServer/rdf_browser/?browse_uri=http%3A%2F%2Fdanbri.org%2Ffoaf%23danbri
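
If you generate such links programmatically, the URL encoding can be done with a standard library call. The helper name disco_link below is hypothetical, and the base URL is the placeholder from the example above.

```python
from urllib.parse import quote


def disco_link(browser_base, resource_uri):
    """Build a link that opens the browser at the given resource."""
    # safe="" makes quote() also escape "/", which it leaves untouched by default.
    return f"{browser_base}?browse_uri={quote(resource_uri, safe='')}"


link = disco_link("http://yourServer/rdf_browser/", "http://danbri.org/foaf#danbri")
print(link)
```

Encoding the "#" matters in particular: left unencoded, the fragment would not be sent to the server and the browser would receive only part of the resource URI.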

Some example links:

5. Related Work

Semantic Web browsers allow you to explore an unbounded set of RDF data sources on the Web. The best-known Semantic Web browser is the feature-rich Tabulator browser, developed by Tim Berners-Lee et al. at the Massachusetts Institute of Technology. Besides exploring the Semantic Web, Tabulator allows you to render RDF data on a map or as a timeline and to query cached data using a point-and-click mechanism.

The Disco - Hyperdata Browser offers an alternative to the Tabulator browser. We think that having such a simple browser might be useful for:

debugging Semantic Web sites,

explaining the ideas behind linked data on the Web to people, and

providing a server-side alternative to Tabulator that works with all Web browsers and does not require any changes to the browser configuration.

Other browsers that can be used to display RDF data, but do not allow you to browse the Semantic Web as an unbounded set of data sources, are Longwell, mSpace, /facet, BrowseRDF, RDFgravity and IsaViz.

6. Support and Feedback

We are interested in hearing your opinion of and your experience with the browser. Please send comments and bug reports to the NG4J-namedgraphs mailing list:

ng4j-namedgraphs@lists.sourceforge.net

The archives of the list are found at http://sourceforge.net/mailarchive/forum.php?forum=ng4j-namedgraphs

You can subscribe to the list at http://lists.sourceforge.net/lists/listinfo/ng4j-namedgraphs

7. Source Code

The source code of the Disco browser is available from the NG4J CVS. The Disco browser is licensed under the terms of the Berkeley Software Distribution (BSD) license.


Other open source projects @ Freie Universität Berlin