The open data movement is in full swing, and tools and standards created in Ireland are set to prove pivotal to the US government’s use of open data. It emerged today that US government agencies have adopted a set of web tools and standards developed in Ireland by researchers at NUI Galway’s Digital Enterprise Research Institute (DERI).

DERI’s technologies are being utilised by Data.gov, a portal developed to bring an unprecedented level of transparency to the US government. DERI’s research, which is funded by Science Foundation Ireland, focuses on enabling networked knowledge, using the latest semantic web and linked data technologies.

Its technologies allow related data that was not previously linked to be connected together, so that a person or computer can see the bigger picture through interlinked datasets. Data.gov allows the linking of open government data from agency publishers to contributions from other public and private organisations.

The move comes in the same week that the inventor of the world wide web, Sir Tim Berners-Lee, was in Dublin to emphasise the power of linkable datasets. He argued that the move by governments and enterprises to open data is inevitable, and that now is the time to begin linking datasets to make the semantic web a reality. Berners-Lee is spearheading the UK’s open data initiative, Data.gov.uk.

"Open data can change the way government works, for example. Instead of traipsing around department to department looking for information, just access it online," Berners-Lee said at the 2012 Teradata Universe conference in Dublin this week. "The goal is to have the data available in a way that is more powerful, linkable and usable."

Berners-Lee pointed out that new query languages like SPARQL are making it possible to link and sift through diverse data sets. He also emphasised the importance of HTML5 in making it possible to view data on all kinds of screens. “Think about the processing power of the web behind all these screens – people will soon be able to do the sort of thing that will make Minority Report look like child’s play.

“With HTML5, every web page is becoming a computing platform."

Stepping into the VOID – in the vanguard of the open data movement

Researchers at DERI in NUI Galway are in the vanguard of the open data space. The largest research organisation of its kind in the world, DERI, with its 140 researchers, is collaborating with industry and governments to revolutionise the utilisation of data.

DERI’s Dr John Breslin, who also lectures in electronic engineering at NUI Galway, explained: “I recently saw a universal toy adaptor that allowed you to connect plastic building blocks to wooden construction sets. Linked data is a bit like that – it’s based on a universal data format that allows you to bring datasets from different realms together, making them more useful as a whole. Your planning applications could be linked to your broadband penetration rates or your traffic congestion data to help identify issues and trends.”
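To make Breslin's analogy concrete, here is a minimal sketch in plain Python of how two datasets might be combined once both are expressed as subject-predicate-object statements, the common shape used by linked data. Every URI, property name and figure below is invented for illustration and does not come from any real dataset.

```python
# Hypothetical namespaces: these URIs are made up for this sketch.
AREA = "http://example.org/area/"
EX = "http://example.org/def/"

# Two independently published datasets, each a set of
# (subject, predicate, object) statements with invented values.
planning = {
    (AREA + "galway-city", EX + "planningApplications", 120),
    (AREA + "salthill", EX + "planningApplications", 45),
}
broadband = {
    (AREA + "galway-city", EX + "broadbandPenetration", 0.71),
    (AREA + "salthill", EX + "broadbandPenetration", 0.64),
}

# Both sources share one statement shape and the same area identifiers,
# so combining them is a plain set union with no schema mapping step.
linked = planning | broadband


def describe(subject, triples):
    """Collect every property known about one subject across all sources."""
    return {p: o for s, p, o in triples if s == subject}


profile = describe(AREA + "galway-city", linked)
print(profile)
```

The point of the sketch is that the shared identifier for each area does the linking: once both publishers name "galway-city" with the same URI, a single lookup returns figures from both sources.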

Among the DERI outputs being used by Data.gov and the related Healthdata.gov site are Neologism and the GRefine RDF Extension. Neologism is a new tool which allows for the easy creation of ‘vocabularies’ needed to link data and is built on the powerful open source content management platform Drupal.

One such vocabulary listed in vocab.data.gov is the Vocabulary of Interlinked Datasets (VoID), which was co-created by DERI researchers. The second technology in use, the RDF Extension for Google Refine, is a graphical user interface for exporting data from Google Refine (a tool for working with messy data) as interlinked semantic web data.
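As a rough illustration of what a description written with the Vocabulary of Interlinked Datasets can look like, the sketch below builds a few statements about a dataset and prints them in the N-Triples text format. The vocabulary terms used (`void:Dataset`, `void:triples`, `void:sparqlEndpoint`) and `dcterms:title` are real; the dataset URI, title, count and endpoint are hypothetical, and this is not how Data.gov or DERI's tools generate such descriptions.

```python
VOID = "http://rdfs.org/ns/void#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCTERMS_TITLE = "http://purl.org/dc/terms/title"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def uri(value):
    """Format a URI as an N-Triples term."""
    return f"<{value}>"


def literal(value, datatype=None):
    """Format a literal as an N-Triples term, escaping backslashes and quotes."""
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    suffix = f"^^<{datatype}>" if datatype else ""
    return f'"{text}"{suffix}'


def void_description(dataset, title, triple_count, endpoint):
    """Return N-Triples lines describing one dataset with VoID terms."""
    statements = [
        (uri(dataset), uri(RDF_TYPE), uri(VOID + "Dataset")),
        (uri(dataset), uri(DCTERMS_TITLE), literal(title)),
        (uri(dataset), uri(VOID + "triples"), literal(triple_count, XSD_INTEGER)),
        (uri(dataset), uri(VOID + "sparqlEndpoint"), uri(endpoint)),
    ]
    return [f"{s} {p} {o} ." for s, p, o in statements]


# Hypothetical dataset details, invented for this sketch.
lines = void_description(
    dataset="http://example.org/dataset/health",
    title="Example health indicators",
    triple_count=1000,
    endpoint="http://example.org/sparql",
)
print("\n".join(lines))
```

A description like this lets a person or a program discover what a dataset is called, how large it is and where it can be queried before downloading any of it.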

George Thomas, enterprise architect with the US Health and Human Services Administration, has said: “More behind-the-scenes work that routinely benefits from substantial DERI engagement includes an ongoing contribution to the creation and promulgation of open standards related to open government data catalogs and communities. But DERI doesn’t stop there, they put these new standards into practice through enhancements to Drupal 7 core, helping make it an even more powerful publishing and visualisation tool for the emerging Web of Data.”

He added: “We hope to leverage all of these features and capabilities in our current and ongoing Healthdata.gov modernisation efforts. They also create lots of other useful tools and pen helpful blog posts that promote the proper use and integration of standards. Furthermore, DERI folks are active in many other efforts to promote structured data using open standards and help to clarify best practices that will ultimately lead to better integration of international government statistics.”

Making waves – DERI’s Social Software Unit

Joint work between DERI and Thomas’ team on Patient Controlled Privacy (using Linked Health Data) will be presented at the Semantic Technologies Conference in San Francisco in June. The work makes use of the Privacy Preference Ontology and related privacy management web applications from DERI’s Social Software Unit.

Data.gov is part of a global initiative referred to as the open data movement, which aims to encourage governments to make public information freely available and easily accessible online. Other examples include data.gov.uk and data.london.gov.uk from the UK, and data.fingal.ie and dublinked.ie from Ireland.

Today, more than 200 regions and countries are publishing their government data online. Three years ago, DERI announced the adoption of its SIOC data format by a website in the Obama administration.

The SIOC format is one of the open data formats being produced by a number of US Government websites that use the latest Drupal platform, including energy.gov (the US Energy Department), policy.house.gov (the Republican Policy Committee), lsc.gov (the civil legal aid program), and oag.ca.gov (the California Attorney General). The DCAT vocabulary from DERI is also used by various government sites for describing government datasets and data catalogues. DERI also collaborates with the European Commission on common semantic vocabularies, such as the Asset Description Metadata Schema (ADMS).

Prof Stefan Decker, director of DERI at NUI Galway, says that while we are seeing open data being used to improve public services and promote more transparent and effective government – that is only part of the story. “Open data has been described recently by the UK’s Cabinet Office Minister Francis Maude as the raw material of a ‘new industrial revolution’. Making more data freely available is resulting in people using it to build new businesses and grow existing ones, creating jobs.

“In Ireland, the open data movement is being pioneered by the likes of Fingal County Council, the Dublinked consortium and the National Cross-Industry Working Group on open data. DERI participates at a national and international level through the provision of best practices, standards and technologies. Open data is key to supporting a truly transparent and participatory democratic system.”

In Ireland, DERI collaborates closely with local authorities and the Local Government Computer Services Board, as well as the National Cross-Industry Working Group on Open Data, to promote open data.

Decker said: “These are exciting times and a true spirit of innovation and entrepreneurship is engulfing the IT world as networked knowledge begins to come into its own. Undoubtedly, 10 years from now, when we look back, we will wonder how we managed with the volumes of unconnected data we have now.”

DERI was founded in 2003 at NUI Galway with support from the Irish Government’s Science Foundation Ireland, as part of a strategic investment in Semantic Web research and business development.