It’s hard enough to come up with metadata for born-digital material such as photographs or electronic documents, but how do you do it for a painting or a sculpture? Pittsburgh’s Carnegie Museum of Art has found a way, using open data.

“Have you ever wondered how artworks arrive at a museum?” writes Jeffrey Inscho for the Museum’s blog. “The journey over time and space that artworks make to arrive at a particular museum at a particular moment in time? Or maybe during a recent museum visit, as you explored a recent Impressionist exhibition, you noticed an interesting set of artworks arranged next to one another. Have they ever shared owners? Were they ever previously in the same city together? Did they, like their human counterparts, all experience a large-scale world event?”

With the museum’s Art Tracks system, you will be able to find out.

While many museums publish ownership and transfer data (known as provenance) online, there was no specifically defined way to represent it, Inscho writes. That made it difficult for people to search the data and compare records. “The current practice does not allow for research to be performed over aggregations of work,” write museum staffers Tracey Berg-Fulton, David Newbury, and Travis Snyder in a paper describing the Art Tracks project. “We can ask if this painting was in England during the 1920s, but we cannot ask which paintings were in England during the 1920s.”
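To make that distinction concrete, here is a minimal Python sketch of the "which paintings" question once location history is stored as structured records. The `Stay` record, the `artworks_in` function, and the sample data are all invented for illustration; none of them come from the Art Tracks code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stay:
    """One stretch of time an artwork spent in one place (hypothetical shape)."""
    artwork: str
    place: str
    start_year: int
    end_year: int  # inclusive


# Invented sample records, for illustration only.
STAYS = [
    Stay("Painting A", "England", 1915, 1923),
    Stay("Painting A", "United States", 1924, 1960),
    Stay("Painting B", "France", 1900, 1935),
    Stay("Painting C", "England", 1929, 1931),
]


def artworks_in(stays, place, first_year, last_year):
    """Return sorted titles of artworks that were in `place` at any point
    between first_year and last_year (both inclusive)."""
    return sorted({
        s.artwork
        for s in stays
        if s.place == place
        and s.start_year <= last_year   # the stay began before the window closed
        and s.end_year >= first_year    # the stay ended after the window opened
    })


print(artworks_in(STAYS, "England", 1920, 1929))
# ['Painting A', 'Painting C']
```

The same records still answer the single-painting question; the difference is that a filter over the whole collection becomes a one-line query instead of a reading exercise.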

As with any open data project, the museum found that the first step was developing a structure for this data, writes Suzette Lohmeyer in Government Computer News. “The Art Tracks team created a provenance standard, designed to resolve ambiguities and provide structure and machine readability,” she writes. “It also captures timeline data on the artwork’s acquisition method, its location, owner and length of ownership.”
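As a rough illustration of what such timeline data might look like in code, the sketch below models one period of ownership with the four pieces of information Lohmeyer lists. The class name, field names, and sample values are assumptions made for this example and are not the Art Tracks schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OwnershipPeriod:
    """One entry in an artwork's provenance timeline (hypothetical fields)."""
    owner: str
    location: str
    acquisition_method: str                  # e.g. "purchase", "gift", "bequest"
    acquired_year: int
    relinquished_year: Optional[int] = None  # None while the owner still holds it

    def years_held(self, as_of_year: int) -> int:
        """Length of ownership, measured to as_of_year if still held."""
        end = self.relinquished_year
        if end is None:
            end = as_of_year
        return end - self.acquired_year


# Invented timeline, for illustration only.
timeline = [
    OwnershipPeriod("Private collector", "Paris", "commission", 1880, 1912),
    OwnershipPeriod("Dealer", "London", "purchase", 1912, 1920),
    OwnershipPeriod("Museum", "Pittsburgh", "gift", 1920),
]

for period in timeline:
    print(period.owner, period.acquisition_method, period.years_held(as_of_year=2000))
```

Leaving `relinquished_year` empty for the current owner is one simple way to mark the end of the timeline; a real standard would also need to express uncertain or approximate dates.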

The first phase of the development consists of four projects, according to the paper (which is, in all, a really lovely example of how to set up such a data project):

A recodification of the standard for writing provenance to allow automated structuring from provenance texts

A software parser that converts semi-structured text into structured data

A user interface that allows researchers to read, modify and verify the automated conversion

A prototype interactive gallery display built using the structured data
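The second and third projects can be pictured with a toy parser. The Python sketch below assumes a much simpler text convention than the real standard (periods separated by semicolons, each written as "owner, place, year" with an optional "by" before the year) and is not the Art Tracks parser. The `parse_provenance` function and the sample text are hypothetical. Anything that does not fit the pattern is kept and flagged rather than dropped, which is the kind of record a researcher would review by hand.

```python
import re
from typing import Dict, List

# Toy pattern: "<owner>, <place>, <year>", with an optional "by " before the year.
PERIOD = re.compile(
    r"^(?P<owner>[^,]+),\s*(?P<place>[^,]+),\s*(?P<qualifier>by\s+)?(?P<year>\d{4})$"
)


def parse_provenance(text: str) -> List[Dict[str, object]]:
    """Split a provenance string on semicolons and structure each period."""
    records: List[Dict[str, object]] = []
    for chunk in text.strip().rstrip(".").split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        match = PERIOD.match(chunk)
        if match is None:
            # Keep the raw text so a person can review and correct it later.
            records.append({"raw": chunk, "parsed": False})
            continue
        records.append({
            "raw": chunk,
            "parsed": True,
            "owner": match["owner"].strip(),
            "place": match["place"].strip(),
            "year": int(match["year"]),
            # "by 1912" means the owner had the work no later than 1912.
            "year_is_upper_bound": match["qualifier"] is not None,
        })
    return records


# Invented provenance text, for illustration only.
SAMPLE = (
    "Jane Doe, Paris, 1880; Example Gallery, London, by 1912; "
    "sold at auction; Example Museum, Pittsburgh, 1920."
)

for record in parse_provenance(SAMPLE):
    print(record)
```

In this sample, "sold at auction" comes back with `parsed` set to `False`, while the other three periods become dictionaries that software can search and compare.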

The best part about the system is that it maintains current workflows, according to the paper’s authors. “A comprehensive text-based provenance standard, paired with a software library that can parse records written using this standard and convert them into structured data, would allow existing workflows to remain in place while allowing structured data to be automatically extracted from provenance records,” they write. “The records could continue to be stored within existing CMS databases but contain machine-readable data for use in research and visualization.”

What this means is that the system will allow museum staff, visitors, and website users to see history unfolding as it traces the movement of collection objects through time and space, with the museum as the final destination, the authors write.

Determining provenance wasn’t always as simple as you might think. For example, the team had to decide how to handle a married couple that owned a piece of art, particularly art from before the 1900s. “The property of Mr. and Mrs. was really the property of the husband, and his alone to bequeath, sell, or give,” writes Berg-Fulton. “What this can mask is a wife’s influence on collecting activities. Women may also have negotiated prices with galleries, commissioned artists, held salons, and explicitly told husbands what to buy, but the work would have been assumed to be his, and so the record would imply that he was a keen connoisseur.”

While the provenance data can often be terse, it can also include a large amount of paper documentation, ranging from correspondence to notes scribbled in a margin, Berg-Fulton writes in a blog post. The Art Tracks system will also be able to accommodate this material, she writes.

One example of how this software could be used is with artwork confiscated by the Nazis, writes Lohmeyer. While there is already an online portal for such information, it was created manually in the early 2000s, she writes.

Like several other museum and library development projects, the Art Tracks software is open source, meaning that other museums will be able to take advantage of it as well. The code libraries and the user-facing entry tool Elysa are all available on GitHub, Lohmeyer writes.

“Standardized, structured data will also allow easy interchange between institutions, and as linked open data becomes less theoretical and more practical, this project positions provenance as another rich, linked data source,” the paper’s authors write.