I recently came back from a conference in Bahrain that focused on, among other things, artificial intelligence and machine learning in art. I am as excited as anybody about the potential to apply these new tools to art and art history, but we do not have all that much data about art in a format that is clean, accessible, and easy to analyze. Without quality data, these new machine learning tools do not add much value to the discourse around art or to its use.

Lack of data has caused other problems, as well. People debate the exact number (which is likely unknowable), but many people suggest that 15-20% of art in museums and on the market is either forged or misattributed. A lack of quality data on art in an easily accessible format contributes to this problem.

So how do we solve the problems around quantity, quality, and accessibility of data in art? This question has been my focus for the last five years as I have built out the Artnome database of artists’ complete works along with new analytics that can only be derived from such a database. However, tackling a problem of this scale requires collaboration and effort from many different experts and groups attacking the problem from many different angles, including museums, collectors, estates, galleries, and auction houses.

In this first part of my series on art and data, I speak with Neal Stimler, Senior Advisor at the Balboa Park Online Collaborative. Neal served over a decade at The Metropolitan Museum of Art in New York City in successive positions. He worked on rights and permissions, designed digitization workflows for The Met’s collection at scale, oversaw partnerships with the Google Cultural Institute and Wikimedia communities, among other organizations, and was the project manager for The Metropolitan Museum of Art’s Open Access program that launched in 2017. Neal’s expertise in cultural heritage has deep roots in data and digital asset management, but it also incorporates areas of practice that include copyright policy, education, public engagement, operations management, and cross-reality technologies.

JB: Thanks for joining us, Neal. Let’s start with the basics. What is Open Access?

NS: The term open access is derived from open academia, where the standard is a Creative Commons Attribution license or better. Open-Access (OA) content - whether we are talking about a piece of art, a piece of writing, or another work - is free of most copyright and licensing restrictions and is often available to the user without a fee. For a work to be OA, the copyright holder grants everyone the ability to copy, use, and build upon the work without restriction. I recommend the essential book Open Access by Peter Suber and Creative Commons’ overview on the topic. The video that most inspired my work in Open Access was “A Shared Culture.” A key aspect of engaging with Open Access, too, is awareness of and dedication to supporting the public domain.

The adoption of open access in museums and the GLAM sector is more recent than in the academy. In the cultural heritage sector, professionals and supporters center around the GLAM-Wiki and OpenGLAM communities of practice. These communities advocate for open-access policies for data, digital assets, and publication resources from galleries, libraries, archives, and museums (GLAMs). Practitioners within and external to cultural institutions build tools to make these world heritage resources available to the public for uses ranging from commercial to creative to scholarly.

JB: What is involved with a museum making its collection available online? How long does it take for a museum to transition from being closed to open access [OA]?

NS: Some resources to consult in this process include The Rights and Permissions Handbook (American Alliance of Museums OSCI 1st Edition; Rowman and Littlefield, 2nd Edition), “Copyright Checkpoint,” and the “Copyright Cortex.” Some museums may also consider RightsStatements.org and the International Image Interoperability Framework (IIIF) to address back-end rights management and image services. The “Collections As Data” project and “Museum APIs” wiki may also be useful resources.

After performing a thorough rights assessment on the assets in question and consulting with licensed legal counsel in their jurisdiction, museums then need to build tools that provide mass self-serve access to data and digital asset sets. These tools typically come in the form of a museum’s collection online website, a public application programming interface (API), and a GitHub repository of data in the .CSV and .JSON formats. Data should be offered with the same permissions and legal frameworks as associated image assets.

Importantly, for a data set to be useful to the broadest spectrum of the public, it must include not only identifying or “tombstone” data for objects, but also rich contextual data like object descriptions, provenance, bibliography, artist biographies, or other data that help users to interpret and understand objects. The API serves application developers and partners, while .CSV and .JSON formatted data mainly support researchers and scholars. Open-access content should be hosted in partnership with crucial aggregation platforms such as Wikidata, Wikimedia Commons, and Internet Archive. Other partners and aggregators might also be valuable, depending on the nature of the collections. Museums, too, should be mindful to evaluate and make decisions about the cultural and ethical considerations of open access in collaboration with communities and scholars.
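
As a rough illustration of the kind of data set described above, here is a minimal Python sketch that writes the same records to both .JSON and .CSV and checks whether a record carries contextual data beyond its tombstone fields. The field names, the sample record, and the license value are all hypothetical placeholders, not any museum’s actual schema or policy.

```python
import csv
import io
import json

# Hypothetical record; every field name and value here is a placeholder.
records = [
    {
        "object_id": 1,
        "title": "Example Vase",
        "artist": "Unknown",
        "date": "ca. 1800",
        "medium": "Porcelain",
        "description": "A placeholder description of the object.",
        "provenance": "Placeholder provenance statement.",
        "bibliography": ["Placeholder citation A", "Placeholder citation B"],
        "license": "CC0",  # assumed: same terms as the associated image
    },
]

# Identifying ("tombstone") fields versus richer contextual fields.
TOMBSTONE_FIELDS = ["object_id", "title", "artist", "date", "medium"]
CONTEXT_FIELDS = ["description", "provenance", "bibliography"]


def to_json(records):
    """Serialize records as JSON, keeping nested lists intact."""
    return json.dumps(records, indent=2, ensure_ascii=False)


def to_csv(records):
    """Serialize records as CSV, flattening list fields into one cell."""
    fields = TOMBSTONE_FIELDS + CONTEXT_FIELDS + ["license"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    for record in records:
        row = dict(record)
        row["bibliography"] = " | ".join(record["bibliography"])
        writer.writerow(row)
    return buffer.getvalue()


def missing_context(record):
    """Return the contextual fields that are absent or empty in a record."""
    return [field for field in CONTEXT_FIELDS if not record.get(field)]
```

A record with only tombstone data would show up with every contextual field listed by `missing_context`, which is one simple way an institution might audit a data set before release.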



The time it takes to go from “closed” to open access depends on an institution’s preparedness. An advanced level of digital transformation is required for an institution to establish policies and deliver the tools necessary to provide quality open access services to the public. An absolute commitment to open access and sincere leadership are required at the executive and upper-management levels for open access initiatives to succeed. Open access should represent a broader philosophical shift across all aspects of the museum’s operations and programming. An internal working group or project team drawn from relevant areas across the organization should be assembled, led by a project manager who owns the project vision and has ultimate decision-making authority. Partnering with allied organizations engaged with an institution’s users and working directly with Creative Commons are strongly recommended as best practice.