22 September 2016

The development of the acronym “FAIR” to describe open data was a stroke of genius. Standing for “Findable, Accessible, Interoperable and Reusable”, it describes four attributes of datasets that are aspirations to achieve machine readability and re-use for an open data world. The shorthand description provided by four attributes, as well as a familiar and friendly word, has led to its adoption as a touchstone for funders and policy groups including the G20 Hangzhou Consensus, the Amsterdam Call for Action on Open Science, the NIH Data Commons and the European Open Science Cloud.

At the FORCE11 Workshop on the Scholarly Commons this week in San Diego, inclusion was a central issue. The purpose of the workshop was to work towards a shared articulation of principles or requirements that would define this shared space. To make any claim of a truly shared and global conception of this “scholarly commons” we clearly need to bake in inclusive processes. In particular we need our systems, rules and norms to remind us, at every turn, to consider different perspectives, needs and approaches. It is easy to sign up to principles that say there should be no barriers to involvement, but much harder to maintain awareness of barriers that we don’t see or experience.

The coining of FAIR was led by a community that wants to emphasise that we need to expand our idea of audiences to include machine readers. As the Scholarly Commons discussion proceeded, and FAIR kept returning as a touch point, I wondered whether we could use its traction for a further expansion, as a mnemonic that would remind us to consider the barriers that we don’t see ourselves. Can we embed in the idea of FAIR the inclusion of users and contributors from different geographies, cultures, backgrounds, and levels of access to research? And might something along the lines of making research “FAIR for All” achieve that?

As I looked at the component parts of FAIR it seemed like this could be a really productive approach:

Accessible

Originally conceived of as “available”, accessibility lends itself easily to expanding in scope to fit with this agenda. Can it be accessed online without pay barriers? Is it accessible to a machine? To a person without web access? To a speaker of a different language? To a non-expert? To someone with limited sight? There are many different types of accessibility, but by forcing ourselves to consider a wider scope we can enhance inclusion. Many people have made excellent arguments that “access is not accessibility” and we can build on that strong base.

Interoperable

In the original FAIR Data Principles, interoperability is concerned mainly with the use of standard description languages and with how resources make reference to related resources. For our purposes we can ask what systems and cultures a project or resource can interoperate with. Is it usable by policy makers? Can it be disseminated via print shops where internet access is not appropriate? Does it link into widely used information systems like Wikipedia (and in the future Wikidata)? Does the form of the resource, or the project, make it incompatible with other efforts?

Re-usable

For machine use of data, re-usability is reasonably easily defined. If we seek to expand the definition it gets more diffuse. This is more than just licensing (although open licensing can help); it also relates to formats and design. Is software designed to be multilingual and are resources provided in a form that supports translation? Are documents provided in editable form as well as print or PDF? While accessibility, interoperability and re-usability are all clearly related, each gives us a different lens to check our commitment to inclusion.

Findable

When I first considered the four components, it seemed that discoverability might not fit the agenda well, but on reflection it became clear that discoverability is perhaps the most important aspect to consider. As an extreme example, something indexed in Google, or available via Wikimedia Commons, doesn’t help if there is no network access. But more generally, the way in which we all search for information shapes the things we discover, and discovery is the first necessary condition for engagement. From the challenges of getting Open Access books into library catalogues to the question of how patients can efficiently search for relevant research, via the systemic problems of how consumer search engines increasingly fail to provide clear provenance for information, the issue of inclusion and engagement starts, and far too often ends, with the challenges of discovery.

Conclusions

A few things become clear in considering this expansion of scope. The FAIR Data Principles provide some clear prescriptions and tests for compliance. Issues of inclusion are much more open-ended. When have we done enough? What audiences do we need to consider? In that sense it becomes much more a direction of travel than an easily definable goal to reach. But actually that was the initial goal: to prompt and provoke us to think more.

It also expands the question of the thing that is FAIR. For FAIR data we need only consider the resource, generally a dataset. With this expansion it is clear that it is both resources and the projects that generate them that we need to consider. A project could generate FAIR outputs without being FAIR itself. But again, this is a journey, not a destination. If we can hold ourselves to a higher standard then we will make progress towards that goal. With limited resources there will be difficult choices to make, but we can still use this idea as a prompt, to ask ourselves if we can do better.

If our goal is to do research that is “FAIR for All”, then we can test ourselves as we improve towards that goal by continuing to ask ourselves at each stage:

Is this FAIR enough?