Article-level innovations continue to unite and disrupt the scholarly publishing scene, writes Stephanie Dawson

For many years, the scholarly journal or book series was the most important context for an article. The journal provided the framework in which an article was understood and valued – an exclusive venue, an expert editor, or an important society lent gravitas to a researcher’s work. To flip through the pages of the newest issue in the library was to put an article in the context of other ideas and authors in a topical selection. This context was rated and ranked by the journal 'impact factor', which readers used to orient themselves in their communities. And of course the journal context continues to play a role for researchers.

But the internet has disrupted this framework of understanding with powerful new tools for search, discovery and evaluation on the level of the article and the individual. Speed and efficiency, combined with the explosive growth in journal and article numbers, have completely changed how researchers find information.

They no longer first encounter an article within the context of a journal, bound between glossy covers. They search for keywords on global search platforms such as Google, Baidu, or PubMed. They find an article within the context of this search and use clues from the metadata, such as title and abstract, to assess whether it is relevant to them. High-profile journal brands may continue to be indicators of quality, but researchers also rely on affiliations, funding body, number of citations or Altmetric score to evaluate an article.

Context has become multi-dimensional, publisher-independent and intrinsic to the article.

Publishers have responded in innovative ways to this new reality. At the recent annual conference of the Society for Scholarly Publishing there were numerous sessions on the importance of persistent identifiers to make content more discoverable and provide new kinds of context for readers. ORCID has been a central, independent player, providing persistent identifiers for authors to make sure that their publications are correctly attributed. When evaluating an article, it helps the reader to know whether the author has written one or 300 articles on a certain subject.

Many publishers have begun requiring an ORCID ID at submission and then automatically updating the author’s ORCID profile after publication. It has therefore become easier than ever to browse an interesting author’s works – without ORCID, this remains a challenge for authors named Wang or Smith. Persistent identifiers are a structural way for publishers to connect and interconnect their articles within the greater fabric of the scientific literature.

A similar movement is taking place on the level of funding bodies. CrossRef has introduced the Open Funder Registry to standardise a funder taxonomy and add funder information to the article metadata. The European OpenAIRE project is further helping researchers to link their research results with funding post-publication via institutional repositories. These efforts would allow a reader to filter searches by funder as a quality criterion. This is relatively new and is quite a bit of extra effort for publishers and authors, so some ask what is in it for them. The answer is more context at the article level, which translates to more discoverability and quality assurance in a global flood of nearly two million publications per year.

The same thing is happening on an institutional level. Companies such as Ringgold and GRID are standardising and normalising institutional names so that they can be added to article metadata and provide reliable, searchable information about author affiliations. Unfortunately, the industry is not yet at the point of collaborating on standards in this area, but this is certainly needed. Again, affiliations are an important part of the context of an article and can signal a certain level of quality to readers.

I could further mention portals such as DataCite and Figshare that give DOIs for data, or publishers such as EMBO, F1000 Research and others which include peer review reports as part of the article. Behind all of these efforts is the assumption that the article must both function autonomously and interconnect via persistent identifiers with the whole body of scientific literature. This requires a high level of cooperation and meticulous attention to detail from publishers, which should be commended.

It is worth looking back to the first publisher collaboration of this kind – CrossRef. Started in 2000 with 12 founding members, it now includes essentially every serious academic publisher worldwide. CrossRef registers a DOI for each article. Publishers and typesetters can look up and link DOIs for the articles in a reference list, which gives the reader of the final product a superior experience: they can click directly through to a cited article of interest. References are at the beginning and core of the context that each article carries within its XML. They are article-based and publisher-independent. And thanks to the internet and the publisher collaboration via CrossRef they are digital, standardised and linked.

So where will context at the article level take us in the future? For one example, at ScienceOpen we are building a platform to expose the multi-dimensional context of an article to support discoverability.

We started with references as the first carrier of context. We analysed the references of around million articles and created an article record for each reference then merged multiple instances to develop a network of citations. The result is that a researcher can now search in 14 million articles and article records and sort by citation number, but also easily explore the context around each individual article. The use of DOIs has also made it possible to track article usage in many places on the internet beyond citations, which Altmetric and other companies have built up to a growing standard. On ScienceOpen 14 million articles can be sorted and explored by altmetrics, as well as a variety of other parameters. And context can also be user generated post-publication with commenting, recommendations and peer-review functions.
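For readers who like to see the idea in code, the reference-merging step described above can be sketched roughly as follows. This is a minimal illustration only, not ScienceOpen's actual implementation: it assumes every reference carries a DOI and that duplicate records can be merged by comparing normalised DOIs. The function names, the input shape and the example DOIs are all hypothetical.

```python
from collections import defaultdict


def normalise_doi(doi):
    """Lower-case a DOI and strip common prefixes so duplicates compare equal.

    DOIs are case-insensitive, so lower-casing is safe for matching.
    """
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def build_citation_network(articles):
    """Merge reference records into one record per DOI.

    `articles` maps a citing DOI to the list of DOIs in its reference list
    (a hypothetical input shape chosen for this sketch).
    Returns a mapping: DOI -> set of DOIs of the articles that cite it.
    """
    cited_by = defaultdict(set)
    for citing, references in articles.items():
        citing_key = normalise_doi(citing)
        cited_by[citing_key]  # make sure the citing article has a record too
        for ref in references:
            cited_by[normalise_doi(ref)].add(citing_key)
    return cited_by


def sort_by_citations(cited_by):
    """Return DOIs ordered by citation count, most cited first."""
    return sorted(cited_by, key=lambda doi: (-len(cited_by[doi]), doi))


# Made-up DOIs for illustration only.
example = {
    "10.1234/a": ["10.1234/c", "https://doi.org/10.1234/C"],
    "10.1234/b": ["doi:10.1234/c", "10.1234/a"],
}
network = build_citation_network(example)
ranking = sort_by_citations(network)
```

In this toy data the three spellings of the same cited DOI collapse into a single record, which is what turns a pile of reference lists into a network that can be sorted and explored. Real reference data is messier, since many references lack a DOI and have to be matched on title, author and year instead.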

I predict that article-level innovations will continue to unite and disrupt the publishing landscape. The introduction of an increasing number of standardised identifiers and XML metadata requirements makes the work of professional academic publishers ever more complex. But it also makes clear what their value proposition for authors must be – better tagging at the article level means richer context and increased discoverability for authors. CrossRef and other successful collaboration projects have shown that unity around persistent identifiers benefits all players, even as they open the doors to the internet’s disruptive power.

Stephanie Dawson is managing director at ScienceOpen