Kicking off the new year with 50 million article records!

Image by Epic Fireworks, Flickr, CC BY

We made it! ScienceOpen reached a major milestone: 50 million article records in 5 years of making science open! What’s more, this number is increasing faster and faster as we index more articles. ScienceOpen’s aggregation engine enables us to track citation genealogies and identify similar publications from published articles, making it possible to exponentially push the boundaries of our research discovery environment.

To mark our successful 5-year journey to 50 million records, ScienceOpen CEO Stephanie Dawson talks about the meaning of this milestone for ScienceOpen’s future and scholarly communication in general.

What does this major milestone mean for ScienceOpen?

50 million is a big number and highlights more than ever the importance of smart filtering tools to find and assess relevant research. As the number of articles has grown in our discovery environment, we have increasingly focused our attention on projects such as community curation with our Collections, and filters like open access, preprints or affiliation to help narrow down search results.

It is also important to remember that these 50 million articles are all in some way related to one another. Article records are added to ScienceOpen because they were cited by another article, part of a user’s ORCID profile or requested by a Collection editor. Most of those 50 million articles were cited at least once. Going forward, a better understanding of the relationships between ideas will help us to create smarter recommendations for researchers.

Can you elaborate on researcher-led Collections and how they facilitate discoverability?

A “Collection” on ScienceOpen can be a living, dynamic bibliography, or an “overlay” or “pop-up” journal that taps into our review infrastructure. It can provide a comprehensive overview of the literature on a particular subject, or it can highlight just the very best research in a field. We now have over 300 Collections on the platform, and we work closely with editors to understand their needs and to further develop this feature.

The Collection gives both users and editors an opportunity to see research articles in a new context beyond the journal container. From the beginning at ScienceOpen we have been interested in exploring the opportunities inherent in a digital, networked environment for scholarly communication.

Was that the idea behind ScienceOpen when it was founded in 2013?

Stephanie Dawson, ScienceOpen CEO

Back in 2013 we essentially envisioned the ScienceOpen discovery environment very much as it exists now, but with a narrower definition of Open Access. We began by aggregating only Open Access articles and using them to create context for articles we published in our own megajournal. However, we quickly understood from our Collection editors that they needed to be able to refer to the full range of published research. We did not want to add millions of articles to the platform and then only be able to search by date, so we began analyzing the references of our Open Access corpus and building a citation network. We also began cooperating with Altmetric to add further sorting mechanisms. Simultaneously, we began building smarter search and curation tools to help our users keep pace with the growing number of articles. As our focus shifted from publishing to creating discovery and curation infrastructure, we began to work with other publishers to build a service portfolio embedded in the ScienceOpen technology.

Of course, we have also depended on partnerships along the way. We have always focused on digital and networked communication and therefore participated from the beginning in collaborative initiatives like ORCID and Crossref. The Initiative for Open Citations (I4OC) and Metadata 2020 have been important motors for getting more and better metadata into the system, and we have been active in those groups. With 50 million article records, we also care deeply about research assessment, and we were therefore early signers of DORA and helped found Peer Review Week.

Which development(s) in the last 5 years would you highlight as the most significant?

The last 5 years have certainly been a whirlwind. In the first round of development in 2013, the creation of a peer review infrastructure was probably our most significant achievement. We worked with our board of over 100 researchers, many experts in peer review, editors, and senior researchers, to refine the language and functionality. Our next big development in 2015 was the creation of our citation network and building out the search and discovery tools. In 2016 we launched the advanced user interface My ScienceOpen, which allows authors to add lay summaries, keywords, and thumbnail images to their articles and track their usage in a dynamic dashboard. Over the last two years we have focused on our Collection functionality and on developing our service portfolio for publishers. We have also kept an eye on developments in preprints, adding new content, a special filter, and a ScienceOpen preprint repository.

How does reaching 50 million aggregated articles play into ScienceOpen’s upcoming plans? What comes next?

Expect to see us helping publishers explore new ways of curating content beyond the journal – to go beyond traditional silos and discover new potential in interdisciplinarity. Community curation will be the big focus of the year and we will be actively recruiting new Collection editors and listening hard to their needs. Kicking off the year with over 50 million article records, we will also be thinking about new kinds of filters needed for Open Science. Openness is still at the core of what we do.

What makes ScienceOpen unique in the field of scholarly communications?

What makes ScienceOpen’s discovery environment unique is its interactive features. Because we see ourselves mainly as a linking hub, connecting readers to the publisher version of record, we don’t allow users to upload PDFs of their works. But we do encourage users to enrich their metadata for better discoverability, and the entire community can profit from that. We help researchers to share their expertise and track their impact. Researchers can get involved in a range of ways and we are happy to provide tools and support for their projects.

The other unique feature on ScienceOpen is our publisher services within this discovery environment. From marketing to full Open Access hosting and publishing services, our powerful technology for understanding, exposing, and creating context can benefit journal publishers both large and small, Open Access and hybrid, HSS and STM. We are excited to see what new customers the new year will bring.

50 million articles testify to ScienceOpen’s achievements and contributions to open science in just 5 years. At ScienceOpen, we look forward to the next milestones on our journey to making research easily discoverable. Celebrate with us by giving us feedback, creating a Collection in your research field, or reviewing a preprint, poster, or article!