Such a discrepancy between a dynamic history and remembered consistency could be a consequence of peer review processes being central both to scholarly identity as a whole and to the identity and boundaries of specific communities (Moore et al., 2017). Indeed, this story linking identity to peer review is taught to junior researchers as a community norm, often without the necessary historical context. More work on how peer review, alongside other community practices, contributes to community building and sustainability would be valuable. Examining criticisms of conventional peer review, and proposals for change, through the lens of community formation and identity may be a productive avenue for future research.

As mentioned above, there is an increasing quantity and quality of research examining how publication processes, selection, and peer review evolved from the 17th to the early 20th century, and how this relates to broader social patterns (Baldwin, 2017a; Baldwin, 2017b; Fyfe et al., 2017; Moxham & Fyfe, 2017). However, much less research critically explores the diversity of selection and peer review processes in the mid- to late-20th century. Indeed, there seems to be a remarkable discrepancy between the historical work we do have (Baldwin, 2017a; Gupta, 2016; Rennie, 2016; Shuttleworth & Charnley, 2016) and apparent community views that “we have always done it this way,” alongside what sometimes feels like a wilful effort to ignore the current diversity of practice. The result is an overall lack of evidence about the mechanics of peer review (e.g., time taken to review, conflict resolution, demographics of engaged parties, acceptance rates, quality of reviews, inherent biases, impact of referee training), both for the traditional process and for ongoing innovations, which obscures our understanding of the functionality and effectiveness of the present system (Jefferson et al., 2007). However, such a lack of evidence should not be misconstrued as evidence that these systems fail, but rather interpreted as reflecting the difficulty of empirically assessing the effectiveness of such a diversity of peer review practices.

In many cases, there is an attempt to link the goals of peer review processes with Mertonian norms (Lee et al., 2013; Merton, 1973) (i.e., universalism, communalism, disinterestedness, and organized scepticism) as a way of showing their relation to shared community values. The Mertonian norm of organized scepticism is the most obvious link, while the norm of disinterestedness can be linked to efforts to reduce systemic bias, and the norm of communalism to the expectation of contributing to peer review as part of community membership (i.e., duty). In contrast to the emphasis on supposedly shared social values, relatively little attention has been paid to the diversity of peer review processes across journals, disciplines, and time (an early exception is Zuckerman & Merton (1971)). This is especially the case as the (scientific) scholarly community appears overall to have a strong investment in a “creation myth” that links the beginning of scholarly publishing—the founding of The Philosophical Transactions of the Royal Society—to the invention of peer review. The two are often regarded as necessarily coupled, largely ignoring the complex and interwoven histories of peer review and publishing. This has consequences, as the individual identity of a scholar is strongly tied to specific forms of publication that are evaluated in particular ways (Moore et al., 2017). A scholar’s first research article, doctoral thesis, or first book are significant life events. Membership of a community, therefore, is validated by the peers who review this newly contributed work. Community investment in the idea that these processes have “always been followed” appears very strong, but ultimately remains a fallacy.

In addition to being used to judge submitted material for acceptance at a journal, review comments provided to authors serve to improve both the work itself and the authors’ writing and analysis skills. This feedback can lead to improvements to the submitted work that are iterated between the authors, reviewers, and editor, until the work is either accepted or the editor decides that it cannot be made acceptable for their specific journal. In other cases, it allows the authors to improve their work in preparation for a new submission to another venue. In both cases, a good (i.e., constructive) peer review should provide general feedback that allows authors to improve their skills and competency in preparing and presenting their research. In a sense, good peer review can serve as distributed mentorship.

The systematic use of external peer review has become entwined with the core activities of scholarly communication. Without approval through peer review to assess importance, validity, and journal suitability, research articles do not become part of the body of scientific knowledge. While in the digital world the costs of dissemination are very low, the marginal cost of publishing articles is far from zero (e.g., due to time and management, hosting, marketing, and technical and ethical checks). The economic motivations for continuing to impose selectivity in a digital environment, and for applying peer review as a mechanism for this, have received limited attention or questioning, and are often simply regarded as how things are done. Selectivity is now often attributed to quality control, but may be more about building the brand of, and demand for, specific publishers or venues. Proprietary reviewer databases that enable high selectivity are seen as a good business asset. In fact, this attribution rests on the false assumption that peer review requires careful selection of specific reviewers to assure a definitive level of adequate quality, termed the “Fallacy of Misplaced Focus” by Kelty et al. (2008).

In the following section, we summarize the ebb and flow of the debate around the various and complex aspects of conventional peer review. In particular, we highlight how innovative systems are attempting to resolve some of the major issues associated with traditional models, explore how new platforms could improve the process in the future, and consider what this means for the identity, role, and purpose of peer review within diverse research communities. The aim of this discussion is not to undermine any specific model of peer review in a quest for systemic upheaval, or to advocate any particular alternative model. Rather, we acknowledge that the idea of peer review is critical for research and advancing our knowledge, and as such we provide a foundation for future exploration and creativity in improving an essential component of scholarly communication.

In spite of such studies, there appears to be a widening gulf between the rate of innovation and the availability of quantitative, empirical research regarding the utility and validity of modern peer review systems (Squazzoni et al., 2017a; Squazzoni et al., 2017b). This should be deeply concerning given the significance that has been attached to peer review as a form of community moderation in scholarly research. Indeed, very few journals appear to be committed to objectively assessing the effectiveness of their peer review processes (Lee & Moher, 2017). The consequence is that much remains unknown about the “black box” of peer review, as it is sometimes called (Smith, 2006). The optimal designs for understanding and assessing the effectiveness of peer review, and therefore improving it, remain poorly understood, as the data required to do so are often not available (Bruce et al., 2016; Galipeau et al., 2015). This also makes it very hard to measure and assess the quality, standard, and consistency of peer review, not only between articles and journals, but also on a system-wide scale across the scholarly literature. Research into such aspects of peer review is time-consuming and intensive, particularly when investigating traits such as validity, and the criteria for assessing these are often based on post-hoc measures such as citation frequency.

1.1.4 Evidence from studies of peer review. Several empirical studies on peer review have been reported in the past few decades, mostly at the journal or population level. These studies typically use several different approaches to gather evidence on the functionality of peer review. Some, such as Bornmann & Daniel (2010b), Daniel (1993), and Zuckerman & Merton (1971), used access to journal editorial archives to calculate acceptance rates, assess inter-reviewer agreement, and compare acceptance rates across various article, topic, and author features. Others interviewed or surveyed authors, reviewers, and editors to assess attitudes and behaviours, while still others conducted randomized controlled trials to assess aspects of peer review bias (Justice et al., 1998; Overbeke, 1999). A systematic review of these studies concluded that evidence supporting the effectiveness of peer review training initiatives was inconclusive (Galipeau et al., 2015), and that major knowledge gaps existed in our application of peer review as a method to ensure the high quality of scientific research outputs.

The past five to ten years have seen an accelerating wave of innovation in peer review, which we term “the revolution” phase (Figure 2; note that this is a non-exhaustive overview of the peer review landscape). Initiatives such as the San Francisco Declaration on Research Assessment (DORA; ascb.org/dora/), which called for systemic changes in the way that scientific research outputs are evaluated, and advances in Web-based technologies, are likely catalysts for such innovation. Born-digital journals, such as the PLOS series, introduced commenting on published papers, and Rapid Responses by BMJ has been highly successful in providing a platform for formalised comments (bmj.com/rapid-responses). Such initiatives spurred developments in cross-publisher annotation platforms like PubPeer (pubpeer.com/) and PaperHive (paperhive.org/). Some journals, such as F1000 Research (f1000research.com/) and The Winnower (thewinnower.com/), rely exclusively on a model in which peer review is conducted after manuscripts are made publicly available. Other services, such as Publons (publons.com/), enable reviewers to claim recognition for their activities as referees. Academic Karma (academickarma.org/) originally offered a similar service to Publons, but has since adapted its model to facilitate peer review of preprints. Platforms such as ScienceOpen (scienceopen.com/) provide a search engine combined with cross-publisher peer review of all documents, regardless of whether manuscripts have been previously reviewed. Each of these innovations has partial parallels to other social Web applications or platforms in terms of transparency, reputation, performance assessment, and community engagement. It remains to be seen whether these new models of evaluation will become more popular than traditional peer review, either singly or in combination.

The launch of Open Journal Systems (OJS; kp.sfu.ca/ojs/) in 2001 offered a step towards bringing journals and peer review back to their community-led roots, by providing the technology to implement a range of potential peer review models within a low-cost, open source platform. As of 2015, the OJS platform provided the technical infrastructure and editorial and peer review workflow management support to more than 10,000 journals (Public Knowledge Project, 2016). Its exceptionally low cost is perhaps why around half of these journals are published in the developing world (Edgar & Willinsky, 2010).

1.1.3 The peer review revolution. In the last several decades, boosted by the emergence of Web-based technologies and prompted in part by the ever-increasing volume of published research, there have been substantial innovative efforts to decouple peer review from the publishing process (Figure 2; Schmidt & Görögh, 2017). Much of this experimentation has been based on earlier precedents, and in some cases represents a total reversal to historical processes. Such decoupling attempts have typically been achieved by adopting peer review as an overlay process on top of formally published research articles, or by pursuing a “publish first, filter later” protocol, with peer review taking place after the initial publication of research results (BioMed Central, 2017; McKiernan et al., 2016; Moed, 2007). Here, the meaning of “publication” becomes “making public,” as in legal and common usage, as opposed to the scholarly publishing sense, where it also implies “peer reviewed,” a conflation unique to research scholarship. In fields such as Physics, Mathematics, and Economics, it is common for authors to send their colleagues paper or electronic copies of their manuscripts for pre-submission evaluation. Launched in 1991, arXiv (arxiv.org) formalized this process by creating a central network for whole communities to access such e-prints. Today, arXiv holds more than one million e-prints from various research fields and receives more than 8,000 submissions per month (arXiv, 2017). Here, e-prints or preprints are not formally peer reviewed prior to publication, but still undergo a degree of moderation by experts in order to filter out non-scientific content. This practice represents a significant shift, as public dissemination was decoupled from a formalised editorial peer review process. It results in increased visibility and citation rates for articles that are deposited both in repositories like arXiv and in traditional journal venues (Davis & Fromerth, 2007; Moed, 2007).

The result is that modern peer review has become enormously complicated. By allowing the process to be managed by a hyper-competitive publishing industry and integrated with academic career progression, developments in scholarly communication have become strongly coupled to the transforming nature of academic research institutes. These institutes have evolved into internationally competitive businesses that strive for impact through journal publication, often mediated by commercial publishers through attempts to align their products with the academic ideal of research excellence (Moore et al., 2017). This is plausibly related to, or even a consequence of, broader shifts towards a more competitive, neoliberal academic culture (Raaper, 2016). Here, emphasis is largely placed on production and standing, value, or utility (Gupta, 2016), as opposed to the original primary focus of research on discovery and novelty.

This editor-led process of peer review became increasingly mainstream and important in the post-World War II decades, and is what we term “traditional” or “conventional” peer review throughout this article. Such expansion was primarily due to the development of a modern academic prestige economy based on the perception of quality or excellence surrounding journal-based publications (Baldwin, 2017a; Fyfe et al., 2017). Peer review increasingly gained symbolic capital as a process of objective judgement and consensus. The term itself became formalised in research processes, borrowed from government bodies which employed it to aid the selective distribution of research funds (Csiszar, 2016). The increasing professionalism of academies enabled commercial publishers to use peer review as a way of legitimizing their journals (Baldwin, 2015; Fyfe et al., 2017), capitalizing on the traditional perception of peer review as a voluntary duty of academics. A consequence was that peer review became a more homogenized process, which enabled private publishing companies to thrive and eventually establish a dominant, oligopolistic marketplace position (Larivière et al., 2015). This represented a shift from peer review as a synergistic activity among scholars to commercial entities selling it as an added-value service back to the same academic community that was performing it freely for them. The estimated cost of peer review is a minimum of £1.9bn per year (in 2008; Research Information Network, 2008), representing a substantial vested financial interest in maintaining the current process (Smith, 2010). Such estimates do not account for overhead costs in publisher management, or for the redundancy of the reject-resubmit cycle that authors enter owing to the competition for the symbolic value of journal prestige (Jubb, 2016).

1.1.2 Adaptation through commercialisation. Peer review in forms that we would now recognize emerged in the early 19th century, due to the increasing professionalism of science, and primarily through English scholarly societies. During the 19th century, there was a proliferation of scientific journals, and the diversity, quantity, and specialization of the material presented to journal editors increased. Peer evaluations evolved to become more about judgements of scientific integrity, but such processes were never intended for the purposes of gate-keeping (Csiszar, 2016). Research diversification made it necessary to seek assistance beyond the immediate group of knowledgeable reviewers from the journals’ sponsoring societies (Burnham, 1990). Evaluation thus evolved into a largely outsourced process, a model which persists in modern scholarly publishing. The current system of formal peer review, and the use of the term itself, only emerged in the mid-20th century, in a very piecemeal fashion (and in some disciplines, the late 20th century or early 21st; see Graf, 2014, for an example of a major philological journal which began systematic peer review in 2011). Nature, now considered a top journal, did not initiate any sort of peer review process until at least 1967, only becoming part of the formalised process in 1973 (nature.com/nature/history/timeline_1960s.html).

Any discussion of innovations in peer review must appreciate its historical context. By understanding the history of scholarly publishing and the interwoven evolution of peer review, we recognize that neither is a static entity; the two co-vary. By learning from historical experience, we can also become more aware of how to shape the future direction of peer review and gain insight into what the process should look like in an optimal world. The actual term “peer review” only appeared in the scientific press in the 1960s. Even in the 1970s, it was often associated with grant review rather than with evaluation and selection for publishing (Baldwin, 2017a). However, the history of evaluation and selection processes for publication clearly predates the 1970s.

This article provides a general review of conventional journal article peer review and an evaluation of recent and current innovations in the field. It is not a systematic review or meta-analysis of the empirical literature (i.e., we did not perform a formal search strategy with specific keywords). Rather, a team of researchers with diverse expertise in the sciences, scholarly publishing and communication, and libraries pooled their knowledge to collaboratively and iteratively analyze and report on the present literature and current innovations. The reviewed and cited articles were identified and selected through searches of general research databases (e.g., Web of Science, Google Scholar, and Scopus) as well as specialized research databases (e.g., Library & Information Science Abstracts (LISA) and PubMed). Particularly relevant articles were used to seed identification of cited articles, citing articles, and articles related by citation. The team co-ordinated efforts using an online collaboration tool (Slack) to share, discuss, debate, and come to consensus. Authoring and editing were also done collaboratively and in public view using Overleaf. Each co-author independently contributed original content and participated in the reviewing, editing, and discussion process.

The goal of this article is to investigate the historical evolution in the theory and application of peer review in a socio-technological context. We use this as the basis to consider how specific traits of consumer social Web platforms can be combined to create an optimized hybrid peer review model that we suggest will be more efficient, democratic, and accountable than existing processes.

Traditionally, the function of peer review has been to act as a vetting procedure or gatekeeper assisting the distribution of limited resources, for instance space in peer reviewed print publication venues. With the advent of the internet, these physical constraints on distribution are no longer present and, at least in theory, we are now able to disseminate research content rapidly and at relatively negligible cost (Moore et al., 2017). This has led to the innovation and increasing popularity of digital-only publication venues that vet submissions based exclusively on the soundness of the research, often termed “mega-journals” (e.g., PLOS ONE, PeerJ, the Frontiers series). Such flexibility in the filtering function of peer review reduces, but does not eliminate, its role as a selective gatekeeper, and can be considered “impact neutral.” Owing to such digital experimentation, ongoing discussions about peer review are intimately linked with contemporaneous developments in Open Access (OA) publishing and broader changes in open scholarship (Tennant et al., 2016).

Peer review is not a singular or static entity. It comes in various flavors resulting from different approaches to the relative timing of review in the publication cycle, the reciprocal transparency of the process, and contrasting disciplinary practices (Ross-Hellauer, 2017). Such interdisciplinary differences have made the study and understanding of peer review highly complex, and implementing any systemic change is fraught with the challenge of synchronous adoption between heterogeneous communities, often with vastly different social norms and practices. The criteria used for evaluation, including methodological soundness or expected scholarly impact, are important variables to consider, and again vary substantially between disciplines. Nonetheless, peer review is still often perceived as a “gold standard” of scholarly communication (e.g., D’Andrea & O’Dwyer (2017); Mayden (2012)), despite the inherent diversity of the process and the fact that it was never originally intended to be used as such. Peer review is a diverse method of quality control, applied inconsistently in both theory and practice (Casnici et al., 2017; Pontille & Torny, 2015), and generally lacking any form of transparency or formal standardization. As such, it remains difficult to know precisely what a “peer reviewed publication” means.

Peer review is a core part of our self-regulating global scholarship system. It defines the process in which professional experts (peers) are invited to critically assess the quality, novelty, theoretical and empirical validity, and potential impact of research by others, typically while it is in the form of a manuscript for an article, conference, or book (Daniel, 1993; Kronick, 1990; Spier, 2002; Zuckerman & Merton, 1971). For the purposes of this article, we address peer review exclusively in the context of manuscript selection for scientific research articles, with some initial considerations of other outputs such as software and data. In this form, peer review is becoming increasingly central as a principle of mutual control in the development of scholarly communities that are adapting to digital, information-rich, publishing-driven research ecosystems. Consequently, peer review is a vital component at the core of research communication processes, with repercussions for the very structure of academia, which largely operates through a peer reviewed, publication-based reward and incentive system (Moore et al., 2017). Forms of peer review beyond that for manuscripts are also clearly important and are used in other contexts such as academic appointments, the allocation of measurement time, research ethics, or research grants (see, e.g., Fitzpatrick, 2011b, p. 16), but a holistic discussion of all forms of peer review is beyond the scope of the present article.

Axios Review was closed down in early 2017 due to a lack of uptake from researchers, with the founder stating: “I blame the lack of uptake on a deep inertia in the researcher community in adopting new workflows” (Davis, 2017). Combined with the generally low uptake of decoupled peer review processes, this suggests an overall reluctance of many research communities to move outside the traditional coupled model. In this section, we have discussed a range of different arguments, variably successful platforms, and surveys and reports about peer review. Taken together, these reveal considerable friction around experimenting with peer review beyond the model that is typically, and incorrectly, viewed as the only way of doing it. Much of this can be ascribed to tensions between evolving cultural practices, social norms, and the different stakeholder groups engaged with scholarly publishing. This reluctance is emphasized in recent surveys: for instance, Ross-Hellauer (2017) suggests that while attitudes towards the principles of OPR are rapidly becoming more positive, faith in its execution is not. We can perhaps expect this divergence given the rapid pace of innovation, which has not been accompanied by rigorous or longitudinal evidence that these models are superior to the traditional process at either a population or system-wide level (although see Kovanis et al. (2017)). Cultural or social inertia, then, is defined by this cycle between low uptake and limited incentives and evidence. Perhaps more important is the general under-appreciation of the intimate relationship between social and technological barriers, an appreciation of which is undoubtedly required to break this cycle. The proliferation of social media over the last decade provides excellent examples of how digital communities can leverage new technologies to great effect.

While several new overlay journals are currently thriving, their track record of success is invariably limited, and most journals that experimented with the model returned to their traditional coupled roots (Priem & Hemminger, 2012). Finally, it is worth noting that not a single overlay journal appears to have emerged outside of physics and mathematics (Priem & Hemminger, 2012). This is despite the fast growth of arXiv spin-offs like bioRxiv, and the potential for layered peer review through services such as the recently launched Peer Community In (peercommunityin.org).

As recently as 2012, it was reported that relatively few platforms allowed users to evaluate manuscripts post-publication (Yarkoni, 2012). Even platforms such as PLOS have a restricted scope and limited user base: analysis of publicly available usage statistics indicates that, at the time of writing, PLOS articles have each received an average of 0.06 ratings and 0.15 comments (see also Ware (2011)). Part of this may be due to how post-publication peer review is perceived culturally, with the name itself considered anathema, even an oxymoron, since most researchers consider a published article to be one that has already undergone formal peer review. At present, it is clear that while there are numerous platforms providing decoupled peer review services, these are largely non-interoperable. The result, especially for post-publication services, is that most evaluations are difficult to discover, easily lost, or rarely available in an appropriate context or platform for re-use. To date, little effort seems to have been focused on aggregating the content of these services (with exceptions such as Publons), which hinders the recognition of such activity as a valuable community process and its use in additional evaluation or assessment decisions.

2.5.4 Limitations of decoupled peer review. Despite a general appeal for post-publication peer review and considerable innovation in this field, the appetite among researchers remains limited, reflecting an overall lack of engagement with the process (e.g., Nature (2010)). Such a discordance between attitudes and practice is perhaps best exemplified by instances such as the “#arseniclife” debate. Here, a high-profile but controversial paper was heavily critiqued in settings such as blogs and Twitter, constituting a form of social post-publication peer review that occurred much more rapidly than any formal responses in traditional academic venues (Yeo et al., 2017). Such social debates are notable, but have yet to become mainstream beyond rare, high-profile cases.

2.5.3 Peer Review by Endorsement. A relatively new mode of named pre-publication review is that of pre-arranged and invited review, originally proposed as author-guided peer review (Perakakis et al., 2010), but now often called Peer Review by Endorsement (PRE). This has been implemented at RIO, and is functionally similar to the Contributed Submissions of PNAS (pnas.org/site/authors/editorialpolicies.xhtml#contributed). This model requires an author to solicit reviews from their peers prior to submission in order to assess the suitability of a manuscript for publication. While some might see this as a potential source of bias, it is worth bearing in mind that many journals already ask authors whom they want to review their papers, or whom they wish to exclude. To mitigate pre-submission bias, reviewer identities and their endorsements are made publicly available alongside manuscripts, which also prevents potentially deleterious editorial criteria from inhibiting the publication of research. PRE has also been suggested by Jan Velterop to be a much cheaper, legitimate, unbiased, faster, and more efficient alternative to the traditional publisher-mediated method (theparachute.blogspot.de/2015/08/peer-review-by-endorsement.html). In theory, depending on the state of the manuscript, this means that submissions can be published much more rapidly, as less processing is required post-submission (e.g., in finding suitable reviewers). PRE also has the potential advantage of being more useful to non-native English speaking authors, by allowing them to work with editors and reviewers in their first languages. However, possible drawbacks of this process include positive bias introduced by author-recommended reviewers, as well as the potential for abuse through suggesting fake reviewers. Such a system therefore highlights the crucial role of an editor in verification and mediation.

2.5.2 Two-stage peer review and Registered Reports. Registered Reports represent a significant departure from conventional peer review in terms of relative timing and increased rigour (Chambers et al., 2014; Chambers et al., 2017; Nosek & Lakens, 2014). Here, peer review is split into two stages. Research questions and methodology (i.e., the study design itself) are subject to a first round of evaluation prior to any data collection or analysis taking place (Figure 4). Such a process is analogous to clinical trial registration in medical research, which became widespread many years before Registered Reports and is a well-established, specialised process from which innovative peer review models could learn a great deal. If a protocol is found to be of sufficient quality to pass this stage, the study is provisionally accepted for publication. Once the research has been completed and written up, the manuscript is subject to a second stage of peer review which, in addition to affirming the soundness of the results, confirms that data collection and analysis occurred in accordance with the originally described methodology. The format, originally introduced by the psychology journals Cortex and Perspectives on Psychological Science in 2013, is now used in some form by more than 70 journals (Nature Human Behaviour, 2017) (see cos.io/rr/ for an up-to-date list of participating journals). Registered Reports are designed to boost research integrity by ensuring the publication of all research results, which helps to reduce publication bias: as opposed to the traditional model of publication, where “positive” results are more likely to be published, results remain unknown at the time of the first review stage, and therefore even “negative” results are just as likely to be published. Such a process is designed to incentivize data sharing, guard against questionable practices such as selective reporting of results (via so-called “p-hacking” and “HARKing,” Hypothesizing After the Results are Known) and low statistical power, and prioritize accurate reporting over perceived impact or publication-worthiness.
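To make the division of labour between the two stages concrete, the sketch below models the workflow as a minimal state machine. This is our own schematic, not any journal's actual submission system; the stage names and boolean criteria are simplifying assumptions drawn from the description above.

```python
# A schematic of the two-stage Registered Reports workflow described above.
# Illustrative only; not any journal's actual editorial system.
from enum import Enum, auto


class Stage(Enum):
    STAGE_ONE_REVIEW = auto()        # protocol: questions + methodology only
    IN_PRINCIPLE_ACCEPTANCE = auto() # provisional acceptance, before any data
    STAGE_TWO_REVIEW = auto()        # completed manuscript
    PUBLISHED = auto()
    REJECTED = auto()


def stage_one(protocol_is_sound: bool) -> Stage:
    """Stage 1: only the study design is judged; no results exist yet,
    so they cannot influence the decision."""
    return Stage.IN_PRINCIPLE_ACCEPTANCE if protocol_is_sound else Stage.REJECTED


def stage_two(followed_registered_protocol: bool, reporting_is_accurate: bool) -> Stage:
    """Stage 2: reviewers confirm adherence to the registered protocol and
    accurate reporting."""
    if followed_registered_protocol and reporting_is_accurate:
        return Stage.PUBLISHED
    return Stage.REJECTED


# Result valence is deliberately not an argument to stage_two, so a
# "negative" result passes exactly the same checks as a "positive" one:
assert stage_two(True, True) is Stage.PUBLISHED
```

The key design point the sketch makes explicit is that whether the results are "positive" never appears as an input to the second stage, which is precisely how the format counters publication bias.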

A similar approach to that of overlay journals is being developed by PubPub (pubpub.org), which allows authors to self-publish their work. PubPub then provides a mechanism for creating overlay journals that can draw from and curate the content hosted on the platform itself. This model incorporates the preprint server and final article publishing into one contained system. EPISCIENCES is another platform that facilitates the creation of peer reviewed journals, with their content hosted on digital repositories (Berthaud et al., 2014). ScienceOpen provides editorially-managed collections of articles drawn from preprints and a combination of open access and non-open venues (e.g., scienceopen.com/collection/Science20). Editors compile articles to form a collection, write an editorial, and can invite referees to peer review the articles. This process is mediated automatically through ORCID for quality control (i.e., reviewers must have more than 5 publications associated with their ORCID profiles), and through CrossRef and Creative Commons licensing for appropriate recognition. Such collections are essentially equivalent to community-mediated overlay journals, with the difference that they also draw on additional sources beyond preprints.
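As an illustration of how an ORCID-mediated eligibility rule of this kind might be automated, the following sketch counts the works on a public ORCID record and applies the more-than-five-publications threshold described above. This is a hypothetical reconstruction, not ScienceOpen's actual implementation, and the endpoint usage reflects our understanding of the ORCID v3.0 public API.

```python
# Hypothetical sketch: checking a reviewer-eligibility rule of the kind
# described above against the ORCID public API. Not ScienceOpen's actual
# implementation; the threshold comes from the policy quoted in the text.
import requests

ORCID_PUBLIC_API = "https://pub.orcid.org/v3.0"
MIN_WORKS = 5  # "more than 5 publications", per the described policy


def count_orcid_works(orcid_id: str) -> int:
    """Return the number of distinct works on a public ORCID record."""
    resp = requests.get(
        f"{ORCID_PUBLIC_API}/{orcid_id}/works",
        headers={"Accept": "application/json"},
        timeout=10,
    )
    resp.raise_for_status()
    # ORCID bundles duplicate records of the same work into "groups",
    # so counting groups approximates counting distinct publications.
    return len(resp.json().get("group", []))


def may_review(orcid_id: str) -> bool:
    return count_orcid_works(orcid_id) > MIN_WORKS


if __name__ == "__main__":
    # 0000-0002-1825-0097 is ORCID's public example record.
    print(may_review("0000-0002-1825-0097"))
```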

2.5.1 Preprints and overlay journals. In fields such as mathematics, astrophysics, and cosmology, research communities already commonly publish their work on the arXiv platform (Larivière et al., 2014). To date, arXiv has accumulated more than one million research documents (preprints or e-prints) and currently receives more than 8,000 submissions per month, at no cost to authors. arXiv has also sparked innovation in a number of communication and validation tools within restricted communities, although these seem to be largely local and non-interoperable, and do not appear to have disrupted the traditional scholarly publishing process to any great extent (Marra, 2017). In other fields, the uptake of preprints has been relatively slower, although it is gaining momentum with the development of platforms such as bioRxiv and several newly established ones via the Center for Open Science, including engrXiv (engrXiv.org) and psyarXiv (psyarxiv.com). Social movements such as ASAPBio (asapbio.org) are helping to drive this expansion. Manuscripts submitted to these preprint servers are typically draft versions prior to formal submission to a journal for peer review, but can also be updated to include peer reviewed versions (often called post-prints). The primary motivation here is to bypass the lengthy time taken for peer review and formal publication, which means that peer review occurs only after manuscripts have been made public. However, sometimes these articles are never submitted anywhere else and form what some regard as grey literature (Luzi, 2000). Papers on digital preprint repositories are cited on a daily basis and much research builds upon them, although they may suffer from a stigma of lacking the scientific stamp of approval of peer review (Adam, 2010). Some journal policies explicitly attempt to limit their citation in peer-reviewed publications (e.g., Nature, nature.com/nature/authors/gta/#a5.4, and Cell, cell.com/cell/authors), and recently parts of the scholarly publishing sector even attempted to discredit their recognition as valuable publications (asapbio.org/faseb). In spite of this, the popularity and success of preprints is attested by their citation records: according to the Google Scholar h5-index, four of the top five venues in physics and mathematics are arXiv sub-sections (scholar.google.com/citations?view_op=top_venues&hl=en&vq=phy), and the single most highly cited venue in economics is the NBER Working Papers server (scholar.google.com/citations?view_op=top_venues&hl=en&vq=bus_economics).

LIBRE (openscholar.org.uk/libre) is a free, multidisciplinary, digital article repository for formal publication and community-based evaluation. Reviewers’ assessments, citation indices, community ratings, and usage statistics are used by LIBRE to calculate multiparametric performance metrics. At any time, authors can upload an improved version of their article or decide to send it to an academic journal. Launched in 2013, LIBRE was subsequently combined with the Self-Journal of Science (sjscience.org) under the combined heading of Open Scholar (openscholar.org.uk). One of the tools that Open Scholar offers is a peer review module for integration with institutional repositories, designed to bring research evaluation back into the hands of research communities themselves (openscholar.org.uk/open-peer-review-module-for-repositories/). Academic Karma is another new service that facilitates peer review of preprints from a range of sources (academickarma.org/).
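The sources cited here do not specify how LIBRE combined these signals, but a multiparametric metric of the kind described could in principle be a weighted aggregate like the hypothetical sketch below. The weights, the saturation-style normalization, and the half-saturation constants are entirely invented for illustration; this is not LIBRE's actual formula.

```python
# A hypothetical multiparametric performance metric combining the four
# signal types mentioned above. All constants are invented for illustration.

def saturate(count: float, half_point: float) -> float:
    """Map an open-ended count into [0, 1); half_point yields 0.5."""
    return count / (count + half_point)


def multiparametric_score(
    reviewer_assessment: float,  # mean reviewer score, assumed in [0, 1]
    citations: int,
    community_rating: float,     # mean community rating, assumed in [0, 1]
    downloads: int,
    weights: tuple = (0.4, 0.3, 0.2, 0.1),  # assumed relative importance
) -> float:
    signals = (
        reviewer_assessment,
        saturate(citations, 20.0),   # assumed: 20 citations maps to 0.5
        community_rating,
        saturate(downloads, 500.0),  # assumed: 500 downloads maps to 0.5
    )
    return sum(w * s for w, s in zip(weights, signals))


# Example: a well-reviewed article with modest citations and usage.
print(round(multiparametric_score(0.9, 12, 0.8, 300), 3))
```

The saturation normalization is one common way to keep open-ended counts such as citations from dominating bounded signals such as ratings when they are combined on one scale.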

Initiatives such as Peerage of Science (peerageofscience.org), RUBRIQ (rubriq.com), and Axios Review (axiosreview.org; closed in 2017) have implemented decoupled models of peer review. These tools operate on the same core principles as traditional peer review, except that authors submit their manuscripts to the platforms first, instead of to journals. The platforms provide the referees, either via subject-specific editors or via self-managed agreements. After the referees have provided their comments and the manuscript has been improved, the platform forwards the manuscript and the referee reports to a journal. Some journal policies accept the platform reviews as if they came from the journal’s own pool of reviewers, while others still require the journal’s handling editor to find additional reviewers. While these systems usually charge authors, the costs can sometimes be deducted from any publication fees once the article has been published. Journals accept the deduction of these costs because they benefit from receiving manuscripts that have already been assessed for journal fit and have been through a round of revisions, thereby reducing their workload. A consortium of publishers and commercial vendors recently established the Manuscript Exchange Common Approach (MECA; manuscriptexchange.org) as a form of portable review intended to cut down on inefficiency and redundancy, but it is still at too early a stage for its viability to be judged.

One proposal to transform scholarly publishing is to decouple the concept of the journal and its functions (e.g., archiving, registration, and dissemination) from peer review and the certification that this provides. Some even regard this decoupling process as the “paradigm shift” that scholarly publishing needs (Priem & Hemminger, 2012). Some publishers, journals, and platforms are now exploring more adventurous forms of peer review that occur subsequent to publication (Figure 3). Here, the principle is that all research deserves the opportunity to be published (usually pending some form of initial editorial selectivity), with filtering through peer review occurring after the actual communication of research articles (i.e., a publish-then-filter process). This is often termed “post-publication peer review,” a terminology made confusing by ambiguity over what constitutes “publication” in the digital age, by whether the process is applied to manuscripts that have already been peer reviewed or not (blogs.openaire.eu/?p=1205), and by the persistent academic view that published equals peer reviewed. Numerous venues now provide inbuilt systems for post-publication peer review, including RIO, PubPub, ScienceOpen, The Winnower, and F1000 Research. Some European Geosciences Union journals hosted on Copernicus offer a hybrid model, in which initial discussion papers receive open peer review and comments before selected papers are accepted as final publications, a process they term ‘Interactive Public Peer Review’ (publications.copernicus.org/services/public_peer_review.html). Here, review reports are posted alongside published manuscripts, with an option for reviewers to reveal their identity should they wish (Pöschl, 2012). In addition to the systems adopted by journals, other post-publication annotation and commenting services exist independently of any specific journal or publisher and operate across platforms, such as hypothes.is, PaperHive, and PubPeer.

Applying a single, blanket policy on anonymity across the entire peer review system would greatly degrade the ability of science to move forward, especially without wide flexibility to manage exceptions. The reasons to avoid one definitive policy are the inherent complexity of peer review systems, the interplay with different cultural aspects within the various sub-sectors of research, and the difficulty of identifying whether anonymous or identified work is objectively better. In a general overview of the current peer review ecosystem, Nobarany & Booth (2016) recently recommended that, due to this inherent diversity, peer review policies and support systems should remain flexible and customizable to suit the needs of different research communities. For example, some publishers allow authors to opt in to double blind review (Palus, 2015), and others could expand this to offer a menu of peer review options. We expect that, by emphasizing the differences in shared values across research communities, we will see a new diversity of OPR processes developed across disciplines in the future. Remaining ignorant of this diversity of practices, and of the inherent biases in peer review as both a social and a physical process, would be an unwise approach for future innovations.

While there are relatively few large-scale investigations of the extent and mode of bias within peer review (although see Lee et al. (2013) for an excellent overview), these studies together indicate that inherent biases are systemically embedded within the process, and must be accounted for prior to any further developments in peer review. This range of population-level investigations into attitudes towards, and applications of, anonymity, and the extent of any biases resulting from it, exposes a highly complex picture, with little consensus on its impact at a system-wide scale. However, based on these often polarised studies, it is difficult to escape the conclusion that peer review is highly subjective, rarely impartial, and certainly not as homogeneous as it is often assumed to be.

2.4.3 The impact of identification and anonymity on bias. One of the biggest criticisms levied at peer review is that, like many human endeavours, it is intrinsically biased, and not the objective and impartial process many regard it to be. Yet the question is no longer whether it is biased, but to what extent it is biased in different social dimensions, a debate which is very much ongoing (e.g., Lee et al., 2013; Rodgers, 2017; Tennant, 2017). One of the major issues is that peer review suffers from systemic confirmatory bias, with results that are deemed significant, statistically or otherwise, being preferentially selected for publication (Mahoney, 1977). This causes a distinct bias within the published research record (van Assen et al., 2014), perverting the research process itself by creating an incentive system that is almost entirely publication-oriented. Others have described such asymmetric evaluation criteria as lacking the core values of a scientific process (Bon et al., 2017).

In an ideal world, we would expect strong, honest, and constructive feedback to be well received by authors, no matter their career stage. Yet there is a very real perception that this is not always the case. Negative retaliation against referees can represent a serious case of academic misconduct (Fox, 1994; Rennie, 2003). It is important to note, however, that this is not a direct consequence of OPR, but instead a failure of the general academic system to mitigate and act against inappropriate behavior. Increased transparency can only aid in preventing and tackling the potential issues of abuse and publication misconduct, capabilities that are almost entirely absent within a closed system. The Committee on Publication Ethics (COPE) provides advice to editors and publishers on publication ethics, and on how to handle cases of research and publication misconduct, including during peer review. COPE could continue to be used as the basis for developing formal mechanisms adapted to innovative models of peer review, including those outlined in this paper. Any new OPR ecosystem could also draw on the experience accumulated by Online Dispute Resolution (ODR) researchers and practitioners over the past 20 years. ODR can be defined as “the application of information and communications technology to the prevention, management, and resolution of disputes” (Katsh & Rule, 2015), and could be implemented alongside COPE to prevent, mitigate, and deal with any potential misconduct during peer review. Author backlash, then, is highly unlikely to be deemed acceptable in the current academic system, and where it does occur, it can be dealt with through increased transparency. Furthermore, bias and retaliation exist even in double blind review processes (Baggs et al., 2008; Snodgrass, 2007; Tomkins et al., 2017), which are generally considered more conservative or protective. Such widespread identification of bias highlights it as a general issue within peer review and academia, and we should be careful not to attribute it to any particular mode or trait of peer review. This is particularly relevant for more specialized fields, where the pool of potential authors and reviewers is relatively small (Riggs, 1995). Nonetheless, careful evaluation of the existing evidence and engagement with researchers, especially those from higher-risk or marginalized communities (e.g., Rodríguez-Bravo et al. (2017)), should be a necessary and vital step prior to the implementation of any system of reviewer transparency. More training and guidance for reviewers, authors, and editors regarding their individual roles, expectations, and responsibilities would also have clear benefits here. One effort currently looking to address the training gap in peer review is the Publons Academy (publons.com/community/academy/), although this program is relatively recent and its effectiveness cannot yet be assessed.

2.4.2 The dark side of identification. The debate over signed versus unsigned reviews, independent of whether reports are ultimately published, is not to be taken lightly. Early career researchers in particular are some of the most conservative in this area, as they may fear that by signing overly critical reviews (i.e., those which investigate the research more thoroughly), they will become targets for retaliatory backlashes from more senior researchers (Rodríguez-Bravo et al., 2017). In this case, the justification for reviewer anonymity is to protect junior researchers, as well as other marginalized demographics, from bad behavior. Similarly, author anonymity could potentially save junior authors from public humiliation by more established members of the research community, should their work contain errors. These potential issues are at least part of the cause of a general attitude of conservatism, and a prominent source of resistance to OPR from the research community (e.g., Darling (2015); Godlee et al. (1998); McCormack (2009); Pontille & Torny (2014); Snodgrass (2007); van Rooyen et al. (1998)). However, it is not immediately clear how this widely proclaimed, but poorly documented, potential abuse of signed reviews differs from what would occur in a closed system anyway, as anonymity itself provides a potential mechanism for referee abuse. Indeed, the tone of discussions on platforms where anonymity or pseudonymity is allowed, such as Reddit or PubPeer, is generally problematic, with the latter even being referred to as facilitating “vigilante science” (Blatt, 2015). That most backlashes would occur external to the peer review itself, and indeed in private, is probably the main reason why such abuse has not been widely documented. It can also be argued, however, that reviewing with the prior knowledge of open identification prevents such backlashes, since researchers do not want to tarnish their reputations in a public forum. Under these circumstances, openness becomes a means of holding both referees and authors accountable for their public discourse, as well as of making editors’ decisions on referee selection and publication public. Either way, there is little documented evidence that such retaliations actually occur either commonly or systematically. If they did, then publishers that employ this model, such as Frontiers and BioMed Central, would be under serious question, instead of thriving as they are.

2.4.1 Reviewing the evidence. Reviewer anonymity can be difficult to protect, as there are ways in which identities can be revealed, albeit non-maliciously: for example, through language and phrasing, prior knowledge of the research and the specific angle being taken, previous presentation at a conference, or even simple Web-based searches. Baggs et al. (2008) investigated the beliefs and preferences of reviewers about blinding. Their results showed that double blinding was preferred by 94% of reviewers, although some identified advantages to an un-blinded process. When author names were blinded, 62% of reviewers could not identify the authors, while 17% could identify the authors ≤10% of the time. Walsh et al. (2000) conducted a survey in which 76% of reviewers agreed to sign their reviews. In this case, signed reviews were of higher quality, were more courteous, and took longer to complete than unsigned reviews. Reviewers who signed were also more likely to recommend publication. In one study from the reviewers’ perspective, Snell & Spencer (2005) found that reviewers would be willing to sign their reviews and felt that the process should be transparent. Yet a similar study by Melero & Lopez-Santovena (2001) found that 75% of surveyed respondents were in favor of reviewer anonymity, while only 17% were against it.

While there is much potential value in anonymity, the corollary is also problematic: anonymity can lead to reviewers being more aggressive, biased, negligent, orthodox, entitled, and politicized in their language and evaluation, as they have no fear of negative consequences for their actions other than from the editor (Lee et al., 2013; Weicher, 2008). In theory, anonymous reviewers are protected from potential backlashes for expressing themselves fully, and are therefore more likely to be honest in their assessments. However, some evidence suggests that single-blind peer review has a detrimental impact on new authors and strengthens the harmful effects of ingroup-outgroup behaviours (Seeber & Bacchelli, 2017). Furthermore, by protecting referees’ identities, journals lose part of the prestige, quality, and validation of the review process, leaving researchers to guess at or assume these qualities post-publication. The transparency associated with signed peer review aims to avoid the competition and conflicts of interest that can arise for any number of financial and non-financial reasons, and because referees are often the authors’ closest competitors, as they will naturally tend to be the most competent to assess the research (Campanario, 1998a; Campanario, 1998b). There is additional evidence to suggest that double blind review can increase the acceptance rate of women-authored articles in the published literature (Darling, 2015).

There are different levels of bi-directional anonymity throughout the peer review process, including whether the referees know who the authors are but not vice versa (single blind; the most common (Ware, 2008)), or whether both parties remain anonymous to each other (double blind) (Table 1). Double blind review is based on the idea that peer evaluations should be impartial and based on the research, not ad hominem, but there has been considerable discussion over whether reviewer identities should remain anonymous (e.g., Baggs et al. (2008); Pontille & Torny (2014); Snodgrass (2007)) (Figure 3). Models such as triple-blind peer review go a step further, with authors and their affiliations anonymous to the handling editor as well as to the reviewers. This attempts to nullify the effects of a researcher’s reputation, institution, or location on the peer review process, and is employed at the open access journal Science Matters (sciencematters.io), launched in early 2016.

Unresolved issues with posting review reports include whether it should be done for manuscripts that are ultimately not published, and the impact of identifying reviewers alongside their reports or keeping them anonymous. Furthermore, the actual readership and usage of published reports remain ambiguous in a world where researchers are typically already inundated with published articles to read. The benefits of publishing reports might not be seen until well after the initial publication and, therefore, their immediate value might be difficult to convey and measure in current research environments. Finally, different populations of reviewers with different cultural norms and identities will undoubtedly have varying perspectives on this issue, and it is unlikely that any single policy on posting referee reports will ever be widely adopted. Further investigation of the link between making reviews public and the impact this has on their quality would be a fruitful area of research, and could potentially encourage increased adoption of this practice.

When BioMed Central launched in 2000, it quickly recognized the value of including both the reviewers’ names and the pre-publication peer review history alongside published manuscripts in its medical journals, in order to increase the quality and value of the process. Since then, further reflections on OPR (Godlee, 2002) have led to the adoption of a variety of new models. For example, the Frontiers series now publishes all referee names alongside articles; EMBO journals publish a review process file with their articles, with referees remaining anonymous but editors being named; and in 2009 PLOS added public commenting features to the articles it publishes. More recently launched journals such as PeerJ operate a system in which both the reviews and the names of the referees can optionally be made public, and journals such as Nature Communications and the European Journal of Neuroscience have also started to adopt this method.

Publishing peer review reports appears to have little or no impact on the overall process, but may encourage more civility from referees. In a small survey, Nicholson & Alperin (2016) found that approximately 75% of respondents (n=79) perceived that public peer review would change the tone or content of reviews, and 80% of responses indicated that performing peer reviews that would eventually be made public would not require significantly more work. However, the responses also indicated that incentives are needed for referees to engage in this form of peer review, including recognition by performance review or tenure committees (27%), peers publishing their reviews (26%), being paid in some way, such as with an honorarium or a waived APC (24%), and receiving positive feedback on reviews from journal editors (16%). Only 3% (one response) indicated that nothing could motivate them to participate in an open peer review of this kind. Leek et al. (2011) showed that when referees’ comments were made public, significantly more cooperative interactions were formed and the risk of incorrect comments decreased, suggesting that prior knowledge of publication encourages referees to be more constructive and careful with their reviews. Moreover, referees and authors who participated in cooperative interactions had a reviewing accuracy rate that was 11% higher. On the other hand, the possibility of publishing reviews online has also been associated with a high decline rate among potential peer reviewers and an increase in the time taken to write a review, but with a variable effect on review quality (Almquist et al., 2017; van Rooyen et al., 2010). This suggests that the barriers to publishing review reports are inherently social rather than technical.

Without the verification enabled by publishing referee reports, assessments of research articles can never be evidence-based; yet peer-reviewed status is still almost ubiquitously regarded as an authoritative, and uniform, stamp of quality. The issue here is that the attainment of peer-reviewed status will always rest on an undefined, and only ever relative, quality threshold, due to the opacity of the process. This is in itself a rather unscientific practice: researchers rely almost entirely on heuristics and on trust in a concealed process and the intrinsic reputation of the journal, rather than on anything verifiable. This can ultimately result in what Kelty et al. (2008) term the "Fallacy of Misplaced Finality": the assumption that research has a single, final form, to which everyone applies different criteria of quality.

In a study of two journals, one that published its reports and one that did not, Bornmann et al. (2012) found that the published comments were much longer. Published reviews were also more likely to result in a constructive dialogue between the author, reviewers, and wider community, and might therefore be better at improving the content of a manuscript. Unpublished reviews, by contrast, tended to serve more of a selective function, determining whether a manuscript was appropriate for a particular journal (i.e., focusing on the editorial process). Depending on the journal, then, different types of peer review could be better suited to different functions, and optimized accordingly. Transparency of the peer review process can also be used as an indicator of peer review quality, and could therefore serve as a tool to predict quality in new journals whose peer review model is known, if desired (Godlee, 2002; Morrison, 2006; Wicherts, 2016). Journals with higher transparency ratings were less likely to accept flawed papers and showed higher impact as measured by Google Scholar's h5-index (Wicherts, 2016).

The rationale for publishing referee reports is to provide increased context and transparency in the peer review process, and this can occur irrespective of whether reviewers reveal their identities. Valuable insights are often shared in reviews that would otherwise remain hidden if not published. By publishing reports, peer review has the potential to become a supportive and collaborative process, viewed more as an ongoing dialogue between groups of scientists progressively assessing the quality of research. Furthermore, the reviews themselves are opened up for analysis and inspection, including how authors respond to them, adding an additional layer of quality control and a means of accountability and verification. There are additional educational benefits to publishing peer reviews, such as their use in referee training or in journal clubs. Given the inconclusive evidence regarding the training of referees (Galipeau et al., 2015; Jefferson et al., 2007), such practices might be further useful in highlighting our knowledge and skills gaps. At present, some publisher policies are extremely vague about the re-use rights and ownership of peer review reports (Schiermeier, 2017). The Peer Review Evaluation (PRE) service (www.pre-val.org) was designed to bring some transparency to peer review by providing information about the process itself without exposing the reports (e.g., mode of peer review, number of referees, rounds of review). While it describes itself as a service to identify fraud and maintain the integrity of peer review, it remains unclear whether it has achieved these objectives, in light of the ongoing criticisms of the conventional process.

Whether such initiatives will be successful remains to be seen. Publons was recently acquired by Clarivate Analytics, suggesting that the process could become commercialized as this domain rapidly evolves (Van Noorden, 2017). In spite of this, the outcome most likely depends on whether funding agencies and those in charge of tenure, hiring, and promotion will use peer review activities to help evaluate candidates, which in turn depends on whether research communities themselves choose to embrace such crediting or accounting systems for peer review.

The Publons platform provides a semi-automated mechanism to formally recognize the role of editors and referees, who can receive due credit for their work both pre- and post-publication. Researchers can also choose to publish their full reports, depending on publisher and journal policies. Publons also provides a ranking of the quality of the reviewed research article, and users can endorse, follow, and recommend reviews. Other platforms, such as F1000 Research and ScienceOpen, link post-publication peer review activities with CrossRef DOIs and open licenses to make them more citable, essentially treating them as equivalent to a normal open access research paper. ORCID (Open Researcher and Contributor ID) provides a stable means of integrating these platforms with persistent researcher identifiers, so that reviewers receive due credit for their work. ORCID is rapidly becoming part of the critical infrastructure for OPR and for broader shifts towards open scholarship (Dappert et al., 2017). Exposing peer reviews through these platforms links accountability to receiving credit. They therefore offer possible solutions to the dual issues of rigor and reward, while potentially ameliorating the growing threat of reviewer fatigue caused by increasing demands on researchers external to the peer review system (Fox et al., 2017; Kovanis et al., 2016).
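As a concrete, necessarily hedged illustration of this kind of integration: ORCID's public REST API exposes a peer-reviews section on each researcher record. The sketch below assumes the public v3.0 endpoint and JSON content negotiation; the ORCID iD used is ORCID's own documentation example (a fictitious researcher), and the response structure is only summarized.

```python
# Minimal sketch: fetch the public peer-review activities attached to an
# ORCID record, assuming the public ORCID v3.0 REST API and JSON output.
import requests

ORCID_ID = "0000-0002-1825-0097"  # ORCID's documentation example (fictitious)
url = f"https://pub.orcid.org/v3.0/{ORCID_ID}/peer-reviews"

resp = requests.get(url, headers={"Accept": "application/json"})
resp.raise_for_status()
data = resp.json()

# Each group bundles review activities for one convening organization
# (e.g., a journal or funder); only the count is reported in this sketch.
groups = data.get("group", [])
print(f"{len(groups)} peer-review group(s) on this record")
```

Platforms can write review activity into this section of a record, which is what makes a review creditable and verifiable without necessarily exposing the report itself.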

2.2.3 Progress in crediting peer review. Any acknowledgement model that credits reviewers raises the obvious question of how to facilitate it within an anonymous peer review system. Incentivizing peer review can alleviate much of its potential burden by widening the referee pool in step with the growth in review requests. It can also help to diversify the process and inject transparency into peer review, a solution that is especially appealing given that a small minority of researchers often performs the vast majority of peer reviews (Fox et al., 2017; Gropp et al., 2017); in biomedical research, for example, only 20 percent of researchers perform 70–95 percent of the reviews (Kovanis et al., 2016). In 2014, a working group on peer review services (CASRAI) was established to "develop recommendations for data fields, descriptors, persistence, resolution, and citation, and describe options for linking peer-review activities with a person identifier such as ORCID" (Paglione & Lawrence, 2015). The idea is that standardizing the description of peer review activities makes them easier to attribute, and therefore to recognize and reward.
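The practical upshot of such standardization is easiest to see in miniature. The following is a hypothetical review-activity record of the general sort these recommendations envisage; every field name here is an illustrative invention for this sketch, not the actual CASRAI vocabulary.

```python
# Hypothetical, illustrative review-activity record: field names are
# invented for this sketch and are not the actual CASRAI vocabulary.
review_activity = {
    "reviewer_orcid": "0000-0002-1825-0097",  # persistent person identifier
    "subject_doi": "10.1234/example.5678",    # hypothetical reviewed item
    "convening_organization": "Journal of Examples",
    "role": "reviewer",                       # vs. editor, chair, ...
    "completion_date": "2017-05-01",
    "review_url": None,                       # populated if the report is public
}

# A standardized record like this can be attributed, aggregated, and cited,
# which is what makes recognition and reward tractable across platforms.
```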

2.2.2 Increasing demand for recognition. These traditional approaches to credit fall short of any systematic feedback or recognition, such as that granted through publications. A change here is clearly required, given the wealth of currently unrewarded time and effort that academics give to peer review. A recent survey of nearly 3,000 peer reviewers by the large publisher Wiley showed that feedback and acknowledgement for work as referees are valued far above either cash reimbursements or payment in kind (Warne, 2016) (although Mulligan et al. (2013) found that referees would prefer compensation by way of free subscriptions, or the waiver of colour or other publication charges). Wiley's survey reports that 80% of researchers agree that there is insufficient recognition for peer review as a valuable research activity, and that researchers would commit more time to peer review if it became a formally recognized activity for assessment, funding opportunities, and promotion (Warne, 2016). While this may be true, it is important to note that commercial publishers have a vested interest in retaining the current, freely provided service of peer review, since this is what provides their journals with their main stamp of legitimacy and quality ("added value"). One of the root causes of the lack of appropriate recognition and incentivization is therefore that publishers have strong motivations to find non-monetary forms of reviewer recognition. Indeed, the business model of almost every scholarly publisher is predicated on free work by peer reviewers, and it is unlikely that the present system would function financially with market-rate reimbursement for reviewers. Other research shows a similar picture: approximately 70% of respondents to a small survey by Nicholson & Alperin (2016) indicated that they would list peer review as a professional service on their curriculum vitae, and 27% mentioned formal recognition in assessment as a factor that would motivate them to participate in public peer review. These numbers indicate that the lack of credit referees receive is likely a strong contributing factor to the perceived stagnation of traditional models. Furthermore, acceptance rates are lower in humanities and social sciences journals and higher in physical sciences and engineering journals (Ware, 2008), and there are also differences based on relative referee seniority (Casnici et al., 2017). This means there are distinct disciplinary variations in the number of reviews a researcher performs relative to their publications, and suggests scope for using this either to provide different incentive structures or to increase acceptance rates and thereby decrease referee fatigue (Fox et al., 2017; Lyman, 2013).

2.2.1 Traditional methods of recognition. One current way to recognize peer reviewers is to thank anonymous referees in the Acknowledgements sections of published papers. In these cases, the referees receive no public recognition for their work unless they explicitly agree to sign their reviews. Generally, journals do not provide any remuneration or compensation for these services; notable exceptions are the UK-based publisher Veruscript (veruscript.com/about/who-we-are) and Collabra (collabra.org/about/our-model), published by University of California Press, as well as payments made to most statistical referees (Barbour, 2017). Other journals provide reward incentives to reviewers, such as free subscriptions or discounts on author-facing open access fees. Another common form of acknowledgement is a private thank-you note from the journal or editor, which usually takes the form of an automated email upon completion of the review. In addition, journals often list and thank all reviewers in a special issue or on their website once a year, providing another way to recognize reviewers. Some journals even offer annual prizes to reward exceptional referee activity (e.g., the Journal of Clinical Epidemiology; www.jclinepi.com/article/S0895-4356(16)30707-7/fulltext). Another idea that journals and publishers have tried is listing their best reviewers (e.g., by Vines (2015a) for Molecular Ecology), or, following a suggestion by Pullum (1984), naming referees who recommend acceptance in the article colophon (a single blind version of this recommendation was adopted by Digital Medievalist from 2005–2016; see Wikipedia contributors, 2017, and bit.ly/DigitalMedievalistArchive for examples preserved in the Internet Archive; Digital Medievalist stopped using this model and removed the colophon as part of its move to the Open Library of Humanities; cf. journal.digitalmedievalist.org). Authors can then integrate this into their scholarly profiles in order to differentiate themselves from other researchers or referees. Currently, peer review is poorly acknowledged by practically all research assessment bodies, institutions, granting agencies, and publishers in the process of professional advancement or evaluation. Instead, it is viewed as expected or normal behaviour for all researchers to contribute in some form to peer review.

Some of these potential solutions could, however, reduce the quality of peer review and therefore affect the overall quality of published research. Paradoxically, while the Web empowers us to communicate information virtually instantaneously, the turnaround time for peer reviewed publications remains quite long by comparison. One potential solution is to encourage referees by providing additional recognition and credit for their work. The present lack of bona fide incentives for referees is perhaps one of the main factors responsible for indifference to editorial outcomes, which ultimately leads to the increased proliferation of low-quality research (D'Andrea & O'Dwyer, 2017; Jefferson et al., 2007; Wang et al., 2016).

The vast majority of researchers see peer review as an integral and fundamental part of their work (Mulligan et al., 2013). They often consider it part of an altruistic cultural duty, or a quid pro quo service, closely associated with the identity of being part of their research community. An invitation to review a research article can be perceived as a great honor, especially for junior researchers, because it recognizes expertise, i.e., the attainment of the level of a peer. However, the current system faces new challenges as the number of published papers continues to increase rapidly (Albert et al., 2016), with more than one million articles published in peer reviewed, English-language journals every year (Larsen & Von Ins, 2010). Some estimates are as high as 2–2.5 million per year (Plume & van Weijen, 2014), and this number is expected to double approximately every nine years at current rates (Bornmann & Mutz, 2015). Several potential solutions exist to make sure that the review process does not cause a bottleneck in the current system.

How and where we inject transparency has implications for the magnitude of transformation required and, therefore, the general concept of OPR is highly heterogeneous in meaning, scope, and consequences. A recent survey by OpenAIRE found 122 different definitions of OPR in use, exemplifying the extent of this issue. This diversity was distilled into a single proposed definition comprising seven different traits of OPR: participation, identity, reports, interaction, platforms, pre-review manuscripts, and final-version commenting (Ross-Hellauer, 2017). The various parts of the "revolutionary" phase of peer review undoubtedly have different combinations of these OPR traits, and it remains a very heterogeneous landscape. Table 3 provides an overview of the advantages and disadvantages of the different approaches to anonymity and openness in peer review.

However, the context of this transparency, and the implications of different modes of transparency at different stages of the review process, are rarely explored. Progress towards transparency has been variable and generally slow across the publishing system. Engagement with experimental open models is still far from common, perhaps in part due to a lack of rigorous evaluation and empirical demonstration that they are more effective. One consequence is the entrenchment of the ubiquitously practiced and much more favored traditional model (which, as noted above, is itself diverse), even though, as its history shows, this model is not as traditional as commonly assumed, yet is nonetheless held in high regard. Practices such as self-publishing and predatory or deceptive publishing cast a shadow of doubt on the validity of research posted openly online under these models, including research carrying traditional scholarly imprints (Fitzpatrick, 2011a; Tennant et al., 2016). The inertia hindering widespread adoption of new models of peer review is often termed "cultural inertia", and affects many aspects of scholarly research. Cultural inertia, the tendency of communities to cling to a traditional trajectory, is shaped by a complex ecosystem of individuals and groups. These often have highly polarized motivations (i.e., capitalistic commercialism versus knowledge generation versus careerism versus output measurement), and operate within an academic hierarchy whose power dynamics can suppress innovative practices (Burris, 2004; Magee & Galinsky, 2008).

Novel ideas about "Open Peer Review" (OPR) systems are rapidly emerging, and innovation has been accelerating over the last several years (Figure 2; Table 3). The advent of OPR is complex, as the term can refer to multiple different parts of the process and is often used interchangeably or conflated without appropriate prior definition. Currently, there is no formally established definition of OPR accepted by the scholarly research and publishing community (Ford, 2013). The simplest definitions, by McCormack (2009) and Mulligan et al. (2008), presented OPR as a process that does not attempt "to mask the identity of authors or reviewers" (McCormack, 2009, p.63), thereby explicitly referring to openness in terms of personal identification or anonymity. Ware (2011, p.25) expanded on reviewer disclosure practices: "Open peer review can mean the opposite of double blind, in which authors' and reviewers' identities are both known to each other (and sometimes publicly disclosed), but discussion is complicated by the fact that it is also used to describe other approaches such as where the reviewers remain anonymous but their reports are published." Other authors define OPR differently, for example by including the publication of all dialogue during the process (Shotton, 2012), or by running it as a publicly participative commentary (Greaves et al., 2006).

Suggestions for modifying peer review range from fairly incremental, small-scale changes to an almost total and radical transformation of the present system. A core question is how to transform traditional peer review into a process aligned with the latest advances in what is now widely termed "open science". This is tied to broader developments in how we as a society communicate, thanks to the inherent capacity of the Web for open, collaborative, and social communication. Many of the suggestions and new models for opening up peer review are geared towards increasing transparency at different levels and, ultimately, the reliability, efficiency, and accountability of the publishing process. These traits are desired by all actors in the system, and increasing transparency moves peer review towards a more open model.

The recent diversification of peer review is intrinsically coupled with wider developments in scholarly publishing. In terms of the gate-keeping function of peer review, innovation is noticeable in some digital-only, or "born open," journals, such as PLOS ONE and PeerJ. These explicitly ask referees to set aside any notion of novelty, significance, or impact when assessing a manuscript before it becomes accessible to the research community, and instead to focus on whether the research was conducted properly and whether the conclusions are based on the presented results. This arguably more objective method has met some resistance, even receiving the somewhat derogatory label "peer review lite" from some corners of the scholarly publishing industry (Pinfield, 2016). Such a sentiment can be viewed as a hangover from the commercial age of non-digital publishing, and now seems superfluous and discordant with any modern Web-based model of scholarly communication. Indeed, when PLOS ONE started publishing in 2006, it initiated the phenomenon of open access "mega journals", with publishing criteria distinct from those of traditional journals (i.e., broad scope, large size, objective peer review); these have since become incredibly successful ventures (Wakeling et al., 2016). Some even view the emphasis on novelty in publishing as having counter-productive effects on scientific progress and the organization of scientific communities (Cohen, 2017), and journals based on the model of PLOS ONE represent one solution to this. The relative timing of peer review to publication is a further major innovation, with journals such as F1000 Research publishing prior to any formal peer review, the process then occurring continuously, with articles updated iteratively. Some of the advantages and disadvantages of these different variations of peer review are explored in Table 2.

Over time, three principal forms of journal peer review have evolved: single blind, double blind, and open (Table 1). Of these, single blind, where reviewers are anonymous but authors are not, is the most widely used in most disciplines, because the process is considered more impartial, and comparably less onerous and less expensive to operate than the alternatives. Double blind peer review, where both authors and reviewers are reciprocally anonymous, requires considerable effort to remove all traces of the author's identity from the manuscript under review (Blank, 1991). For a detailed comparison of double versus single blind review, Snodgrass (2007) provides an excellent summary. The advent of "open peer review" introduced substantial additional complexity into the discussion (Ross-Hellauer, 2017).

3 Potential future models

As we have discussed in detail above, there has been considerable innovation in peer review in the last decade, leading to widespread critical examination of the process and of scholarly publishing as a whole (e.g., Kriegeskorte et al., 2012). Much of this has been driven by the advent of Web 2.0 technologies and new social media platforms, and an overall shift towards a more open system of scholarly communication. Previous work in this arena has described features of a Reddit-like model, combined with additional personalized features of other social platforms, such as Stack Exchange, Netflix, and Amazon (Yarkoni, 2012). Here, we develop this further by considering additional traits of models such as Wikipedia, GitHub, and Blockchain, and discuss these in the context of the rapidly evolving socio-technological environment for the present system of peer review. In the following section, we discuss potential future peer review platforms and processes in terms of three major traits, which any future innovation would benefit from considering:

1. Quality control and moderation, possibly through openness and transparency;

2. Certification via personalized reputation or performance metrics;

3. Incentive structures to motivate and encourage engagement.

While discussing a number of principles that should guide the implementation of novel platforms for evaluating scientific work, Yarkoni (2012) argued that many of the problems researchers face have already been successfully addressed by a range of non-research focused social Web applications. Therefore, developing next-generation platforms for scientific evaluation should focus on adapting the best of these existing approaches rather than on innovating entirely new ones (Neylon & Wu, 2009; Priem & Hemminger, 2010; Yarkoni, 2012). One important element that will determine the success or failure of any such peer-to-peer reputation or evaluation system is a critical mass of researcher uptake. This has to be carefully balanced with the demands and uptake of restricted scholarly communities, which have inherently different motivations and practices in peer review. A remaining issue is the aforementioned cultural inertia, which can lead to low adoption of anything innovative or disruptive to traditional workflows in research. This is a perfectly natural trait for communities, where ideas outpace technological innovation, which in turn outpaces the development of social norms. Hence, rather than proposing an entirely new platform or model of peer review, our approach here is to consider the advantages and disadvantages of existing models and innovations in social services and technologies (Table 4). We then explore ways in which such traits can be adapted, combined, and applied to build a more effective and efficient peer review system, while potentially reducing friction to its uptake.

Table 4.

| Feature | Description | Pros | Cons/Risks | Existing models |
| --- | --- | --- | --- | --- |
| Voting or rating | Quantified review evaluation (5 stars, points), including up- and down-votes | Community-driven, quality filter, simple and efficient | Randomized procedure, auto-promotion, gaming, popularity bias, non-static | Reddit, Stack Exchange, Amazon |
| Openness | Public visibility of review content | Responsibility, accountability, context, higher quality | Peer pressure, potential lower quality, invites retaliation | All |
| Reputation | Reviewer evaluation and ranking (points, review statistics) | Quality filter, reward, motivation | Imbalance based on user status, encourages gaming, platform-specific | Stack Exchange, GitHub, Amazon |
| Public commenting | Visible comments on paper/review | Living/organic paper, community involvement, progressive, inclusive | Prone to harassment, time consuming, non-interoperable, low re-use | Reddit, Stack Exchange, Hypothesis |
| Version control | Managed releases and configurations | Living/organic objects, verifiable, progressive, well-organized | Citation tracking, time consuming, low trust of content | GitHub, Wikipedia |
| Incentivization | Encouragement to engage with platform and process via badges/money or recognition | Motivation, return on investment | Research monetization, can be perverted by greed, expensive | Stack Exchange, Blockchain |
| Authentication and certification | Filtering of contributors via verification process | Fraud control, author protection, stability | Difficult to manage | Blockchain |
| Moderation | Filtering of inappropriate behavior in comments, rating | Community-driven, quality filter | Censorship, mainstream speech | Reddit, Stack Exchange |

3.1 A Reddit-based model. Reddit (reddit.com) is an open-source, community-based platform where users submit comments and original or linked content, organized into thematic lists of subreddits. As Yarkoni (2012) noted, a thematic list of subreddits could be automatically generated for any peer review platform using keyword metadata from sources like the National Library of Medicine's Medical Subject Headings (MeSH). Members, or redditors, can upvote or downvote any submission based on quality and relevance, and publicly comment on all shared content. Individuals can subscribe to contribution lists, and articles can be organized by time (newest to oldest) or level of engagement. Quality control is exercised through moderation: subreddit mods can filter and remove inappropriate comments and links. Each link and comment receives a score, the sum of its upvotes minus its downvotes, providing an overall ranking system. On Reddit, highly scoring submissions are relatively ephemeral: an automatic down-voting algorithm shifts them further down lists as new content is added, typically within 24 hours of initial posting.

3.1.1 Reddit as an existing "journal" of science. The subreddit for Science (reddit.com/r/science) is a highly moderated discussion channel, curated by at least 600 professional researchers and with more than 15 million subscribers at the time of writing. The forum has even been described as "The world's largest 2-way dialogue between scientists and the public" (Owens, 2014). Contributors can add "flair" (a user-assigned tagging and filtering system) to their posts as a way of organizing them thematically by research discipline, analogous to the container function of a typical journal. Individuals can also have flair as a form of subject-specific credibility (i.e., a peer status), upon provision of proof of education in their topic. Public contributions from peers are subsequently stamped with a status and area of expertise, such as "Grad student|Earth Sciences." Scientists already engage further with Reddit through science AMAs (Ask Me Anythings), which tend to be quite popular. However, the level of discourse here is generally not as deep as that expected of peer review, and is more akin to a form of science communication or public engagement with research. Reddit thus has the potential to drive enormous amounts of traffic to primary research; there is even a phenomenon known as the "Reddit hug of death", whereby servers become overloaded and crash due to Reddit-based traffic. The /r/science subreddit is viewed as a venue for "scientists and lay audiences to openly discuss scientific ideas in a civilized and educational manner", according to its organizer, Dr. Nathan Allen (Lee, 2015). As such, an additional appeal of this model is that it could increase public scientific literacy and understanding.

3.1.2 Reddit-style peer evaluation. The essential feature of any Reddit-style model with potential parallels to peer review is that links to scientific research can be shared, commented on, and ranked (upvoted or downvoted) by the community. All links or texts can be publicly discussed in terms of methods, context, and implications, similar to any scholarly post-publication commenting system. Such a process for peer review could essentially operate as an additional layer on top of a preprint archive or repository, much like a social version of an overlay journal. A minimal sketch of the time-decayed voting mechanics underlying such a model is given below.
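The sketch below computes a net vote score and applies a simple exponential time decay, so that older submissions sink as new content arrives. The decay formula and the 24-hour half-life are assumptions made for this illustration only; Reddit's actual ranking algorithm is more complex.

```python
from datetime import datetime, timezone

def decayed_score(upvotes: int, downvotes: int,
                  posted_at: datetime,
                  half_life_hours: float = 24.0) -> float:
    """Net votes (upvotes minus downvotes), exponentially decayed with age.

    A toy stand-in for Reddit-style ranking: the 24-hour half-life echoes
    the observation above that submissions fade within about a day, but
    the formula is an illustrative assumption, not Reddit's algorithm.
    posted_at must be timezone-aware.
    """
    net = upvotes - downvotes
    age_hours = (datetime.now(timezone.utc) - posted_at).total_seconds() / 3600
    return net * 0.5 ** (age_hours / half_life_hours)

# Submissions would then be ordered by this score, so that well-received
# new content surfaces while older threads settle downward.
```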
Ultimately, a public commenting system like this could achieve the same depth of peer evaluation as the formal process, but as a crowd-sourced one. It is important to note, however, that this is a mode of instantaneous publication prior to peer review, with filtering through interaction occurring post-publication. Furthermore, comments can receive similar treatment to submitted content, in that they can be upvoted, downvoted, and further commented upon in a cascading process. An advantage of this is that multiple comment threads can form on a single post, and viewers can track individual discussions. The highest-ranked comments could simply be presented at the top of the thread, with the lowest-ranked at the bottom. In theory, a subreddit could be created for any sub-topic within research, and a simple nested hierarchical taxonomy could make this as precise or as broad as warranted by individual communities. Reddit allows any user to create their own subreddit, pending certain status achievements through platform engagement. In addition, this could be moderated externally through ORCID, where a set number of published items in an ORCID profile would be required for an individual to perform a peer review, or, in this case, to create a new subreddit. Connection to an academic profile such as ORCID further allows community validation, verification, and judgement of importance. For example, being able to see whether senior figures in a given field have read or upvoted certain threads can be highly influential in decisions to engage with that thread, and vice versa. A very similar process already occurs at the Self Journal of Science (sjscience.org/), where contributors vote either "This article has reached scientific standards" or "This article still needs revisions", with public disclosure of who has voted in either direction. Threaded commenting could also be implemented, as it is vital to the success of any collaborative filtering platform and provides a highly efficient corrective mechanism. Peer evaluation in this form emphasizes progress and research as a discourse, over piecemeal publications or objects as part of a lengthier process. Such a system could be applied to other forms of scientific work, including code, data, and images, thereby allowing contributors to claim credit for their full range of research outputs. Comments could be signed by default, pseudonymous, or anonymized until a contributor chooses to reveal their identity; if required, anonymized comments could be filtered out automatically by users. Key to this would be peer identity verification, which could be done at the back-end via email or integrated via ORCID.

3.1.3 Translating engagement into prestige. Reddit karma points are awarded for sharing links and comments, and for having these upvoted or downvoted by other registered members. The simplest implementation of such a voting system for peer review would be through single-click interaction with any article in the database. This form of field-specific social recommendation simultaneously creates both a filter and a structured feed, similar to Facebook and Google+, and can easily be automated. Contributions thereby receive a rating, and these ratings accumulate into a peer-based reputation that could be translated into a quantified level of community-granted prestige. Ratings are transparent, and contributions and their ratings can be viewed on a public profile page. A toy sketch of how votes might accrue to a reviewer's public karma follows.
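The following toy ledger shows how single-click votes might accrue to a contributor's public karma. The one-point-per-vote weighting and all identifiers are invented for this sketch, not a proposal for actual point values.

```python
from collections import defaultdict

contributions: dict[str, str] = {}               # contribution id -> author id
karma: defaultdict[str, int] = defaultdict(int)  # author id -> karma points

def post(contribution_id: str, author: str) -> None:
    """Register a shared link, review, or comment under its author."""
    contributions[contribution_id] = author

def vote(contribution_id: str, up: bool) -> None:
    """One point per vote is an assumption for this sketch only."""
    karma[contributions[contribution_id]] += 1 if up else -1

post("preprint-42-review-1", "reviewer-a")  # invented identifiers
vote("preprint-42-review-1", up=True)
vote("preprint-42-review-1", up=True)
vote("preprint-42-review-1", up=False)
print(karma["reviewer-a"])  # 1; displayable on a public profile page
```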
More sophisticated approaches could include graded ratings (e.g., five-point responses, like those used by Amazon) or separate rating dimensions providing peers with an immediate snapshot of the strengths and weaknesses of each article. Such a system is already in place at ScienceOpen, where referees evaluate an article on each of importance, validity, completeness, and comprehensibility using a five-star system. For any given set of articles retrieved from the database, a ranking algorithm could dynamically order articles on the basis of a combination of quality (an article's aggregate rating within the system, as at Stack Exchange), relevance (using a recommendation system akin to Amazon's), and recency (newly added articles could receive a boost); a sketch of such a combined ranking is given at the end of this section. By default, the same algorithm would be applied for all peers, as on Reddit. The issue here is making any such karma points commensurate with the effort required to obtain them, and ensuring that they are valued by the broader research community and assessment bodies. This could be facilitated through a simple badge incentive system, such as that designed by the Center for Open Science for core open practices (cos.io/our-services/open-science-badges/).

3.1.4 Can the wisdom of crowds work with peer review? One might consider a Reddit-style model as pitching quantity against quality. Typically, comments provided on Reddit do not match the depth and rigor we would expect from traditional peer review; there is more to research evaluation than simply upvoting or downvoting. Furthermore, the range of expertise is highly variable, due to the inclusion of specialists and non-specialists as equals ("peers") within a single thread. However, there is no reason why a user prestige system akin to Reddit flair could not be used to differentiate varying levels of expertise. The primary advantage is that the number of participants is uncapped, emphasizing the potential Reddit has for scaling up participation in peer review. With a Reddit model, we must trust that sheer numbers will be sufficient to provide an optimal assessment of any given contribution, and that such assessment will ultimately converge on a consensus of high-quality and reusable results. Social review of this sort must therefore consider at what point the review process should be constrained in order to produce such a consensus, and how to ensure that the consensus reflects accuracy rather than merely self-selective engagement. Kelty et al. (2008) term this the "Principle of Multiple Magnifications": in spite of self-selectivity, more reviewers and more data about them will always be better than fewer reviewers and less data. The additional challenge, then, will be to capture and archive consensus points for external re-use. Journals such as F1000 Research already have such a tagging system in place, where reviewers can mark a submission as approved after successive peer review iterations. "The rich get richer" is one potential failure mode of this style of system: content from more prominent researchers may receive relatively more comments, ratings, and ultimately hype, as in any hierarchical system, including traditional scholarly publishing. Research from unknown authors may go relatively unnoticed and under-used, but will at least have been publicized.
One solution to this is a core community of editors, drawing on the model of the r/science subreddit's moderators. Such editors could be empowered to invite peers to contribute to discussion threads, essentially wielding the same executive power as a journal editor combined with that of a forum moderator. Recent evidence suggests that such intelligent crowd reviewing has the potential to be an efficient and high-quality process (List, 2017).
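To make the dynamic ordering sketched in 3.1.3 concrete, a combined quality/relevance/recency ranking might look as follows. The equal weights, the rating normalization, and the 1/(1 + age) recency boost are arbitrary assumptions for illustration, not a tuned algorithm.

```python
from dataclasses import dataclass

@dataclass
class Article:
    mean_rating: float  # aggregate quality, e.g. 0-5 stars
    relevance: float    # 0-1, assumed supplied by a recommender system
    age_days: float

def rank_key(a: Article,
             w_quality: float = 1.0,
             w_relevance: float = 1.0,
             w_recency: float = 1.0) -> float:
    """Combined score: normalized quality + relevance + a recency boost.

    All weights and the recency term are illustrative assumptions; a real
    system would tune these against observed reader behavior.
    """
    recency = 1.0 / (1.0 + a.age_days)  # newly added articles get a boost
    return (w_quality * a.mean_rating / 5.0
            + w_relevance * a.relevance
            + w_recency * recency)

# Example: a highly relevant, fresh article can outrank an older, highly
# rated one, which is the intended trade-off of such an ordering.
articles = [Article(4.5, 0.2, 30.0), Article(3.5, 0.9, 1.0)]
articles.sort(key=rank_key, reverse=True)
```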

3.2 An Amazon-style rate and review model. Amazon (amazon.com/) was one of the first websites to allow the posting of public customer book reviews. The process is completely open and informal: anyone can write a review and vote, provided, usually, that they have purchased the product. Customer reviews of this sort are peer-generated product evaluations hosted on a third-party website, such as Amazon (Mudambi & Schuff, 2010). Usernames can be either real identities or pseudonyms. Reviews can also include images and carry a header summary. In addition, a fully searchable question-and-answer section on individual product pages allows users to ask specific questions, which are answered by the page creator and voted on by the community, with top-voted answers displayed first. Chevalier & Mayzlin (2006) investigated the Amazon review system and found that, while reviews on the site tended to be generally positive, negative reviews had a greater impact on sales. Reviews of this sort can therefore be thought of in terms of value addition or subtraction for a product or content, and ultimately can help guide a third party's evaluation of a product and their purchase decisions (i.e., a selectivity process).

3.2.1 Amazon's star-rating system. Star-rating systems are frequently used at a high level in academia, commonly to define research excellence, albeit in a flawed and arguably detrimental way; e.g., the Research Excellence Framework in the UK (ref.ac.uk) (Mhurchú et al., 2017; Moore et al., 2017; Murphy & Sage, 2014). A study of Web 2.0 services and their use in alternative forms of scholarly communication by UK researchers found that nearly half (47%) of those surveyed expected that peer review would be complemented by citation and usage metrics and user ratings in the future (Procter et al., 2010a; Procter et al., 2010b). Amazon provides an example of a sophisticated collaborative filtering system based on five-star user ratings, usually combined with several lines of comments and timestamps. Each product is summarized with the proportion of customer reviews at each star level, alongside an average star rating. A low rating (one star) indicates an extremely negative view, whereas a high rating (five stars) reflects a positive view of the product. An intermediate score (three stars) can represent either a balance between negative and positive points or a merely nonchalant attitude towards the product. These ratings reveal fundamental details of accountability, and act as a signal of popularity and quality for items and sellers. The utility of such a star-rating system for research is not immediately clear, nor is it clear whether positive, moderate, or negative ratings would be most useful for readers or users. A superficial rating by itself would be fairly useless to researchers without the context and justification behind it. It is also unclear how a combined rate-and-review system would work for non-traditional research outputs, as the extremity and depth of reviews have been shown to vary with the type of content (Mudambi & Schuff, 2010). Furthermore, the ubiquitous five-star rating tool used across the Web is flawed in practice and produces highly skewed results. For one, when people rate products or write reviews online, they are more likely to leave positive feedback.
The vast majority of ratings on YouTube, for instance, are five stars, and this pattern is repeated across the Web, with an overall average estimated at about 4.3 stars regardless of the object being rated (Crotty, 2009). Ware (2011) confirmed this average for articles rated in PLOS, suggesting that academic rating systems operate much like other social platforms. Rating systems also select for popularity rather than quality, which is the opposite of what scholarly evaluation seeks (Ware, 2011). Another problem with commenting and rating systems is that they are open to gaming and manipulation. The Amazon system has been widely abused, and it has been demonstrated how easy it is for an individual or a small group of friends to influence popularity metrics, even on hugely visited websites like Time 100 (Emilsson, 2015; Harmon & Metaxas, 2010). Amazon has historically prohibited compensation for reviews, prosecuting businesses that pay for fake reviews as well as the individuals who write them, with one exception: reviewers could post an honest review in exchange for a free or discounted product, as long as they disclosed that fact. A recent study of over seven million reviews indicated that the average rating for products with these incentivized reviews was higher than for non-incentivized ones (Review Meta, 2016). Aiming to contain this phenomenon, Amazon recently adapted its Community Guidelines to eliminate incentivized reviews. As mentioned above, ScienceOpen offers a five-star rating system for articles, combined with post-publication peer review, but here the incentive is simply that the review content can be re-used, credited, and cited.
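For completeness, the per-star breakdown and average that an Amazon-style summary presents reduce to a few lines. The sample ratings below are invented, skewed positive to echo the roughly 4.3-star Web-wide average reported above.

```python
from collections import Counter

# Invented sample ratings, skewed positive to echo the reported ~4.3 average.
ratings = [5, 5, 5, 4, 5, 3, 5, 4, 5, 2]

dist = Counter(ratings)
total = len(ratings)
for stars in range(5, 0, -1):
    share = dist.get(stars, 0) / total
    print(f"{stars} star(s): {share:.0%}")  # proportion at each star level

average = sum(ratings) / total
print(f"average: {average:.1f} stars")  # 4.3 for this invented sample
```

The skew is visible even in this tiny sample: a summary dominated by five-star ratings conveys little of the context a scholarly evaluation would need, which is precisely the limitation discussed above.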