If we were to have to invent the scholarly publishing system again from scratch today, what would it look like?

Our current system of publishing is basically identical to what it was in the 1990s, before the emergence of the vast array of internet-based technologies loosely termed Web 2.0. A research paper is a 20th century format, published in a 17th century container – the journal.

Ironically, this system persists despite the blatant fact that anyone can now publish anything they want at the touch of a button. Yet scholarly publishing still usually takes months, and sometimes even years, just to upload content to the Web.

Why is the system so slow?

The main problem with the current system is that access and communication are moderated by entities whose business models depend on restricting both. As a result, no one has legal access to all information, access is conferred by financial or status privilege, and an enormous barrier stands between users and providers.

Furthermore, these same publishers pull all the strings controlling the system.

Firstly, researchers MUST submit their work to the journals these publishers run, because those journals convey a ‘prestige’ factor that is used in all forms of evaluation. This remains the broad reality, whether we accept it or not, and however much things are evolving away from this standard.

Secondly, libraries and research institutes MUST subscribe to these journals in order to sustain access for the researchers they represent. As each research paper is unique, no one has any bargaining position, and secrecy often governs how much is paid and for what, making any form of fair competition impossible.

Publishers know this, and therefore moderate the system by taking the rights from researchers and using them to ‘convince’ libraries to pay for access to that content. This is perhaps the simplest way I can describe the current system that makes any sense.

Whichever way you look at this, it is broken. It is biased, the incentives are skewed, it is closed, and it fails to meet even the most fundamental goal of communicating research to the maximum possible potential effect.

Even Open Access publication doesn’t solve much here, as consumers still typically feed the publishing system financially one way or another, and it still only provides free access to a small but growing (~20–25%) portion of the published research literature.

The core problems

The biggest problems to overcome in any scholarly communication system are content moderation on the one hand, and evaluation and prestige on the other. The former is already mostly performed by the research community for free: we call it peer review. In the traditional model of peer review, the actual effect of the process is poorly understood, but it is closed, secretive, and exclusive, making it far from an objective or rigorous standard of research evaluation.

Prestige is conferred by journals onto individuals, usually through a journal’s general reputation for publishing ‘high quality’ research, or through the journal impact factor, an average measure of journal citations. Anyone with a basic understanding of statistics, or even a little common sense, should see why this doesn’t make sense either. Journal-level metrics have no logical relationship with anything at the individual level, and the common thinking behind them is backwards: in reality, journal brands are built by researchers, not the other way around.

So how do we fix this?

So any fix for scholarly communication has to accommodate these two factors: moderation and evaluation. What would a communication platform built on Web 2.0 technologies look like if it had to do both of these things?

Content moderation can be done via simple community-level assessment. Imagine a platform where researchers openly share content, similar to any of your favourite ‘social networking’ sites. Content on the platform can be upvoted, rated, ranked, and commented on, at different levels and in different ways. Reddit, Amazon, and Stack Exchange all show how this works as a form of community-led moderation. Low-quality content receives lower scores, lower ranking, and less attention overall; vice versa for higher-quality content.
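To make the voting-and-ranking idea concrete, here is a minimal sketch of a Reddit-style ‘hot’ score, where the vote margin counts on a logarithmic scale and newer submissions get a recency bonus so they are not buried under older, already-popular content. The function name, epoch, and the 45,000-second scaling constant follow Reddit’s publicly documented ranking; applying it to research outputs is purely illustrative.

```python
import math
from datetime import datetime, timezone

# Arbitrary fixed reference point for the recency term.
EPOCH = datetime(2005, 12, 8, tzinfo=timezone.utc)

def hot_score(upvotes: int, downvotes: int, posted: datetime) -> float:
    """Rank content by vote margin (log scale) plus a recency bonus."""
    margin = upvotes - downvotes
    # log10 compresses large margins: the first 10 net votes count as
    # much as the next 100, so early community feedback matters most.
    order = math.log10(max(abs(margin), 1))
    sign = 1 if margin > 0 else -1 if margin < 0 else 0
    # Every ~12.5 hours of recency is worth one order of magnitude of votes.
    seconds = (posted - EPOCH).total_seconds()
    return round(sign * order + seconds / 45000, 7)
```

The key design choice is that time, not just popularity, drives visibility: a freshly shared preprint can outrank an older one with more votes, giving new work a fair hearing before the community has weighed in.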

This is basically the same as traditional peer review, but moderated by communities and performed after the initial sharing of content. The same potential issues of anonymity and identification still exist. The properties of any moderation system should be openness, inclusivity, and transparency.

In terms of prestige, this could also be facilitated by communities through the same platform, for example through the same simple voting/ranking system (think Reddit) or a badge system of some sort, like we see on Stack Exchange. Prestige then is based on engagement within the platform, and the value of this engagement to the community.
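A platform-based prestige system could be as simple as summing weighted community interactions into a reputation score and awarding badges at thresholds, in the spirit of Stack Exchange. The event names, weights, and badge tiers below are entirely made up for illustration; a real platform would tune them to its community.

```python
# Hypothetical event weights, loosely modelled on Stack Exchange's
# reputation mechanics; the specific values are illustrative only.
WEIGHTS = {
    "review_upvoted": 10,   # a peer review the community found useful
    "comment_upvoted": 2,   # a helpful comment or annotation
    "content_flagged": -5,  # a community flag against low-quality content
}

def reputation(events: list[str]) -> int:
    """Sum weighted community interactions into a single prestige score."""
    return sum(WEIGHTS.get(event, 0) for event in events)

def badges(rep: int) -> list[str]:
    """Award threshold badges (tiers and names are invented examples)."""
    tiers = [(100, "trusted reviewer"), (25, "contributor"), (1, "newcomer")]
    return [name for threshold, name in tiers if rep >= threshold]
```

The point of the sketch is that prestige derives from the value of engagement to the community, not from where content happens to be published.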

The importance of all this is that it puts the two most powerful aspects of the communication system in the hands of the research community. Moderation and evaluation become decoupled from publishing and instead tied to communication and community interaction. That represents a huge shift in how we do research.

Pros of the decoupled model

- Greatly reduced costs of communication, which can be reinvested into research/institutions (good for everyone except publishers)
- No subscription fees, no article-processing charges
- The only cost is the hosting and maintenance of the platform
- Publishing industry sustained by competing to publish content in the traditional ‘paper’ format
- All research has the chance to be openly discussed, moderated, and evaluated
- Version control to update content. Think GitHub.
- Publishers not in control of content; power is in the hands of researchers for dissemination and re-use
- Prestige conferred through research communities on the platform
- Evaluation is moderated, transparent, and accountable
- Relies on the strength and engagement of research communities to function
- Copyright retained by researchers

Cons of the decoupled model

- Moderation and validation occur subsequent to communication
- Massive decrease in publisher profits
- Relies on the strength and engagement of research communities to function

As always, what am I missing? I’d be happy to update these lists in the future (feel free to share, btw – all content on this site is CC BY 4.0).