Open Science and Science 2.0

Michał Kotowski (University of Toronto, Canada), http://www.math.toronto.edu/~michal/

Piotr Migdał (ICFO, Castelldefels (Barcelona), Spain), http://migdal.wikidot.com/en

Summary

In our talk we tried to outline what is wrong with the academic system of doing science and to discuss various initiatives that try to improve the situation. We focused on how to make the publishing system more open and transparent (Open Science) and how to create completely new ways of collaborating and sharing scientific knowledge (so-called "Science 2.0").

The prevailing attitude towards publishing has changed over the centuries. In the time of Galileo, making discoveries public was uncommon and served mainly to establish priority. Over time the value of disseminating knowledge more widely came to be appreciated, and this led to the creation of academic journals. Today we have reached the stage of "publish or perish", where publishing lots of papers is a must for anyone who wants to advance an academic career.

However, with the rise of the Internet it has become clear that the traditional framework of peer-reviewed journals is neither the most efficient nor the quickest way of sharing discoveries and knowledge. "To publish" originally meant "to make something public", but the current system of elite journals run (for the most part) by commercial publishers, with expensive articles hidden behind paywalls, actually helps to hide the results of research from the public.

The high price of publishing in journals was justified in the past, when printing paper copies of articles was expensive and the publisher provided a valuable service in this respect. Nowadays the cost of making your research available to everyone with access to the Internet is virtually zero, as evidenced for example by arXiv, an online open-access repository of preprints. Putting academic papers on arXiv has become standard in many fields of science; however, this has done little to undermine the monopoly of commercial publishers. For now we are still stuck with an absurd system in which authors and reviewers do their job completely for free, the public has to pay to access the results of taxpayer-funded research, and the whole profit goes exclusively to the publisher, who adds little real value in the process.

We tried to present a few possibilities for changing this state of affairs and to explain why it is so hard to reform the system. The first step would be simply to realize the huge difference between the cost of publishing in traditional journals (either in the subscription model or in "gold open access", although the latter is anything but open) and in venues such as arXiv. The real cost of publishing in a commercial journal is difficult to assess, but prices in the range of $1000 per article are not considered exorbitant (see Tim Gowers' blog). In contrast, submitting a paper to arXiv is completely free, and maintenance costs (paid voluntarily by universities) are low enough that the net cost per article is about $25. Given also that arXiv makes papers available within a few days (as compared to months or years in the case of traditional publishing venues) and offers features such as version control and the ability to revise articles, there is little doubt that commercial publishing should be a thing of the past.

A biting but very apt description of how the traditional publishing system has essentially become a scam can be found in Scott Aaronson's ironic essay.

Another noteworthy initiative is the academic boycott of Elsevier, a leading publisher guilty of overpricing its journals, publishing crackpot or fake journals, supporting the arms trade and other sins. The boycott was started by the distinguished mathematician Tim Gowers on his blog and has gained such momentum that by now thousands of scientists have pledged not to publish in, review for or do editorial work for Elsevier journals. Successful though the boycott has been in generating publicity for the open science movement, a lot remains to be done if we want to get rid of the parasitic system of commercial publishers.

There are many factors which make any reform slow and difficult. Apart from the institutional inertia and conservatism of the academic world, one difficulty is organizing the peer-review system for new open journals or publishing venues, which is important (although probably not indispensable) for verifying the accuracy and credibility of published research (note that arXiv is not peer-reviewed). Furthermore, activities other than publishing in renowned journals get little recognition in academia as far as, e.g., promotion or getting a job are concerned, which gives little incentive to fight the inefficient system (why waste time on reforming journals when there are so many new papers to write…). There are also other issues which, taken together, make it clear that the system will not magically fix itself.

Another aspect of the academic system which is probably not as good as it should be is the fostering of collaboration and exchange of knowledge. Somewhat ironically, even though the Internet was to a large extent created to help scientists exchange research results and contact each other, academia has been slow to adopt the newest online tools already used in other communities.

One example of such tools is the family of Stack Exchange sites. An SE site is a question-and-answer service where registered users post concrete questions related to the site's main topic (for example programming, cooking, biology or photography) and other users can answer, comment on and upvote or downvote them. The first site of this type was Stack Overflow, devoted to programming, and it proved a stunning success. With contributions from a great many users, it has become a repository of programming know-how that would have been impossible to assemble without the massive, community-oriented character of the initiative.

MathOverflow is a site using the same question-and-answer format, but for research-level mathematics. The majority of users are academic mathematicians, and the clear-cut nature of most mathematical questions makes this format especially well suited to the field. The site has enjoyed tremendous success, with a consistently high level of answers and contributions from even the most renowned mathematicians, such as Fields medalists Terence Tao and Tim Gowers (see this thread for an example of an awesome "success story").

However, it seems that establishing a community that enhances the exchange of scientific knowledge is not easy, as evidenced by the unfortunate Theoretical Physics SE site. It was to follow the same format as MathOverflow, but due to little interest on the part of users the project was cancelled after a few months (see also why did it fail). Other academic SE sites seem to fare better, but the example of TP.SE makes it clear that it is not enough to simply establish a website and wait for some type of "wiki-magic" to kick in.

There are numerous other interesting projects employing the Internet to enable more collaboration and unconventional ways of doing research. The Polymath project is a "massive multiplayer online problem solving" site for mathematicians. The idea is that an open mathematical problem is posted on the site and everyone can contribute to the solution by posting comments and ideas in an appropriate thread. The key point is that each user may contribute only a relatively small insight or observation, but by building on other users' ideas the community as a whole can solve the problem relatively quickly. A few projects of this type have already been successful and have spawned papers published in peer-reviewed journals.

In addition, there are Citizen Science projects, in which anyone can participate to the benefit of ongoing research. For example, Foldit is a game in which one folds proteins to gain points and at the same time solves puzzles for real science. Galaxy Zoo recruits astronomy enthusiasts to help classify galaxies, a task which is easy for humans, hard for machines and, given the amount of data, impossible for only a few people to do.

Furthermore, the Internet not only gives easier access to research done by great scientists, but also enables two-way communication. On Google+ there are a number of scientists (e.g. Terence Tao and John Baez) who not only share their own thoughts on mathematics or physics, but also respond to comments and engage in discussions. It is not uncommon to find similarly illuminating discussions in the comment threads on their blogs.

Last but not least, science is not only "the search for truth", but also a lifestyle, a social and formal system, and a career path. It used to be that one learned the necessary skills and scientific know-how from one's advisor and colleagues, which may be crucial, but offers only a "short-scale" interaction. Nowadays scientists are starting to exchange such advice in public, so that people outside the narrow circle of a professor's own research group can benefit. For example, there are texts on making a good presentation and on writing a good publication, there is a memoir about doing a PhD, and there is a collection of materials for nurturing scientists. There is also Academia.SE, where anyone can ask a question about the "soft" side of science.

There is also the issue of academia being a rather inert top-down system, which adopts changes at a sluggish pace and stifles creativity through a rigid hierarchical and grant-based structure. In academia a 20-year-old, regardless of his or her skill, is extremely unlikely to lead a world-changing project and achieve a scientific success comparable with that of Microsoft, Linux or Facebook in the computer industry. In practice one has to wait a very long time to gain real influence and independence, which means that the prevailing mindset lags a generation behind. Meanwhile young people at the most creative time of their lives are inhibited from using their ideas as freely as they should. Moreover, grants frequently discourage things which are too out-of-the-box and promote things which are safe enough to be finished in a given time with given funds.

Note that the most creative IT companies often take a strikingly different approach - see e.g. Valve on flat structure or Google's well-known "bottom-up strategy".

So what can be done to improve the system? One thing is to promote tools which go beyond the standard academic structure.

A few things in the spirit of Open Science that we would be interested in:

establish an efficient non-profit peer-review system

create mechanisms for discerning the best publications

resurrect TheoreticalPhysics.SE (or understand why it won't work)

find a way of giving credit to scientists not only for publications and citations

do "open source" science - open in terms of both data and collaboration.

Currently we are planning to start a collaborative blog, "Hacking Science", promoting a "bottom-up" approach to science: not by fighting the system head-on, but by starting an independent activity which works in parallel.

(Nomen omen, an open letter advertising Hacking Science is here. Also, if you are interested, feel free to email us.)

Further reading

Slides

11.08.2012, an unrefined version, ~5MB, in Polish.