Ben Goldacre, The Guardian, Saturday 3 September 2011

This week George Monbiot won the internet with a long Guardian piece on academic publishers. For those who didn’t know: academics, funded mostly by the public purse, pay for the production and dissemination of academic papers; but for historical reasons, these are published by private organisations who charge around $30 per academic paper, keeping out any reader who doesn’t have access through their institution.

This is a barrier to the public understanding of science, but also to ongoing scholarship by people who’ve wandered away from institutional academia. There are open access alternatives, where academics pay up-front and the paper is free to all readers, but these are patchy, and require your funder to pay a thousand pounds per paper. If the journal your work is best suited for doesn’t do open access, then you might reasonably accept a closed access journal.

The arguments are big. What I find interesting is the recent rise of direct action on this issue.

Aaron Swartz is a fellow at Harvard’s Centre for Ethics, and a digital activist. He has been accused of intellectual property theft on a grand scale, and the federal indictment document, available in full online, describes an inspiringly nerdy game of cat and mouse.

Swartz denies all charges. Allegedly, he bought a laptop to harvest academic papers from the website JSTOR. Using a guest login at MIT – they last 14 days – he set a program running to download papers in bulk. JSTOR and MIT smelt a rat: they blocked access to whole ranges of computers in MIT, creating havoc. Swartz set two computers on the job, running so fast that several JSTOR servers stopped working.

So then, allegedly, he tried a slower approach. You’ll have seen racks of flashing network equipment in office buildings. He opened one up, in a quiet basement, plugged in a laptop, with some external hard drives, hid them under a box, and left this package quietly downloading papers by the million. Months later he was seen returning, peering cautiously through cracks in doors, carrying his bicycle helmet over his face and looking through the ventilation holes. He was arrested and released on $100,000 bail: he had downloaded 4.8 million academic papers.

It’s hard not to be impressed, and this is not the first time Swartz has taken public data access into his own hands. In the US, court records are available online, but at a cost, in a scheme generating a $150 million budget surplus. When free access was given at 17 libraries, Swartz set up a script to harvest the lot. He got 19,856,160 pages before the system was shut down.

Now, the US government allege that Swartz intended to release his vast academic paper stash for free on file-sharing websites. This may be true, but he did not do so. Shortly after his arrest, however, a posting appeared on the Pirate Bay website, declaring the release of an immense file, free for download. It contains 33 gigabytes’ worth of academic papers from the UK journal Philosophical Transactions of the Royal Society. This file, explained the poster, was an act of protest about Swartz’s arrest. The papers in it range from the seventeenth century up to 1923, and are mostly out of copyright.

These are, in some respects, remarkable tales of Robin Hood behaviour. JSTOR expended huge effort on scanning these Royal Society papers in the 1990s, when scanning was tougher, and they should be thanked. But it’s hard to believe we can’t find a better way to reward that effort: JSTOR sells each paper for between $8 and $19, while the Royal Society estimate that the pay-per-view income from the public accessing them is half a percent of their journal income.

One major problem with the current publishing model is that it’s hard to give access for free to the motivated public, while still gathering income from institutions. My hunch is, at some stage, this problem may be partially sidestepped, when someone manages an illegal workaround that individuals can play with, but which no university could endorse. I may be wrong: but either way, these are very interesting times for information.