UPDATED to include six months of Sci-Hub data (September 2015–February 2016) and to correct a coding error that inflated download counts.

Well, they might not have lost the downloads, but they didn’t get them.

Sci-Hub is a pirate operation that uses stolen university login credentials to harvest, store, and distribute for free virtually every academic article published anywhere. It is a simple, if criminal, solution to a very big problem: the lack of access to published research for people who can’t pay for it. When someone goes to the Sci-Hub site and requests an article, by simply pasting in the DOI or URL, the system either serves them the paper, or goes and steals it for them and then keeps a copy for the next user. For us university people who are used to dealing with the maze of logins and forwarding and proxies that come between us and the information we seek, it’s unbelievably fast and almost never fails.

Their most recent claim is an archive of 76 million papers and 400,000 users per day.

Today is 5 September and Sci-Hub is 8 years old. In eight years, the website grew from zero to 76,000,000 research articles available for free reading, and from 2,000 users per day to 400,000. — Sci Hub (@Sci_Hub) September 5, 2019

Currently available at sci-hub.se or –.tw, it sometimes moves, but this site always lists where you can find it now. Naturally, both civil and criminal authorities are trying to shut it down, preferably by catching its mastermind, Alexandra Elbakyan, the elusive student programmer from Kazakhstan.

That picture is from the excellent (free streaming) documentary Paywall: The Business of Scholarship. Chris Bourg, the Director of Libraries at MIT (and a sociologist), also interviewed in the movie, said of Sci-Hub:

Those of us who work in scholarly communications, writ large, really have to look at Sci-Hub as sort of a poke in the side that says, “Do better.” We need to look to Sci-Hub to say, “What is it that we could be doing differently about the infrastructure that we developed to distribute journal articles, to distribute scholarship?” … I think we need to look at what’s happening with Sci-Hub, how it evolved, who’s using it, who’s accessing it, and let it be a lesson to us for what we should be doing differently.

Sociology’s stolen papers

Science magazine writer John Bohannon reached Elbakyan in 2016, and she turned over to him a 6-month cache of Sci-Hub server logs for a piece titled, “Who’s downloading pirated papers? Everyone.” He analyzed 28 million downloads, and Science made the data available for analysis, here. Eight million of those hits were from India and China, and the busiest location was Tehran.

The data archive includes only the time and date, the DOI number of each paper downloaded, and the location of the user. I’m no expert in DOI analysis, but Bohannon included a guide that shows the prefix 10.1177 is associated with Sage Publications, which publishes the American Sociological Association’s journals. Looking at the entire six-month series, September 2015–February 2016, I found 171,000 Sage items, downloaded 377,000 times. Of those (if I got the DOIs right), 805 titles downloaded 1,628 times came from the ASA research journals (my Stata code is here).

ASA / Sage downloads from Sci-Hub, Sept 2015 – Feb 2016

| Journal | Articles | Downloads |
|---|---|---|
| American Sociological Review | 239 | 693 |
| Teaching Sociology | 221 | 269 |
| Journal of Health and Social Behavior | 94 | 188 |
| Social Psychology Quarterly | 77 | 152 |
| Sociology of Education | 73 | 157 |
| Sociological Methodology | 57 | 76 |
| Sociological Theory | 44 | 93 |
| Total | 805 | 1628 |
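The counting behind these figures comes down to filtering DOIs by publisher prefix, then tallying distinct articles and total downloads. Here is a minimal Python sketch of that idea. It is not the Stata code used for this post. The function name `summarize_prefix` and the sample DOIs are made up for illustration, and it assumes the log has already been reduced to one DOI string per download.

```python
from collections import Counter

def summarize_prefix(dois, prefix):
    """Return (distinct articles, total downloads) for DOIs under one publisher prefix."""
    # A DOI is "<prefix>/<suffix>". Requiring the slash keeps "10.1177" from
    # matching a longer prefix such as "10.11770". DOIs are case-insensitive,
    # so normalize before counting.
    wanted = prefix.lower() + "/"
    counts = Counter(
        doi.strip().lower()
        for doi in dois
        if doi.strip().lower().startswith(wanted)
    )
    return len(counts), sum(counts.values())

# Hypothetical sample: one entry per download event (suffixes are invented).
sample_log = [
    "10.1177/example-paper-a",
    "10.1177/example-paper-a",
    "10.1177/example-paper-b",
    "10.1016/example-paper-c",
]

articles, downloads = summarize_prefix(sample_log, "10.1177")
print(articles, downloads)
```

Telling ASA journals apart from other Sage titles would need a further rule based on the DOI suffix, which this sketch does not attempt.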

On an annualized basis, that would be 750,000 Sage downloads, and 3,200 from ASA journals specifically. For comparison, the most popular article in ASR in 2017 was downloaded about 10,000 times from the Sage site, so the Sci-Hub downloads are small relative to the legitimate traffic. So over the life of Sci-Hub it cost (and saved) ASA thousands of downloads, probably a few tens of thousands. [Note: in the first version of this post, I had a coding error that multiplied the counts, and this read “hundreds of thousands.” I regret the error.]
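For anyone checking the arithmetic, here is a rough sketch of the annualization in Python. The helper `annualize` is just an illustrative name. The eight-year lifetime comes from the Sci-Hub anniversary tweet quoted above, and applying a flat rate across those years is my simplifying assumption. Sci-Hub's traffic grew a lot over that time, so treat the lifetime number as an order of magnitude only.

```python
def annualize(count, months_observed):
    """Scale a count observed over some months to a 12-month rate."""
    return count * 12 / months_observed

# Six months of logs: September 2015 through February 2016.
sage_per_year = annualize(377_000, 6)  # all Sage downloads
asa_per_year = annualize(1_628, 6)     # ASA journals only

# Assumption: a constant rate over Sci-Hub's eight years.
asa_lifetime = asa_per_year * 8

print(sage_per_year, asa_per_year, asa_lifetime)
```

This gives about 754,000 Sage downloads and 3,256 ASA downloads per year, and roughly 26,000 ASA downloads over eight years.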

The most-downloaded ASR paper for the entire period was:

Mears, Ashley. 2015. “Working for Free in the VIP: Relational Work and the Production of Consent.” American Sociological Review 80 (6): 1099–1122. (downloaded 33 times)

The most-downloaded from a different journal was:

Kanazawa, Satoshi. 2010. “Why Liberals and Atheists Are More Intelligent.” Social Psychology Quarterly, February. (downloaded 29 times)

I looked at a couple of them in more detail, and found, for example, that Paula England’s 2015 ASA Presidential Address was downloaded by users in Seoul (South Korea), Durban (South Africa), New Delhi, London, Chicago, Washington, and Virgie (Kentucky).

Interestingly, at least one of the popular papers, Lizardo et al.’s introduction to their editorial tenure at ASR, is already ungated on the Sage site, so you don’t need to use Sci-Hub to get it. This suggests, as Bohannon also noted, that some Sci-Hub users are just using the site because it’s convenient, not because they don’t have access to the papers.

Do you Sci-Hub?

I use Sci-Hub a lot, often for things that I also have subscription access to. (I do not, however, contribute anything to the system; I free-ride off their criminality.) Why? I’m not in the paywall game business, I’m in the information business. I am always behind on my work, and adding a few seconds or minutes of hunting for the legitimate way to get each of the many articles I look at every day is not worth it. (And when I find my university doesn’t subscribe? Interlibrary loan is wonderful, but I don’t want to spend more time with it than necessary.) Does my choice cost the American Sociological Association a few cents, by reducing legitimate downloads, which somehow factors into the profits that get kicked back to the association from Sage? I don’t know.

Of course, one of the dumb things about the paywall system is that it’s expensive and time-consuming to manage who has access to what information — it’s not a small task to keep information from reaching millions of determined readers from all around the world. (I assume one of the reasons my university recently introduced two-factor authentication — requiring me to click a pop-up on my phone every time I log in to university resources [even when I’m in my office] — is because of Sci-Hub. Ironic!)

Chris Bourg is right: “let it be a lesson to us for what we should be doing differently.” Elbakyan may have committed the most efficient product theft in history, in terms of list price of stolen goods per unit of effort or expense on her part. Her archive has been copied and distributed to different sites around the world (it fits in a large suitcase). And it was made possible by the irrational, corrupt nature of the scholarly communication infrastructure. Her success is the system’s failure.

For more information, read my report, “Scholarly Communication in Sociology.”