This spring, 11 years after the first volunteer gave up a tube of blood, U.K. Biobank announced it would release its full genetic data set to registered scientists in July. This huge amount of genetic information, combined with the thousands of other characteristics tracked by U.K. Biobank, allows scientists to look for the genetic determinants of virtually any disease. Geneticists marked their calendars. “We heard stories that people who head groups had canceled holidays,” says Jonathan Marchini, a statistical geneticist at the University of Oxford. “Everyone has been waiting for this for so long.”

U.K. Biobank had done data releases before, including an earlier subset of the genetic data set with just over 100,000 people. In the past, research groups using the data wrote up their papers, submitted them to journals, and waited for peer review before their findings eventually trickled out to the public. In the last year, however, an increasingly popular website called bioRxiv (pronounced “bio archive”) has changed the game. BioRxiv allows biologists to publish preprints, or preliminary drafts of their papers that have not yet been peer-reviewed.

Preprints based on the latest U.K. Biobank data started to come out almost immediately. Within two weeks, David Howard and Andrew McIntosh, psychiatry researchers at the University of Edinburgh, had posted not one but two preprints, one on genetic variants linked to depression and the other on variants linked to neuroticism. Their team subsisted on pizza and worked “constantly.”

Others soon followed, and the flood of preprints has continued ever since. Never had genetics research moved so fast.

* * *

Ask scientists what’s so revolutionary about U.K. Biobank and they’ll say it’s big. But they’ll also say this: Nobody gets preferential access.

In the past, research groups that had gone to the trouble and expense of building DNA data sets hoarded them for themselves, so that they could be the first to mine them for publishable insights. U.K. Biobank, however, is supported by the United Kingdom’s National Health Service. Its data is open to anyone in the world, as long as they are a legitimate researcher and pay a fee commensurate with the amount of data they want to access: a couple thousand dollars for the full genetic data.

When it came to releasing the 500,000-person data set, making sure everyone got the huge file (12 terabytes uncompressed) at the same time was no trivial matter. U.K. Biobank decided to allow registered researchers to start downloading the data weeks before its official July release. The catch: It was encrypted. The decryption keys went out to all research groups simultaneously on the official release date. Nobody got a head start of a few days, or even a few hours. Even Marchini, who helped U.K. Biobank process some of the data, was not allowed to analyze it for his own research purposes until it was available to all.