It’s spring and privacy concerns are in the air. Between the recent revelations that Facebook let Cambridge Analytica capture data from 87 million of its users to be improperly used to influence the US presidential election, and news that California investigators cracked the long-cold case of the Golden State Killer by running a genetic profile collected from crime scene DNA through a public genealogy website, people are feeling a bit...spooked.

So it’s kind of a weird time to be asking a million people to voluntarily hand over decades of health records, along with dozens of test tubes filled with blood and urine (which, of course, contain DNA). But that’s exactly what the National Institutes of Health is doing today.

After more than three years of planning and piloting, the federal research organization is finally rolling out the massive precision health initiative President Obama first announced in 2015. Now renamed All of Us, the ambitious project aims to compile detailed health data from a representative sample of one million Americans so scientists can better understand the mechanisms of disease and move more quickly toward personalized treatments. Starting May 6, anyone over the age of 18 living in the US can enroll in this Grandest of Experiments and donate their data to the greater good.

So far, 45,000 people have already started the process. In May 2017, All of Us began a beta phase, bringing its recruitment sites online one by one and making sure the systems were running smoothly. The project has a lot of data to sync up: electronic health records, surveys about participants’ behaviors and environments, and eventually genetic reports and information from wearable fitness devices.

Building out the infrastructure necessary to collect so much data on such a huge cohort has taken time and some serious cash. Last year alone, the All of Us budget was $230 million. For the full project, which will run for a decade, Congress has authorized a whopping $1.455 billion. In addition to the 298 enrollment sites NIH hopes to launch by the end of this year (120 are online so far), that money will go toward a national biobank, run by the Mayo Clinic, where 35 blood and urine samples from each participant will one day be stored. To prepare for the national launch, Mayo doubled the size of its 35,000-square-foot facility in Minnesota and expanded a smaller bank in Florida, as a backup site to protect samples from any localized natural disasters.

Those samples contain the DNA that researchers will sequence, and in a rare first for a research project of this magnitude, they will also return the results to participants. But none of this will happen right away. The first sequencing will start later this year with a small, 20,000-person pilot. Before everyone else can get the same treatment, someone has got to build a lot more sequencing machines. “There’s not enough capacity in the US to even begin to do a million people,” says Eric Dishman, director of All of Us. In addition to genotyping (the technique companies like 23andMe use to create their limited health reports), All of Us will also be doing whole genome sequencing, which requires much more machinery. “It’d be like saying, ‘Hey, let’s all take a high-speed train trip across the US.’ There just aren’t enough of them right now to do that.”

To handle the digital architecture, Dishman built a team made up of folks from Vanderbilt, the Broad Institute, and Verily—Alphabet’s life science subsidiary. They’re creating a data and research support center to collect, curate, and store participants’ information in a secure cloud environment. They’re also building analytical tools to help researchers comb through the data, looking for connections that could lead to new discoveries.