In the medical study Hall of Fame, the Framingham Heart Study takes the throne.

An ongoing project that’s spanned three generations and almost 70 years, the Heart study was an early attempt to track factors and behaviors that increase heart disease risks. Find and eliminate the culprits, lower a population’s chances of dying from a failing heart.

In a nutshell, prevention is the goal of most epidemiological studies. Yet running these longitudinal studies, in which researchers follow the same subjects for years, has always been a nightmare. Funding troubles aside, teasing out which biological factors to track—and how to manage and make sense of all that data—has been a major roadblock.

Enter Verily, a health-oriented offshoot of Google that’s now part of Alphabet, Inc.

Earlier this year, the mysterious company announced a collaboration with Duke and Stanford universities called Project Baseline, and it’s got a moonshot goal: stop cancer—and other killers—before they ever occur.

More specifically, the project is aiming to comprehensively map human health data from 10,000 people over four years. Taking advantage of Google’s immense computing power and machine learning prowess, the project hopes to develop more efficient ways of organizing the data, along with participants’ medical records, into an intuitive platform much like Google Earth.

Since launch, the project has already collected 6,700 terabytes of data, which the team is planning on returning to volunteers in a digestible manner.

Epidemiology on Steroids

Unlike many medical studies, the focus of Project Baseline is on healthy people.

A central tenet of Baseline is to track a wide array of people in exquisite detail before they get sick, to sketch out how medication, adverse life events, or age impact health in a broad community.

If you’re a volunteer, here’s how it would go.

First, you’d be asked to visit a clinic every year, much like the usual annual medical checkup. Here, expect almost every part of your body to be examined: heart health, eyes, chest X-rays, and physical strength. You’ll hand over some blood, saliva, and other bio-fluid samples. You’ll be given an extremely detailed questionnaire asking about your health and lifestyle.

Then there are quarterly appointments for a sub-group of participants, who may have recently experienced significant life events or begun taking particular medications. Just married? Lost a loved one? Won the lottery? Project Baseline wants to sketch out how those experiences affect your health.

That’s already a lot of data. In addition, volunteers will be asked to wear a Fitbit-like device that tracks their daily movement patterns, heart rhythms, and physical activity. Sleep habits and quality will be monitored with a friendly-looking puck that participants stick under the mattress.

There’s more. Volunteers will also have their microbiomes and genomes sequenced. These data could conceivably be mined to see how gut bugs—or the genetic hand of cards a person’s been dealt—contribute to health.

The Blue Button 2.0

If your head is spinning, you’re not alone. Studies on this staggering scale have only recently become possible, thanks to the power of cloud computing platforms.

To Dr. Sanjiv Gambhir, a leading figure at the project and chair of radiology at Stanford, the project’s vision is to extend physician visits, which are short snapshots in time, into a comprehensive timeline of health and illness.

“Currently, most of what we see as treating physicians are short snapshots in time of an individual and primarily after they are already ill. We are effectively missing a lot of valuable information years prior to illness,” he said. “We’re dealing with illness in the absence of a well-defined reference of healthy biochemistry, and this underscores the criticality of what we hope to achieve here.”

The study’s current expected duration of four years is obviously not enough to sketch out a person’s entire medical history. To help fill in the rest, Project Baseline is also tackling electronic patient records, in addition to collecting new data.

In September, the project announced participation in a government initiative—the Blue Button 2.0—that lets participants share their Medicare claims data with the study. “Getting ahold of medical records can be an uphill battle,” the project explained. “The ability to connect health data between systems is called ‘interoperability’ and is a priority for the healthcare industry today.”

If a sufficient number of participants volunteer their claims data, the project could conceivably create broader data sets to identify trends or indicators that mark the transition from health to disease.

Given Google’s prowess in data organization, it’s not hard to see Project Baseline taking a shot at digitizing medical data—hopefully in a safe, secure, and responsible way.

Giving Back

In most epidemiological studies, the participants never get to see their data. But Project Baseline is taking an “open” approach from the get-go.

“We’ve entered a new era…in which the people providing the data—whether a research participant or patient—expect their data to be accessible,” said project leader Dr. Charlene Wong at Duke, arguing that transparency increases trust in clinical research.

The team is seeking effective and meaningful ways to return results to the participants, while tiptoeing along the ethical line that separates research from clinical care. Central to the effort, the project said, is establishing a committee of thought leaders in medicine, research, bioethics, and genetics to develop new guidelines for data sharing.

The project is listening to each participant with the goal of tailoring disclosure to a volunteer’s preferences and expectations: after all, you wouldn’t want to force a genetic testing result on someone who doesn’t want to know. In contrast, those who embrace the “quantified self” movement may find the data trove especially appealing.

The project has already run into dilemmas here. Some participants were alerted to potentially lethal conditions, such as cancer or blood clots, and sought medical help. Others were alarmed by fairly innocuous findings on X-rays that led them to believe, perhaps with the help of WebMD, that they had cancer.

Project Baseline is pushing the envelope in setting up a system that shares medical data with participants in a socially, morally, and ethically responsible way and, of course, without generating panic.

“We hope this work will become an early benchmark for other clinical studies in this new era of ubiquitous and ideally more transparent health data,” said Wong.

What Next?

Roughly 2,000 people have signed up for the study so far, with thousands more registered on the volunteer registry. The diversity of the enrolled population is already staggering: unlike the Framingham Heart Study, Project Baseline is involving black, Hispanic, Asian, and other ethnicities to elucidate different risk factors among people with diverse backgrounds.

It’s a welcome move for non-Caucasian folks, who—even with genetic sequencing results in hand—often have trouble interpreting those results because baseline results mostly come from European-Caucasian samples.

Project Baseline is just one of many studies enabled by the big data age. All of Us, an NIH-funded program, seeks to gather data from over one million people in the US. The U.K.’s 100,000 Genomes Project recently hit the halfway landmark, and a similar project in China launched last year.

“With recent advances at the intersection of science and technology, we have the opportunity to characterize human health with unprecedented depth and precision,” said Dr. Jessica Mega, chief medical officer at Verily. “We hope to create a dataset, tools and technologies that benefit the research ecosystem and humankind more broadly.”

Image Credit: piick / Shutterstock.com