While many researchers encounter no privacy-based barriers to releasing data, those working with human participants, such as doctors, psychologists, and geneticists, have a difficult problem to surmount. How do they reconcile their desire to share data, allowing their analyses and conclusions to be verified, with the need to protect participant privacy? It's a dilemma we've talked about before on the blog (see: Open Data and IRBs, Privacy and Open Data). A new project, Open Humans, seeks to resolve the issue by finding patients who are willing—even eager—to share their personal data.

Open Humans, which recently won a $500,000 grant from the Knight Foundation, grew out of the Personal Genome Project. Founded in 2005 by Harvard genetics professor George Church, the Personal Genome Project sought to solve a problem that many genetics researchers had yet to recognize. "At the time people didn't really see genomes as inherently identifiable," Madeleine Price Ball explains. Ball is co-founder of Open Humans, Senior Research Scientist at PersonalGenomes.org, and Director of Research at the Harvard Personal Genome Project. She quotes from 1000 Genomes' informed consent form: "'Because of these measures, it will be very hard for anyone who looks at any of the scientific databases to know which information came from you, or even that any information in the scientific databases came from you.'"

"So that's sort of the attitude scientists had towards genomes at the time. Also, the Genetic Information Nondiscrimination Act didn't exist yet. And there was GATTACA. Privacy was still this thing everyone thought they could have, and genomes were this thing people thought would be crazy to share in an identifiable manner. I think the scientific community had a bit of unconscious blindness, because they couldn't imagine an alternative."

Church found an initial ten participants, a list that includes university professors, health care professionals, and Church himself. The IRB interviewed each of the participants to make sure they truly understood the project and, satisfied, allowed it to move forward. The Personal Genome Project now boasts over 3,400 participants, each of whom has passed an entrance exam showing that they understand what will happen to their data and the risks involved. Most participants are enthusiastic about sharing. One participant described it as "donating my body to science, but I don't have to die first."

Growing pains

The Personal Genome Project's expansion hasn't been without growing pains. "We've started to try to collect data beyond genomes," Ball says. Personal health information, including medical history, procedures, test results, and prescriptions, has been provided by a subset of participants. "Every time one of these new studies was brought before the IRB they'd be like 'what? that too?? I don't understand what are you doing???' It wasn't scaling, it was confusing, the PGP was trying to collect samples and sequence genomes and it was trying to let other groups collect samples and do other things."

Thus, Open Humans was born.

"Open Humans is an abstraction that takes part of what the PGP was doing (the second part) and makes it scalable," Ball explains. "It's a cohort of participants that demonstrate an interest in public data sharing, and it's researchers that promise to return data to participants."

Open Humans will start out with a number of participants and an array of public data sets, thanks to collaborating projects American Gut, Flu Near You, and, of course, the Harvard Personal Genome Project. Participants share data and, in return, researchers promise to share results. What precisely "sharing results" means has yet to be determined.

"We're just starting out and know that figuring out how this will work is a learning process," Ball explains. But she's already seen what can happen when participants are brought into the research process, and brought together:

"One of the participants made an online forum, another a Facebook group, and another maintains a LinkedIn group ... before this happened it hadn't occurred to me that abandoning the privacy-assurance model of research could empower participants in this manner. Think about the typical study: each participant is isolated, they never see each other. Meeting each other could breach confidentiality! Here they can talk to each other and (gasp) complain about you. That's pretty empowering."

Ball and her colleague Jason Bobe, Open Humans co-founder and Executive Director of PersonalGenomes.org, hope to see all sorts of collaborations between participants and researchers. Participants could help researchers refine and test protocols, catch errors, and even provide their own analyses.

The road ahead

Despite these dreams, Ball is keeping the project grounded. When asked whether Open Humans will require articles published using its datasets to be made open access, she replies that "stacking up a bunch of ethical mandates can sometimes do more harm than good if it limits adoption." Asked about the effect of participant withdrawals on datasets and reproducibility, she responds, "I don't want to overthink it and implement things to protect researchers at the expense of participant autonomy based on just speculation." (It is mostly speculation. Fewer than 1% of Personal Genome Project participants have withdrawn from the study, and none of those who've provided whole genome or exome data have done so.)

It's clear that Open Humans is focused on the road directly ahead. And what does that road look like? "Immediately, my biggest concern is building our staff. Now that we won funding, we need to hire a good programmer... so if you are or know someone that seems like a perfect fit for us, please pass along our hiring opportunities." She adds that anyone can join the project's mailing list to get updates and find out when Open Humans is open to new participants—and new researchers:

"And just talk about us. Referring to us is an intangible but important aspect for helping promote awareness of participant-mediated data sharing as a participatory research method and as a method for creating open data."

In other words: start spreading the news. Participant-mediated data sharing isn't the only solution to privacy issues, but it's an enticing one, and the more people who embrace it, the better a solution it will be.

Originally posted on the Open Science Collaboration blog. Reposted under Creative Commons.
