Biden’s cancer bid exposes rift among researchers

Joe Biden’s proposal for a cancer moon shot has struck a deep nerve in the research community, where cutting-edge scientists blame an entrenched medical establishment for hoarding the data needed to make breakthroughs.

Biden, whose son Beau died of brain cancer in May, said earlier this month that vast troves of research were “trapped in silos, preventing faster progress and greater reach to patients.” While few researchers disagree, many are still reluctant to share the raw data used in their research, posing big obstacles to the vice president’s initiative.


The tension boiled over this month when Jeffrey Drazen, editor of the New England Journal of Medicine, and co-author Dan Longo wrote in an editorial that while sharing was all well and good, it had to be done collaboratively, not by “research parasites” who stole or misused work that might have taken bench scientists decades to assemble. The editorial did not mention Biden’s initiative, but many commenters noted its relevance.

Over a snowbound weekend, the Twittersphere exploded with angry attacks on the Journal, creating the impression of an ivory tower beset by flame-throwing iconoclasts. Geneticist Michael Eisen of the University of California, Berkeley, decried the editorial (which Drazen toned down four days later) as “one of the most shockingly anti-science things ever written.”

The debate, which revolves around how fast researchers should have to share results from government-funded clinical trials, aired biomedicine’s dirty laundry in public.

“Big data” has produced breakthroughs but it has also put some traditional researchers on the defensive. So have reports showing that most research papers—the bread and butter of careers—yield results that can’t be reproduced. For many genetic studies in particular, the only way to get reliable, replicable results is to contribute to studies that amass huge amounts of data—and thereby surrender the glory of publishing alone.

That’s not an easy sell to those in the medical trenches. Researchers need to publish original articles to advance their careers. In addition, their institutions have an incentive to monetize the information under the 1980 Bayh-Dole Act, which encouraged universities to commercialize discoveries.

Committees for ethical review and data management also get in on the act to protect the rights of patients and scientific integrity. Often, the process just looks like a waste of time, producing red tape and silos that threaten to block the administration’s efforts.

Some kind of incentive may be required to pry the data loose, said John Wilbanks of Sage Bionetworks, a nonprofit that supports open science projects.

“If they’re serious about a moon shot, they have to advocate for the creation of a new system that doesn’t have to fight all these various power structures,” Wilbanks said.

Biden has promised to “break down silos and bring all the cancer fighters together.” But he has not yet offered specifics about how to do that.

“Biden is absolutely right to focus on robust data sharing as a key tenet of cancer research,” says David Shaywitz, chief medical officer of DNAnexus, a biotech company. “Leading academic cancer centers are clamoring for government money to subsidize sequencing projects, but unless this funding is explicitly coupled to actual data sharing, we’ll wind up enlarging existing silos rather than leveling them.”

“Data sharing is in the Zeitgeist—that’s why Biden said it,” added Wilbanks. But to force the transition to easier data handover, it will be necessary for the government to attach strings to its grants, he said.

A lot of scientists disagree, including those in the private sector.

“I don’t think that kind of mandate will work,” said Brad Fenwick, a vice president at the academic publisher Elsevier, most of whose journals require high fees to read. He said such ideas reflect a “lack of appreciation of the difference between disciplines. Everyone sees the world from where they sit.”

Most academic researchers “perceive very little upside in generously and richly sharing their raw data,” Shaywitz acknowledges. “At a minimum, it’s regarded as a thankless hassle.”

“At what point is data so free that you might wake up and find some part of your long, arduously created trial published by someone you’ve never heard of?” says Clifford Hudis, chief advocacy officer at Memorial Sloan-Kettering Cancer Center.

There are scientific risks to sharing data as well. Hidden elements in a dataset (details, for example, about a study population that the original researcher understood but didn’t communicate in a publication) can introduce mistakes when the data are crunched together with other sets, says biostatistician Donald Berry of MD Anderson Cancer Center in Houston.

NIH typically requires data from clinical trials it funds to be published within a year of the completion of the trial. Releasing the data before collection is complete can bias investigators, said Walter Kibbe, director of bioinformatics at the National Cancer Institute.

But to meet the administration’s goal of a personalized approach to disease, patients’ genetic variations will have to be matched against large databases of people with the same diseases. That isn’t possible when it takes months to access the data, especially if the patient is gravely ill with cancer or some other condition.

“Everything we do in the rare disease space totally relies on data sharing,” says Daniel MacArthur, a genomics expert with appointments at Massachusetts General Hospital and the Broad Institute of MIT and Harvard.

Scientists describe some of the barriers as cultural differences that are generational, or entrenched in particular scientific disciplines.

“The clinical trialists like [the New England Journal’s] Drazen are basically in a totally different universe” from data scientists, says cancer genome expert Michael Hoffman of the University of Toronto. “The sky doesn’t fall because people use my research files without listing me as a co-author on their paper.”

“‘Research parasites’ is a rather hilarious thing to say,” adds Jeremy Leipzig, a bioinformatics software developer at The Children’s Hospital of Philadelphia. “A statistician sitting in India might have nothing to do with your study, but have an excellent way of analyzing data that gives us terrific insight into the etiology of a disease.”

As an example of problems with sharing, researchers cite NIH’s Database of Genotypes and Phenotypes, created in 2006 to archive and distribute biomedical data generated in NIH-funded studies. Scientists complain that it takes months to learn whether they can access files, and sometimes the answer is ‘no’ without explanation.

It’s not entirely the government’s fault. Each institution that donates data to the site has its own rules about using them, rules that vary from study to study. Technical problems can make it hard to upload vast files. The NIH crew that manages the database seems underfunded and overworked, said Leipzig.

Despite these factors, fewer than a third of those requesting access to the database have been refused, notes Laura Lyman Rodriguez, chief of the policy office at the NIH’s genome institute.

She and others interviewed for this article see a trend toward more data sharing, and hope Biden’s cancer push will speed it along.

Groups that started work in the 1990s on the federally backed Human Genome Project had a head start dealing with big collaborations and terabytes of data. Now, every biomedical lab needs a computer scientist and a statistician to stay on the cutting edge, said Louis Staudt, who leads the Center for Cancer Genomics at NCI.

The “dataheads” enjoy the support of patient groups, which are pressing scientists to share their results early and often, so the best knowledge can get to doctors faster.

The scientific value of the data, and the push for patient participation, are “shifting the conversation,” Rodriguez said.

Obama used a February 2013 executive action to require public access to data created with public funding. The feds, cancer centers and advocacy groups are building collaborative projects that make it easier for clinicians and researchers to keep abreast of the latest findings.

“Biden’s onto something,” said Leipzig of Children’s Hospital. “He’s right. The more we can do to encourage data sharing, the faster science will progress.”