Blockchains Won’t Fix the Problem with Genomics

If you’ve been waiting to get your genome sequenced, will a virtual token really make you change your mind?

This is shaping up to be the year of DNA for cryptocurrency. One startup after another is offering to pay you in bitcoin-like tokens for sharing your genetic data. The newest of these involves genetics researcher and publicity magnet George Church, who is part of a startup, Nebula Genomics, that will sequence your genome for around $1,000 and connect you to groups that want to buy access to it. You’d be paid in Nebula’s virtual tokens, and a record of which companies or researchers have accessed your genomic data would be stored on a blockchain.

But it’s hard to see how using blockchains and cryptocurrencies will substantially increase demand for genome sequencing. That’s a vexing problem because too few genomes have been sequenced and analyzed to generate as many meaningful insights as scientists had hoped. As recently as 2016, only 20 percent of whole-genome sequences were from non-white people, and without access to the full variation of human DNA, researchers don’t know enough about which variants actually cause or influence disease.

Another issue is that genomic data is nearly useless to researchers by itself. Its utility comes when it is paired with data about your phenotype, meaning the traits and conditions you have. That information is found in places like your electronic health record or a medical survey. So whether these startups record transactions on a blockchain matters less than whether they will be able to gather and manage phenotypic data. Phenotypic data is thornier to handle than DNA data: it is messier and comes in many different types.

Nebula wasn’t the first company to hit upon the idea of using a blockchain as the basis for a genomic data marketplace, although it’s apparently the only one that will also sequence your genome for you. EncrypGen, Luna DNA, Longenesis, and Zenome also use blockchains to record how researchers and companies interact with your DNA data. But in these cases you bring a genetic readout that you’ve obtained from other companies, including those like 23andMe or Ancestry.com that don’t sequence full genomes.

Blockchains have strong privacy advantages for genomic data. A blockchain is essentially a public record that, in this case, would track every time your data is used. That record is nearly hack-proof because it is replicated in many different places within a network and anyone can access it. Drug companies or academic researchers couldn’t use your data without your knowledge. This is in stark contrast to the way much of your data is treated right now, says Polina Mamoshina, senior research scientist at Insilico Medicine, which is partnering with bitcoin miner BitFury to form Longenesis. “There is a hidden data market right now, and a lot of companies are selling our data, which is super valuable, without us knowing about it,” says Mamoshina.
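To make the idea of a tamper-evident access record concrete, here is a minimal sketch in Python. It is a simplification under stated assumptions, not how Nebula, Longenesis, or any other company mentioned here actually implements its system: it keeps a single in-memory list, with none of the replication across a network that a real blockchain relies on, and the names `append_access`, `verify`, and the example accessors are hypothetical. It shows only the chaining step, in which each record stores a hash of the record before it, so quietly rewriting an old entry breaks every later link.

```python
import hashlib
import json


def entry_hash(entry: dict) -> str:
    # Hash a record deterministically by serializing it with sorted keys.
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_access(log: list, accessor: str, genome_id: str) -> None:
    # Each new record points at the hash of the previous one (hypothetical schema).
    prev = log[-1]["hash"] if log else "0" * 64
    body = {"accessor": accessor, "genome_id": genome_id, "prev_hash": prev}
    log.append({**body, "hash": entry_hash(body)})


def verify(log: list) -> bool:
    # Recompute every hash and check that each record links to the one before it.
    prev = "0" * 64
    for rec in log:
        body = {k: rec[k] for k in ("accessor", "genome_id", "prev_hash")}
        if rec["prev_hash"] != prev or rec["hash"] != entry_hash(body):
            return False
        prev = rec["hash"]
    return True
```

In this sketch, appending two access records and calling `verify` returns `True`; changing the accessor name in the first record afterward makes `verify` return `False`. A real network adds the part this toy omits: many independent copies of the log, so that no single party can rewrite and re-hash the whole chain unnoticed.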


As an additional privacy protection, Nebula will also not allow anyone to download an individual’s genetic data. “A good analogy is they get to rent it; they get to run code on top of it,” says one of Nebula’s founders, Kamal Obbad. “All they get in return is the results.” He says that because the rented data would be fully anonymized, it would be impossible for, say, an insurance company to deny you coverage based on information they learn from your genome.

But using a blockchain and reaping its privacy benefits means paying people in a cryptocurrency. Paying them in dollars would require some central entity to serve as middleman and make the payments. Even aside from talk of a cryptocurrency bubble, Nebula’s tokens will only be useful within Nebula’s network, as researchers or companies who want to query Nebula’s database of genomic data pay for the access. In theory, this allows people who share their data to capture a fair portion of the value it helps to create. But if there aren’t enough buyers interested in using Nebula’s particular network, the tokens will be worth little or nothing. Meanwhile, you can instead get cash for sharing your genomic data with companies like DNASimple or Genos.

“It’s hard to say whether people are going to do this,” Mamoshina acknowledges. That’s why Longenesis also plans to store other data that might be useful to researchers even in the absence of DNA, including such phenotypic information as blood tests, data from wearables, and even selfies. Nebula also plans to collect not just genomes but also health surveys, electronic health records, and data from wearables, Obbad says.

These varying types of data are tricky to work with, though. “Integrating different kinds of data is a non-trivial challenge in machine learning and health care, and that’s true no matter how much data you have,” says Marzyeh Ghassemi, a visiting researcher at Google Verily and a postdoctoral researcher at MIT. Obbad agrees: “It’s a big challenge that hasn’t been solved yet.” In other words, the ultimate utility of these genetic marketplaces will probably depend on this open question of how well they’ll deal with information other than genomes. Genomes get the headlines but are comparatively easy to handle.

All the while, we still haven’t sequenced a diverse enough swath of genomes to understand what they mean for disease and human health. Blockchain genomic data marketplaces tout their ability to recruit new people to share genomic data, but they may just reach the same pool of people who are already interested in the subject. Ghassemi doubts that the use of a blockchain will encourage more diverse participation that better captures the actual variation in human genomes. “This is a hard problem,” she says, “and I don’t think it’s one that’s going to be addressed by purely technical solutions.”