Several tools exist for interpreting these readouts, including GATK, VarDict, and FreeBayes. However, these programs typically rely on simpler statistical and machine-learning approaches, identifying mutations by attempting to rule out read errors.
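
To make the idea of ruling out read errors concrete, here is a minimal Python sketch of a textbook binomial model. It is not the method used by GATK, VarDict, or FreeBayes, which are considerably more sophisticated. The function names, the 1 percent per-read error rate, and the three-genotype setup are all assumptions chosen for illustration.

```python
from math import comb


def genotype_likelihoods(ref_reads, alt_reads, error_rate=0.01):
    """Binomial likelihood of the observed read counts under three diploid genotypes.

    error_rate is an assumed per-read sequencing error rate, not a value
    taken from any real tool.
    """
    n = ref_reads + alt_reads
    # Probability that a single read shows the alternate base under each genotype.
    p_alt = {
        "hom_ref": error_rate,        # alternate reads arise only from errors
        "het": 0.5,                   # half the reads come from the alternate allele
        "hom_alt": 1.0 - error_rate,  # reference reads arise only from errors
    }
    return {
        genotype: comb(n, alt_reads) * p ** alt_reads * (1.0 - p) ** ref_reads
        for genotype, p in p_alt.items()
    }


def call_genotype(ref_reads, alt_reads, error_rate=0.01):
    """Return the genotype whose likelihood is highest for the observed counts."""
    likelihoods = genotype_likelihoods(ref_reads, alt_reads, error_rate)
    return max(likelihoods, key=likelihoods.get)
```

Under this toy model, a single odd read among 30 is best explained as a sequencing error, while a roughly even split between the two bases points to a real heterozygous variant. Real data complicates this picture with uneven coverage, mapping errors, and repetitive regions.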

“One of the challenges is in difficult parts of the genome, where each of the [tools] has strengths and weaknesses,” says Brad Chapman, a research scientist at Harvard’s School of Public Health who tested an early version of DeepVariant. “These difficult regions are increasingly important for clinical sequencing, and it’s important to have multiple methods.”

DeepVariant was developed by researchers from the Google Brain team, a group that develops and applies AI techniques, and from Verily, another Alphabet subsidiary, which focuses on the life sciences.

The team collected millions of high-throughput reads and fully sequenced genomes from the Genome in a Bottle (GIAB) project, a public-private effort to promote genomic sequencing tools and techniques. They fed the data to a deep-learning system and painstakingly tweaked the parameters of the model until it learned to interpret sequenced data with a high level of accuracy.

Last year, DeepVariant won first place in the PrecisionFDA Truth Challenge, a contest run by the FDA to promote more accurate genetic sequencing.

“The success of DeepVariant is important because it demonstrates that in genomics, deep learning can be used to automatically train systems that perform better than complicated hand-engineered systems,” says Brendan Frey, CEO of Deep Genomics.

The release of DeepVariant is the latest sign that machine learning may be poised to boost progress in genomics.

Deep Genomics is one of several companies trying to use AI approaches such as deep learning to tease out genetic causes of diseases and to identify potential drug therapies (see “An AI-Driven Genomics Company Is Turning to Drugs”).