Genomes are so five minutes ago. Personalized medicine is all about phenomes now.

OK, that’s an exaggeration. But plenty of genetic disorders do result in distinctive facial phenotypes (Down syndrome is probably the best-known example). Many of these disorders are quite rare and thus not easily recognized by clinicians. This lack of familiarity can force patients (and their parents) to endure a long and traumatic diagnostic odyssey before anyone figures out what ails them. While they may be uncommon individually, in aggregate these rare disorders are not that rare: they affect eight percent of the population.

FDNA is a genomics/AI company that aims to “capture, structure and analyze complex human physiological data to produce actionable genomic insights.” It has built a facial-image-analysis framework, called DeepGestalt, that can diagnose genetic conditions from facial images more accurately than doctors can. The results are published in Nature Medicine.

To train its algorithm, the company relied on a data set of 500,000 facial images of 10,000 subjects culled from the Internet. When this data set was compiled back in 2014, it was larger than any known similar data set except for Facebook’s privately held one.

They then tested it by seeing how well it could identify faces of people with one particular genetic disorder when they were mixed in with faces of people with several other disorders—a situation a clinician or genetic counselor could very feasibly find herself in. They did two tests of this type, one with Cornelia de Lange syndrome and the other with Angelman syndrome. Both are developmental disorders with cognitive and motor impairments. In both cases, DeepGestalt achieved accuracies above 90 percent—better than experts, who were closer to 70 or 75 percent accurate.

Another test examined whether DeepGestalt could tell apart people who have the same disorder but different genotypes. It was shown images from a small pool of people with Noonan syndrome, which has a variable impact depending on which of five different genes is mutated. It achieved only 64 percent accuracy this time, but that’s well above the 20 percent expected by chance, and it is notable given that “two dysmorphologists concluded that facial phenotype alone was insufficient to predict the genotype.”
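For a sense of where that 20 percent comes from, here is a minimal Python sketch of the arithmetic. It assumes the five genotypes were equally represented in the test pool, which the article does not state; the function name is made up for illustration.

```python
def chance_accuracy(num_classes: int) -> float:
    """Expected accuracy of uniform random guessing.

    Assumption: every class is equally likely in the test set.
    """
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return 1.0 / num_classes


# Five candidate genes, per the Noonan syndrome test described above.
baseline = chance_accuracy(5)

# Accuracy reported for DeepGestalt on that test.
observed = 0.64

# How many times better than guessing the reported figure is.
lift = observed / baseline

print(f"chance: {baseline:.0%}, observed: {observed:.0%}, lift: {lift:.1f}x")
```

Under that equal-representation assumption, the reported accuracy works out to a bit more than three times what guessing would achieve.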

The last test asked it to identify the disorder in hundreds of facial images spanning 216 different disorders. It was 90 percent accurate.

The algorithm works by cropping the face into multiple regions, assessing how much each region corresponds to each syndrome, and then aggregating the regions to see which syndrome is the best fit. Hence Gestalt. But the authors note that “DeepGestalt, like many artificial intelligence systems, cannot explicitly explain its predictions and provides no information about which facial features drove the classification.”
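The region-then-aggregate idea can be sketched in a few lines of Python. This is an illustration only, not FDNA's implementation: the article does not say how regions are combined, so plain averaging is an assumption, and the region labels, scores, and the `aggregate_regions` helper are all made up.

```python
from typing import Dict, List, Tuple

# Hypothetical type: one face region's scores, syndrome name -> score in [0, 1].
RegionScores = Dict[str, float]


def aggregate_regions(region_scores: List[RegionScores]) -> List[Tuple[str, float]]:
    """Average each syndrome's score over all regions and rank best first.

    Averaging is an assumed aggregation rule, chosen for simplicity.
    """
    if not region_scores:
        raise ValueError("need at least one region")
    totals: Dict[str, float] = {}
    for scores in region_scores:
        for syndrome, score in scores.items():
            totals[syndrome] = totals.get(syndrome, 0.0) + score
    count = len(region_scores)
    averaged = {syndrome: total / count for syndrome, total in totals.items()}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


# Made-up scores for three crops of a single face.
regions = [
    {"Cornelia de Lange": 0.6, "Angelman": 0.3, "Noonan": 0.1},  # whole face
    {"Cornelia de Lange": 0.5, "Angelman": 0.2, "Noonan": 0.3},  # eye region
    {"Cornelia de Lange": 0.2, "Angelman": 0.5, "Noonan": 0.3},  # mouth region
]

ranking = aggregate_regions(regions)
best_syndrome = ranking[0][0]
print(best_syndrome, ranking)
```

Note that one region (the mouth crop here) can disagree with the overall call; the final ranking reflects the face as a whole rather than any single feature.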

It’s a black box; it can surpass experts in making a genetic diagnosis based on phenotype, but it can’t teach them how to do what it does.

Nature Medicine, 2019. DOI: 10.1038/s41591-018-0279-0