Craig Venter, a biologist and boss of Human Longevity, a San Diego-based company that is building the world’s largest genomic database, is something of a rebel. In the late 1990s he declared that the international, publicly funded project to sequence the human genome was going about it the wrong way, and he developed a cheaper and quicker method of his own. His latest ruffling of feathers comes from work that predicts what a person will look like from their genetic data.

Human Longevity has assembled 45,000 genomes, mostly from patients who have been in clinical trials, and data on their associated physical attributes. The company uses machine-learning tools to analyse these data and then make predictions about how genetic sequences are tied to physical features. These efforts have improved to the point where the company is able to generate photo-like pictures of people without ever clapping eyes on them.

In a paper this week in Proceedings of the National Academy of Sciences, Dr Venter and his colleagues describe the process, which they call “phenotype-based genomic identification”. The group took an ethnically diverse group of 1,061 people of different ages and sequenced their genomes. They also took high-resolution, three-dimensional images of their faces, and measured their eye and skin colour, age, height and weight. This information was used as a “training set” to develop an algorithm capable of working out what people would look like on the basis of their genes.

Applying this algorithm to unknown genomes, the team was able to generate images that could be matched to real photographs for eight out of ten people. (This fell to a less impressive five out of ten when the test was restricted to those of a single race, which narrows facial differences.) About a year ago, using the same algorithm, the company produced a prediction of what your correspondent looked like at the age of 20 from her genome. The result can be compared below with a photograph of her at that age. Readers can judge for themselves if it is a reasonable likeness.

Critics immediately took to social media to dispute the findings. Jason Piper, a former employee of Human Longevity, argued that “because everyone looks close to the average of their race, everyone looks like their prediction”. One thing in Dr Venter’s favour, however, is that the findings are based on a relatively small group of people, which leaves room for improvement. With machine-learning techniques, larger sets of data generally yield better results; working with tens of thousands of genomes could well improve the prediction rate.
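The objection quoted above can be made concrete with a toy simulation. The sketch below is not the method used in the paper; every name and number in it (five made-up groups, four made-up facial measurements, line-ups of ten) is an assumption chosen purely for illustration. It builds a "predictor" that knows nothing about an individual except which group they belong to, and always outputs that group's average face. It then checks how often the nearest face in a line-up of ten is the right one, first in mixed line-ups and then in line-ups drawn from a single group.

```python
import random

# All constants are illustrative assumptions, not values from the paper.
DIM = 4      # number of made-up facial measurements per face
GROUPS = 5   # number of made-up population groups
POOL = 10    # number of faces in each line-up


def group_mean(g):
    """Average face of group g; groups are deliberately far apart."""
    return [10.0 * g] * DIM


def sample_face(g, rng):
    """An individual face: the group average plus personal variation."""
    return [m + rng.gauss(0.0, 1.0) for m in group_mean(g)]


def sq_dist(a, b):
    """Squared Euclidean distance between two faces."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lineup_accuracy(same_group, trials=2000, seed=0):
    """Share of line-ups in which the group-average 'prediction'
    is closest to the true target (always stored at index 0)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        target_group = rng.randrange(GROUPS)
        # Index 0 is the target; the other nine are distractors.
        groups = [target_group] + [
            target_group if same_group else rng.randrange(GROUPS)
            for _ in range(POOL - 1)
        ]
        faces = [sample_face(g, rng) for g in groups]
        # The predictor uses no individual information at all.
        prediction = group_mean(target_group)
        best = min(range(POOL), key=lambda i: sq_dist(prediction, faces[i]))
        hits += (best == 0)
    return hits / trials


mixed = lineup_accuracy(same_group=False)
single = lineup_accuracy(same_group=True)
print(f"mixed line-ups:        {mixed:.2f}")
print(f"single-group line-ups: {single:.2f}")
```

In this toy setting the single-group figure hovers around chance (one in ten), while the mixed figure sits well above it, even though the predictor has learned nothing about any individual. That is the shape of the critics' argument; it does not show that the paper's algorithm works this way, only why a drop in accuracy within a single group is the number worth watching.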

Creating pictures of people’s faces from their genomes has a number of potential uses, especially in forensic science. It might be possible to reconstruct the face of a perpetrator from any genetic material they have left behind, such as blood or body fluids. That would allow police to “see” the face of suspects in cases of murder, assault and rape. It could also help with identifying unrecognisable victims who have been burned or maimed. Unsolved cases might be reopened if suitable samples were still available.

As Dr Venter is quick to point out, this technology has other implications, among them for privacy. He considers that genomic information must now be treated as personal information, even if it is presented as an anonymised sequence of letters—as is currently the case in some countries. It will, he warns, be possible to construct a face from the limited genetic data that people currently post online, for example, from DNA-testing services such as 23andMe.

This in turn raises the possibility that people may no longer be willing to have their genetic information included in public sequencing efforts, even though such work can help combat diseases. If facial projections can be made from genomes, then someone’s appearance could subsequently be matched to real online photographs. This might mean that people’s genetic sequences, and all their flaws, could be connected to their identity in public.

The connection between genes and faces can work both ways. Just as genomes can be used to build up a picture of faces, so facial features are able to reveal genetic diseases. It is reckoned that 30-40% of genetic diseases cause changes to the shape of the face or skull, allowing, in some cases, experienced doctors to diagnose a condition simply by looking at a patient’s face. So why not train an app to do that?

Face healer

Companies already are. Face2Gene is a smartphone app developed by FDNA, a startup based in Boston co-founded by Moti Shniberg and Lior Wolf. Mr Shniberg’s previous venture was bought by Facebook to develop the photo-tagging feature that identifies people in pictures uploaded to the social-media site. The FDNA app allows a doctor to snap a picture of a patient, upload it to the internet (along with the patient’s height, weight and clinical data) and let the firm’s algorithm produce a list of possible diseases from its online database. The app can access information on 10,000 diseases; facial recognition works for 2,500 of them, so far.

Each diagnosis comes with a probability score that reflects the chances of the app being correct. It also lists any genetic mutations known to cause the disease, which can help with an analysis of a patient’s condition. Dekel Gelbman, FDNA’s chief executive, estimates that the app is being used by doctors and researchers in 130 countries. The patients’ data are stored securely, anonymised and encrypted.

As with Dr Venter’s work, the deeper the pool of data available to facial researchers, the more valuable it becomes. Christoffer Nellaker of the University of Oxford has set up a website called “Minerva & Me”, where both the healthy and those with diseases can upload pictures of themselves and provide consent for their images to be used for studies. He is also setting up a network, the Minerva Consortium, to encourage artificial-intelligence researchers to share their data.

Maximilian Muenke of the National Human Genome Research Institute in Bethesda, Maryland, and Marius Linguraru of the Children’s National Health System in Washington, DC, and their colleagues are trying to broaden things out, too. They have published a series of studies using facial-recognition algorithms that were trained with photos of African, Asian and Latin American patients to identify different genetic diseases with accuracies of more than 90%. In many poor countries, expensive antenatal tests to identify genetic diseases are not available. Babies with Down’s syndrome, for example, are usually identified before birth in Europe and America, but in poor countries many are not diagnosed before they are a year old. The researchers intend to produce an app that will help doctors to identify dozens of the most common syndromes using a smartphone.
