Genetic data on half a million Brits reveal ongoing evolution and Neanderthal legacy

Neanderthals are still among us, Janet Kelso realized 8 years ago. She had helped make the momentous discovery that Neanderthals repeatedly mated with the ancestors of modern humans—a finding that implies people outside of Africa still carry Neanderthal DNA today. Ever since then, Kelso has wondered exactly what modern humans got from those prehistoric liaisons—beyond babies. How do traces of the Neanderthal within shape the appearance, health, or personalities of living people?

For years, evolutionary biologists couldn't get their rubber-gloved hands on enough people's genomes to detect the relatively rare bits of Neanderthal DNA, much less to see whether or how our extinct cousins' genetic legacy might influence disease or physical traits.

But a few years ago, Kelso and her colleagues at the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany, turned to a new tool—the UK Biobank (UKB), a large database that holds genetic and health records for half a million British volunteers. The researchers analyzed data from 112,338 of those Britons—enough that "we could actually look and say: ‘We see a Neanderthal version of the gene and we can measure its effect on phenotype in many people—how often they get sunburned, what color their hair is, and what color their eyes are,’" Kelso says. They found Neanderthal variants that boost the odds that a person smokes, is an evening person rather than a morning person, and is prone to sunburn and depression.

Kelso is one of many researchers who are turning troves of genetic and medical data on living people into windows on human evolution. In addition to unearthing archaic DNA, the studies are pinpointing genes that natural selection may now be winnowing out of the gene pool and other genes—for example those linked to fertility—that it may be favoring. Among the most fruitful of the data sources is the UKB, which makes its data accessible to researchers, no matter where they are and what their field. Its giant database is "a magical new resource that [will] help us answer a whole bunch of hard questions we're struggling with now because all of the data has been under lock and key," says population geneticist Jeremy Berg, a postdoc at Columbia University. "It is a step beyond other databases."

For the UKB architects, who designed it for biomedical research, the evolutionary discoveries are an unexpected bonus. "No one was thinking about Neanderthal traits when we designed the protocol," says molecular epidemiologist Rory Collins of the University of Oxford in the United Kingdom, who is principal investigator of the UKB. "The experiment [is] working well beyond people's expectations."

Neanderthals sneaked into the UKB in 2013, when Harvard University population geneticist David Reich was in Oxford to give a talk. His host, Oxford geneticist Peter Donnelly, was overseeing the design of chips to identify genes of interest in blood samples like those in the UKB. Donnelly asked Reich whether he'd like to add Neanderthal variants to a custom chip used to genotype the UKB participants; that would allow Reich and others to fish for rare Neanderthal variants in half a million people. "David was very enthusiastic," Donnelly recalls.

Soon after, Reich and his postdoc, Sriram Sankararaman, emailed Donnelly a wish list of variants to add to the chip: 6000 relatively rare alleles likely to come from Neanderthals. Their calculations suggested the UKB was big enough to include enough carriers of these variants so researchers could probe the function of the genes. "Imagine 1% of the population has a Neanderthal variant," says Sankararaman, now a computational geneticist at the University of California (UC), Los Angeles. "If you're looking at half a million people, you're looking at enough copies of that variant in enough individuals [5000] so you can detect subtle effects."
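Sankararaman's back-of-the-envelope estimate can be checked in a few lines. This is only an illustrative sketch: the function name `expected_carriers` is hypothetical, and the 1% carrier fraction is the round number from his quote, not a measured frequency.

```python
def expected_carriers(cohort_size: int, carrier_fraction: float) -> int:
    """Expected number of people carrying a variant, rounded to the nearest person."""
    return round(cohort_size * carrier_fraction)


# Full UKB cohort, using the 1% figure from the quote (an assumed round number).
full_cohort = expected_carriers(500_000, 0.01)

# The 112,338-person subset analyzed by Kelso's team, under the same assumption.
subset = expected_carriers(112_338, 0.01)

print(full_cohort, subset)
```

Under that assumption the full cohort yields about 5000 carriers, matching the bracketed figure in the quote, while the smaller subset yields roughly 1100.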

At the same time, computational biologist Tony Capra at Vanderbilt University in Nashville had the same bright idea to search for Neanderthal DNA in a large database. He used proprietary electronic records of 28,000 Americans. His team was the first to publish, reporting Neanderthal DNA variants that raise the risk of depression, skin lesions, blood clots, and other disorders in people today. Inspired by Capra's study, Kelso jumped in, becoming the first to use UKB data to publish Neanderthal gene variants in living people. Her results suggest that although some Neanderthal gene variants may have been optimal for active lives outdoors in prehistoric Europe, they may be problematic for people now, who live mostly indoors in artificial light and get less exercise.

Groups led by Kelso and Sankararaman are now looking for links between Neanderthal DNA and traits in genotyped data from 500,000 people—the total UKB data set, which was released in July 2017. Already, they are learning that Neanderthal alleles help cause baldness and mental illness and boost certain immune functions, Sankararaman says. Meanwhile, another team has found variants that help explain why modern humans' heads are round, in contrast to the elongated, football-like shape of Neanderthal skulls. Those researchers plan to combine forthcoming MRI brain scans of 100,000 UKB participants with genetic data to probe the genetic basis of brain differences between us and our extinct cousins.

Capra says when it comes to scanning and understanding DNA from Neanderthals, the UKB cohort offers even more analytical power than the medical databases he used, because it covers "a broader range of psychiatric and lifestyle traits." Those rich data have also made the UKB a hunting ground for clues to evolutionary changes that have shaped people's genomes in the past few generations—and may even be doing so today.

A few years ago, Molly Przeworski of Columbia University and Joe Pickrell of the New York Genome Center in New York City met for lunch near Columbia's campus. Talk turned to aging and Alzheimer's disease. Pickrell had been writing a blog, where he had discussed studies showing that between the ages of 70 and 85, carriers of the ApoE4 allele, which boosts the risk of Alzheimer's and cardiovascular disease, died at about twice the rate of noncarriers. The pair wondered whether other gene variants affect survival so dramatically—and whether natural selection is weeding them out.

When it comes to natural selection in humans, most studies have only been able to detect dramatic cases thousands or millions of years ago in genes of known function. Now, Pickrell and Przeworski wondered whether they could detect genetic variants that affect survival today—and whether natural selection in recent generations has been weeding out harmful ones or favoring beneficial ones.

To do this, they realized they'd need data on DNA as well as on traits like participants' age at death. For statistical confidence, they'd need a giant sample size (at least 100,000) to detect how the frequency of common alleles varied in people of different ages. Databases like the UKB were the answer. "We suddenly realized that some of these databases were large enough to let us study selection in contemporary humans," Przeworski says.

They soon got access to genetic and health data on 57,696 people in the Resource for Genetic Epidemiology Research on Aging database at Kaiser Permanente in Oakland, California, and 117,648 individuals in the UKB's 2015 data release. They sorted participants into 5-year age intervals, and looked at the frequency of many alleles, including ApoE4, in each age group, as well as how the variants correlated with 42 traits potentially associated with early death or long life, such as cardiovascular disease, cholesterol levels, asthma, age at puberty, and menopause.
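The binning step described above can be sketched roughly as follows. This is a toy illustration, not the researchers' actual pipeline: the function name `allele_frequency_by_age_bin`, the record layout, and the six made-up participants are all assumptions, and the published analysis involved far more statistical care than a raw frequency per bin.

```python
from collections import defaultdict


def allele_frequency_by_age_bin(records, bin_width=5):
    """Return {bin_start_age: allele frequency} for (age, allele_copies) records.

    allele_copies is 0, 1, or 2: how many copies of the allele a person carries.
    Each person contributes two chromosomes to the denominator.
    """
    copies = defaultdict(int)
    people = defaultdict(int)
    for age, n_copies in records:
        bin_start = (age // bin_width) * bin_width  # e.g. 72 -> 70 for 5-year bins
        copies[bin_start] += n_copies
        people[bin_start] += 1
    return {b: copies[b] / (2 * people[b]) for b in sorted(people)}


# Hypothetical toy data: (age in years, copies of the allele of interest).
toy_records = [(61, 1), (63, 1), (72, 1), (74, 0), (81, 0), (83, 0)]
frequencies = allele_frequency_by_age_bin(toy_records)
print(frequencies)
```

In this invented data set the allele's frequency falls from one age bin to the next, which is the kind of pattern that would flag a variant as potentially harmful to survival. A real analysis would also need to separate such an effect from differences in ancestry or birth cohort.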

Nearly all the variants they examined persisted at the same frequency even into old age, suggesting they had no large effect on survival. That implies natural selection has efficiently weeded out harmful variants, even if they act only in old age—perhaps, Przeworski speculates, because the variants curb older men's fecundity. Or perhaps the hypothesized benefit that healthy grandmothers confer on grandchildren was at work.

The researchers did find two gene variants that became markedly rarer at older ages, suggesting they were harmful. One was ApoE4: As expected, fewer carriers, especially women, lived past age 80. Also, fewer men with a variant of the CHRNA3 gene that makes it harder to quit smoking survived past the age of 75 than did men without the variant.

The researchers concluded that natural selection has not yet had time to eliminate these two alleles, perhaps because changes in the environment and human behavior only recently made them deadly. For example, the CHRNA3 allele wouldn't have affected survival until many men were smoking. And women who were more active in the past might have been less vulnerable to the cardiovascular diseases caused by ApoE4, Przeworski speculates.

The researchers spotted another intriguing pattern. Genetic variants that lead to early puberty also became rarer in older age groups. Natural selection may have preserved those variants even though they shorten life span because they also boosted fertility.

A long life, though, is much less important to evolution than fertility. When it comes to the game of evolution, in fact, the person who has the most kids wins by passing on the most genes. With the advent of birth control, people in industrial societies have more control than ever over their own fertility—but new studies zeroing in on the genes underlying fertility show the forces of selection may still be at work.

Multiple studies have suggested that when food sources became more reliable in industrialized societies, women began to mature faster, weigh more, give birth to their first child earlier, and enter menopause later—all traits possibly linked to having more babies. But researchers have been unable to tie those trends to underlying genes to get direct evidence of natural selection. Quantitative geneticist Peter Visscher and his colleagues at the University of Queensland in Brisbane, Australia, realized they could use the UKB to see firsthand which gene variants underlie those traits in people today, and whether they are really linked to fertility.

They searched the UKB's full cohort for people who had the most babies to see what traits they share, and what genes correlate with those traits. They documented the number of live births for women over age 45 and men over age 55. Then, they analyzed traits in women and men that might have influenced fertility, such as age of first birth, age of menopause, height, weight, body mass, blood pressure, and education. They found 23 traits in women and 21 in men linked to having more children. Not surprisingly, mothers who gave birth early and had late menopause—and therefore had a longer reproductive span—were more fertile. So were women who were heavier and shorter, perhaps because shorter bodies are more energy efficient, leaving a bigger reserve for pregnancy and nursing.

Visscher and his colleagues then set out to identify the genetic basis of these fertility-linked traits. They analyzed data from 157,807 of the women and 115,902 of the men. As predicted, they found that the most fertile women had higher frequencies of alleles that tend to make them shorter and heavier. In men, greater fertility was associated with more alleles that contribute to a higher body mass index and hand-grip strength. That suggests men with genes that make them stronger and bulkier have more kids than sedentary types, whether because of female choice, some health-related reason, or the men's own preference.

Not all traits linked to fertility are physical or likely to have a big genetic component: Among women who had their first child later in life, those who had more education and did better on an intelligence test had more babies. This may be because better-educated couples tend to be wealthier and can afford more children.

But the fact that genes linked to traits thought to increase fertility are indeed more common in fertile people backs up the idea of recent selection on our genomes, even as both the environment and humans' preferences for mates and families are changing. "The UK Biobank allows us to show that natural selection not only took place in the past, but it's still ongoing," Visscher says.

Teasing out natural selection from other factors shaping genes can be tricky, however, especially when multiple genes work together to influence complex traits, such as height. About 5000 gene variants simultaneously influence a person's height, some boosting it, some reducing it, says Jian Yang, a statistical geneticist at the University of Queensland. The UKB's huge database allows researchers to find new variants and explore their impact and origins.

Using other databases, researchers had found that the number of gene variants that contribute to tallness in Europeans increased on a cline from south to north. Many researchers, including Berg, had concluded that northern Europeans had inherited those variants from an ancient migration: that of the Yamnaya herders who moved from the Eurasian steppe to central Europe about 5000 years ago. Berg and others suggested natural selection had favored tallness in the Yamnaya or their ancestors, and ancient DNA reveals that the Yamnaya were tall.

But now, with UKB data, population geneticist Graham Coop of UC Davis and his colleagues, including Berg, are challenging that finding. In a bioRxiv preprint posted in June 2018, they analyzed genetic and height data on 500,000 people from the 2017 UKB data release. With so many people from similar backgrounds, the researchers could identify more height alleles, as well as note differences in diet, disease, and the environment. They found that northerners had no more tall variants than southerners.

"It's true people in northern Europe are taller on average, but there is no evidence this has anything to do with natural selection," Berg says. He speculates that northerners' height might be an environmental effect, perhaps from a diet richer in protein, or from fewer childhood or prenatal illnesses.

Although UKB data cast doubt on natural selection's role in that case, they do suggest that evolution has favored genes for shortness in pygmy populations on the island of Flores in Indonesia. Visscher and colleagues scanned the DNA of Flores people for genes the UKB had linked to short stature. They found that Flores pygmies carry more such gene variants than their closest relatives in New Guinea and East Asia, suggesting evolution favored genes for shortness on the island. All these studies have generated "huge buzz among evolutionary biologists about how biobanks can provide very deep information about the genetics of different populations and their evolution," Kelso says.

She hopes to work with researchers designing databases in Africa and Asia to identify archaic DNA in those populations. Thanks to the success of the Neanderthal work, many researchers are eager for data from Melanesians, because they have inherited traces of DNA from Denisovans—the mysterious cousins of Neanderthals who lived in Siberia more than 50,000 years ago. "That would be amazing, to get Denisovan DNA from more living people [in biobanks]. That's our dream," Kelso says.