A number of open questions in human evolutionary genetics would become tractable if we were able to directly measure evolutionary fitness. As a step towards this goal, we developed a method to examine whether individual genetic variants, or sets of genetic variants, currently influence viability. The approach consists in testing whether the frequency of an allele varies across ages, accounting for variation in ancestry. We applied it to the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort and to the parents of participants in the UK Biobank. Across the genome, we found only a few common variants with large effects on age-specific mortality: variants tagging the APOE ε4 allele and variants near CHRNA3. These results suggest that when large, even late-onset effects are kept at low frequency by purifying selection. Testing viability effects of sets of genetic variants that jointly influence 1 of 42 traits, we detected a number of strong signals. In participants of the UK Biobank of British ancestry, we found that variants that delay puberty timing are associated with a longer parental life span (P ~ 6.2 × 10⁻⁶ for fathers and P ~ 2.0 × 10⁻³ for mothers), consistent with epidemiological studies. Similarly, variants associated with later age at first birth are associated with a longer maternal life span (P ~ 1.4 × 10⁻³). Signals are also observed for variants influencing cholesterol levels, risk of coronary artery disease (CAD), body mass index, as well as risk of asthma. These signals exhibit consistent effects in the GERA cohort and among participants of the UK Biobank of non-British ancestry. We also found marked differences between males and females, most notably at the CHRNA3 locus, and variants associated with risk of CAD and cholesterol levels. Beyond our findings, the analysis serves as a proof of principle for how upcoming biomedical data sets can be used to learn about selection effects in contemporary humans.

Our global understanding of adaptation in humans is limited to indirect statistical inferences from patterns of genetic variation, which are sensitive to past selection pressures. We introduced a method that allowed us to directly observe ongoing selection in humans by identifying genetic variants that affect survival to a given age (i.e., viability selection). We applied our approach to the GERA cohort and parents of the UK Biobank participants. We found viability effects of variants near the APOE and CHRNA3 genes, which are associated with the risk of Alzheimer disease and smoking behavior, respectively. We also tested for the joint effect of sets of genetic variants that influence quantitative traits. We uncovered an association between longer life span and genetic variants that delay puberty timing and age at first birth. We also detected detrimental effects of higher genetically predicted cholesterol levels, body mass index, risk of coronary artery disease (CAD), and risk of asthma on survival. Some of the observed effects differ between males and females, most notably those at the CHRNA3 gene and variants associated with risk of CAD and cholesterol levels. Beyond this application, our analysis shows how large biomedical data sets can be used to study natural selection in humans.

Funding: Medical Research Council (Unit Programme number MC_UU_12015/2). This grant supported FRD and JRBP. National Institutes of Health (NIH) (grant number R01GM121372). This grant is to MP and JKP. National Institutes of Health (NIH) (grant number R01MH106842). This grant is to JKP. Columbia University. This research was funded in part by a Research Initiative in Science and Engineering grant to MP and JKP. National Institutes of Health (NIH) (grant number R01GM115889). This grant, to Guy Sella, provided partial support for HM. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

As a proof of principle, we applied our approach to 2 recent data sets: to 57,696 individuals of European ancestry from the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort [31,32] and, by proxy [33–35], to the parents of 117,648 individuals of British ancestry surveyed as part of the UK Biobank [36]. We did so first for individual genetic variants and then jointly for sets of variants previously found to influence 1 of 42 polygenic traits [37–40].

In line with Lewontin’s proposal to track age-specific mortality and fertility of hundreds of thousands of individuals [28], we sought a more direct and, in principle, comprehensive way to study adaptation in humans, focusing on current viability selection. Similar to the approach that Allison took in comparing frequencies of the sickle cell allele in newborns and adults living in malarial environments [29], we aimed to directly observe the effects of genotypes on survival by taking advantage of the recent availability of genotypes from large cohorts of individuals of different ages. Specifically, we tested for differences in the frequency of an allele across individuals of different ages, controlling for changes in ancestry and possible batch effects. This approach resembles a genome-wide association study (GWAS) for longevity yet does not focus on an end point (e.g., survival to an old age) but on any shift in allele frequencies with age. Thus, it allows the identification of possible nonmonotonic effects at different ages or sex differences. Any genetic variant that affects survival by definition has a fitness cost, even if the cost is too small to be effectively selected against (depending on the effective population size, the age structure of the population, and the age at which the variant exerts its effects [30]). Of course, a genetic variant can influence fitness without influencing survival, through effects on reproduction or inclusive fitness. Our approach therefore considers only 1 of the components of fitness that are likely important for human adaptation.
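The core idea of comparing allele frequencies across ages can be illustrated with a minimal sketch. The Python code below is not the model used in this study: it omits the controls for ancestry and batch effects described above, and it simply compares a null model with one shared allele frequency against an alternative with a separate frequency in each age bin, using a binomial likelihood-ratio statistic. The function names (`binom_loglik`, `age_frequency_lrt`) and the input layout are assumptions made for illustration.

```python
import math


def binom_loglik(alt_count, total_alleles, freq):
    """Binomial log-likelihood of alt_count out of total_alleles at frequency freq.

    The combinatorial constant is dropped because it cancels in the ratio.
    Terms with a zero count are skipped to avoid evaluating log(0).
    """
    loglik = 0.0
    if alt_count > 0:
        loglik += alt_count * math.log(freq)
    ref_count = total_alleles - alt_count
    if ref_count > 0:
        loglik += ref_count * math.log(1.0 - freq)
    return loglik


def age_frequency_lrt(alt_counts, allele_totals):
    """Likelihood-ratio statistic for allele frequency differences across age bins.

    alt_counts[i]    : copies of the tested allele observed in age bin i
    allele_totals[i] : total alleles sampled in age bin i (assumed > 0)

    Returns (statistic, degrees_of_freedom). Under the null of a constant
    frequency, the statistic is approximately chi-square distributed.
    Simplification: no ancestry or batch covariates are included here.
    """
    pooled_freq = sum(alt_counts) / sum(allele_totals)
    loglik_null = sum(
        binom_loglik(a, n, pooled_freq) for a, n in zip(alt_counts, allele_totals)
    )
    loglik_alt = sum(
        binom_loglik(a, n, a / n) for a, n in zip(alt_counts, allele_totals)
    )
    statistic = 2.0 * (loglik_alt - loglik_null)
    degrees_of_freedom = len(alt_counts) - 1
    return statistic, degrees_of_freedom


if __name__ == "__main__":
    # Hypothetical counts for three age bins, youngest to oldest.
    stat, df = age_frequency_lrt([300, 250, 150], [1000, 1000, 1000])
    print(f"LRT statistic = {stat:.2f} on {df} degrees of freedom")
```

Because the alternative model fits a free frequency in each bin rather than a trend, a sketch of this kind is sensitive to any shift in frequency with age, including nonmonotonic ones, which is the property emphasized above.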

Because these approaches are designed (either explicitly or implicitly) to be sensitive to a particular mode of adaptation, they provide a partial and potentially biased picture of what variants in the genome are under selection. In particular, most have much higher power to detect adaptations that involve strongly beneficial alleles that were rare in the population when first favored and will tend to miss selection on standing variation or adaptation involving many loci with small beneficial effects (e.g., [24–27]). Moreover, even where these methods identify a beneficial allele, they are not informative about the components of fitness that are affected or about possible fitness trade-offs between sexes or across ages.

The statistical inferences rely on patterns of genetic variation in present-day samples (or, very recently, in ancient samples [4]) to identify regions of the genome that appear to carry the footprint of positive selection [2]. For example, a commonly used class of methods asks whether rates of nonsynonymous substitutions between humans and other species are higher than expected from putatively neutral sites in order to detect recurrent changes to the same protein [5]. Another class instead relies on polymorphism data and looks for various footprints of adaptation involving single changes of large effect [6]. These approaches detect adaptation over different timescales and, likely as a result, suggest quite distinct pictures of human adaptation [1]. For example, approaches that are sensitive to selective pressures acting over millions of years have identified individual chemosensory and immune-related genes (e.g., [7]). In contrast, approaches that are most sensitive to selective pressures active over thousands or tens of thousands of years have revealed strong selective pressures on individual genes that influence human pigmentation (e.g., [8–10]), diet [11–13], as well as sets of variants that shape height [14–16]. More recently still, studies of contemporary populations have suggested that natural selection has influenced life-history traits like age at first childbirth as well as educational attainment over the course of the last century [17–23].

A number of central questions in evolutionary genetics remain open, in particular for humans. Which types of variants affect fitness? Which components of fitness do they affect? What is the relative importance of directional and balancing selection in shaping genetic variation? Part of the difficulty is that our understanding of selection pressures acting on the human genome is based either on experiments in fairly distantly related species or cell lines or on indirect statistical inferences from patterns of genetic variation [1–3].

Results