Epidemiological studies suggest that educational attainment is affected by genetic variants. Results from recent genetic studies allow us to construct a score from a person’s genotypes that captures a portion of this genetic component. Using data from Iceland that include a substantial fraction of the population, we show that individuals with high scores tend to have fewer children, mainly because they have children later in life. Consequently, the average score has been decreasing over time in the population. The rate of decrease is small per generation but marked on an evolutionary timescale. Another important observation is that the association between the score and fertility remains highly significant after adjusting for the educational attainment of the individuals.

Epidemiological and genetic association studies show that genetics plays an important role in the attainment of education. Here, we investigate the effect of this genetic component on the reproductive history of 109,120 Icelanders and the consequent impact on the gene pool over time. We show that an educational attainment polygenic score, POLY_EDU, constructed from results of a recent study is associated with delayed reproduction (P < 10^−100) and fewer children overall. The effect is stronger for women and remains highly significant after adjusting for educational attainment. Based on 129,808 Icelanders born between 1910 and 1990, we find that the average POLY_EDU has been declining at a rate of ∼0.010 standard units per decade, which is substantial on an evolutionary timescale. Most importantly, because POLY_EDU only captures a fraction of the overall underlying genetic component, the latter could be declining at a rate that is two to three times faster.

Epidemiological studies have estimated that the genetic component of educational attainment can account for as much as 40% of the trait variance (1). Recent meta-analyses (2, 3) yielded sequence variants contributing to the underlying genetic component. A negative correlation between educational attainment and number of children has been observed in many populations (4–7). A recent study of ∼20,000 genotyped Americans born between 1931 and 1953 provided direct evidence that the genetic propensity for educational attainment is associated with reduced fertility (8, 9), supporting previously postulated notions (10) that the population average of the genetic propensity for educational attainment and related traits must be declining. Here, using a population-wide sample that is both much larger and covers a substantially greater time span, and with additional auxiliary information, we aim to estimate the change of the genetic propensity of educational attainment in the Icelandic population over the last few decades, starting with an in-depth investigation of the relationship between a measurable genetic component of educational attainment and various aspects of reproduction (11–14).

Results

The number of living Icelanders is ∼317,000 (Fig. S1). A genealogical database of Icelanders (15–17) that is very close to complete for individuals born after 1910 (Materials and Methods) is used in this study. Probands used for the genetic analyses here are limited to those with both parents and all four grandparents listed in the genealogy. For the fertility studies, only children who survived their first year are counted. The first step was to use results from a recent genome-wide association study (GWAS) of educational attainment (3) to determine the per-locus allele-specific weightings of 620,000 markers used to calculate a polygenic score (18, 19), POLY_EDU (see Materials and Methods for details on polygenic score construction). After excluding the Icelandic cohorts in the GWAS to avoid confounding, 278,948 samples from 62 cohorts were used to determine the weightings for POLY_EDU. We computed POLY_EDU for over 150,000 Icelanders who were directly genotyped with chip arrays and imputed for additional sequence variants discovered through whole-genome sequencing of 8,453 Icelanders (20) (Materials and Methods). POLY_EDU was scaled to an SD of 1, hereafter referred to as standard units (SUs). When applied to 46,079 Icelanders with educational attainment data, POLY_EDU was found to explain 3.74% of the trait variance (P < 10^−300). By contrast, the strongest single variant explains only 0.10% of the variance, indicating that educational attainment is a complex trait influenced by many variants in the genome and highlighting the increased power of using the polygenic score for our analyses. Our first analysis focused on 109,120 individuals (58,560 females and 50,560 males) with year of birth (yob) between 1910 and 1975 (Fig. S2). The genealogical database was used to obtain the number of children (NC) and, where applicable, the age at first child (AGFC) and the average age at child birth (AACB) for this set.
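As a rough illustration of the score construction described above, the following minimal sketch computes a weighted sum of allele dosages, rescales it to an SD of 1, and measures the fraction of trait variance it explains. The function names, the simulated dosages, and the weights are hypothetical and are not taken from the study; the actual pipeline (Materials and Methods) involves imputation and further adjustments that are not shown here.

```python
import numpy as np

def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages, rescaled to mean 0 and SD 1."""
    raw = dosages @ weights
    return (raw - raw.mean()) / raw.std()

def variance_explained(score, trait):
    """Squared Pearson correlation between the score and the trait."""
    r = np.corrcoef(score, trait)[0, 1]
    return r ** 2

# Simulated toy data (assumption: not real genotypes or GWAS weights).
rng = np.random.default_rng(0)
n_people, n_markers = 2000, 500
dosages = rng.binomial(2, 0.3, size=(n_people, n_markers)).astype(float)
weights = rng.normal(0.0, 0.01, size=n_markers)

score = polygenic_score(dosages, weights)          # in standard units (SU)
trait = 0.2 * score + rng.normal(size=n_people)    # toy trait partly driven by the score
r2 = variance_explained(score, trait)
print(f"SD of score: {score.std():.3f}, variance explained: {r2:.3%}")
```

Scaling the score to unit SD is what allows all later effects to be quoted per SU.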
The estimated effects of POLY_EDU on these reproductive traits, adjusted for yob and 20 principal components (21), are presented in Table 1 for females and males separately. For females, an increase of 1 SU of POLY_EDU corresponds to an average decrease of 0.084 children [P = 1.0 × 10^−43, calculated with genomic control adjustment (22)], and for those with children AGFC and AACB increased by 0.59 years (P = 5.3 × 10^−155) and 0.46 years (P = 1.0 × 10^−117), respectively. A similar, albeit weaker, pattern of results was observed for males. The finding of a substantially stronger association for AGFC than NC suggests that the effect of POLY_EDU on NC is mainly manifested through delayed reproduction. Specifically, for females with children, the association between AGFC and POLY_EDU remains highly significant (P = 2.9 × 10^−118) after adjusting for NC, whereas the association between NC and POLY_EDU is not significant (P = 0.17) after adjusting for AGFC. This led us to examine the effect of POLY_EDU on NC[x], the number of children a proband had at or after age x, as a function of x. The results are presented in Fig. 1. At x = 14, the estimated effect on NC[x] per SU of POLY_EDU, denoted by eff[x], is −0.084 for females and −0.054 for males. These correspond to results in Table 1 because none of the probands here had children before 14 years of age. As x increases, the estimated effect becomes less negative and is essentially zero at 22 for females and 23 for males. In other words, if children born to mothers at 21 years of age or younger (18% of all children counted here, Fig. S3) and children born to males at 22 or younger (13% of all children counted here) are ignored, there is no correlation between NC and POLY_EDU. As x increases further, eff[x] becomes positive and continues to increase until x = 30 for females and starts to drop slowly to zero after that.
Note that the difference eff[x] − eff[x + 1] corresponds to the estimated effect of POLY_EDU on children born to the proband at precisely age x. Thus, for age x > 30, females with higher POLY_EDU tend to have more children than those with lower POLY_EDU, whereas the reverse is true for x < 30. Having more children after 30 (P < 1 × 10^−15) compensates for having fewer children between 22 and 30 years of age but does not compensate for the reduced number of children at age 21 years and younger. Similar results apply to the males with the age boundaries shifting 1 to 2 years upward. The negative effect of POLY_EDU on NC is less for males than for females, and the difference is mainly accounted for by children born to them at 19 years or younger. The analyses performed using POLY_EDU maximize statistical power, but the effects on fertility traits can also be seen with individual variants. Results for 120 SNPs that are genome-wide significant (P < 5 × 10^−8) in the meta-analysis for educational attainment excluding Icelandic data (Materials and Methods) are given in Table S1 and Figs. S4 and S5. For example, 35 of the 120 SNPs have associations with AGFC of females that are in the same direction and nominally significant (one-sided P < 0.05). The minor allele of one of these SNPs, rs192818565, is associated with reduced education. It is known to tag the H2 haplotype of a common inversion on chromosome 17 that was shown to exhibit characteristics consistent with having been positively selected (23). It has subsequently been shown that H2 is also associated with reduced intracranial volume (24, 25) and neuroticism (26). Combining our male and female data, the minor allele of rs192818565 is significantly associated with more children (P = 5.2 × 10^−3) and having children earlier (P = 2.2 × 10^−3). This is thus a striking case where a variant associated with a phenotype typically regarded as unfavorable could nonetheless also be associated with increased “fitness” in the evolutionary sense.
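The age-threshold analysis above can be sketched in a few lines: count each proband's children born at or after age x to obtain NC[x], then regress NC[x] on the polygenic score to obtain eff[x]. This is a minimal sketch on made-up data. The helper names are hypothetical, and the adjustment for yob and principal components used in the paper is omitted.

```python
import numpy as np

def children_at_or_after(ages_at_births, x):
    """NC[x]: number of children born when the parent was aged x or older."""
    return sum(1 for age in ages_at_births if age >= x)

def effect_per_su(scores, counts):
    """Least-squares slope of child counts on the score (no covariates)."""
    slope, _intercept = np.polyfit(np.asarray(scores, dtype=float),
                                   np.asarray(counts, dtype=float), 1)
    return slope

# Toy probands (assumption: invented values): (score in SU, parental ages at each birth).
probands = [
    (-1.0, [19, 22, 25]),
    (0.0, [24, 28]),
    (1.0, [31, 34]),
]
scores = [s for s, _ in probands]

eff = {
    x: effect_per_su(scores, [children_at_or_after(births, x) for _, births in probands])
    for x in (14, 30)
}
print(eff)  # negative at x = 14, positive at x = 30 in this toy example
```

In this toy data the low-score proband has more children overall but all of them early, so the slope is negative when all births are counted and positive when only births at age 30 or later are counted, mirroring the qualitative pattern in Fig. 1.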

Table 1. Estimated effects of POLY_EDU on fertility traits

Fig. 1. Effect of POLY_EDU on number of children with lower bound for age. Blue, males; red, females; error bars indicate plus/minus 1 SE. Estimated effect calculated by only counting children born to the proband at or after a certain age (the x axis).

Fig. S1. Number of living Icelanders by year.

Fig. S2. Total number of Icelanders and number in our fertility study by birth years.

Fig. S3. Distribution of age of child birth. For our fertility study, this shows the percentage of children born to the parent at a specific age of the (A) father and (B) mother.

Table S1. Associations between 120 genome-wide significant markers and three reproductive traits

Fig. S4. Associations between 120 genome-wide significant SNPs and three reproductive traits for females. x axis: Z_metaedu = z-score from the educational attainment meta-analysis. y axes: z-scores of associations between each of the variants and the three reproductive traits. ±1.645 correspond to one-sided P = 0.05.

Fig. S5. Associations between 120 genome-wide significant SNPs and three reproductive traits for males. Labels as in Fig. S4.

Among the genotyped individuals with yob between 1910 and 1975, information about educational attainment is available for 25,794 females and 19,903 males. For these individuals, the effects of POLY_EDU and educational attainment (EDU) itself on the reproductive traits were estimated individually, through separate regressions, and jointly, through regressions including both as predictors (Table 2). We coded EDU as in a recent meta-analysis (3). Individuals fall into four categories: 10, 13, 15, and 20 years (mean = 14.0 and SD = 3.4 for males and mean = 13.4 and SD = 3.7 for females). The first category corresponds to the mandatory minimum education in Iceland and the last corresponds to a college degree. For females, when analyzed separately, each SU increase of POLY_EDU decreases expected NC by 0.097 (P = 1.7 × 10^−23), whereas each year increase in EDU corresponds to a reduction of 0.045 (P = 5.0 × 10^−56). When analyzed jointly, the estimated effect of POLY_EDU on NC adjusted for EDU reduces to −0.071, a shrinkage that is meaningful but not drastic, and remains highly significant (P = 7.2 × 10^−13). Similar results were observed for AGFC and AACB. Clearly, EDU here is not a complete measure of educational attainment (e.g., it does not include information on postcollege education). With a more comprehensive measure of educational attainment, the estimated effects for POLY_EDU upon adjustment might shrink further, but the changes are unlikely to be drastic. For example, limiting to females with 10 years of education (n = 11,055), the estimated effect of POLY_EDU on NC is −0.079 (P = 5.8 × 10^−6) (Table S2). These results indicate that POLY_EDU has a direct effect on reproduction that is independent of the amount of education that is actually attained.
Crucially, these results indicate that the magnitude of selection acting on the underlying genetic component of educational attainment has to be estimated directly using genotype data and could be severely underestimated if one attempts to deduce it based solely on the observed negative correlation between educational attainment and fertility. For males, the results tend to be similar to those of the females, only weaker. There is one striking exception. High EDU, similar to having a high POLY_EDU, delays reproduction. However, high EDU, unlike high POLY_EDU, does not lead to having fewer children for males (27). Indeed, in the joint analysis, the estimated effect of POLY_EDU is 0.061 fewer children (P = 2.5 × 10^−7), whereas the estimated effect per year of EDU is 0.011 children more. This again highlights that the effect of POLY_EDU on reproduction is not simply manifested through educational attainment.
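The contrast between the separate and joint regressions can be illustrated with a small simulation. The sketch below assumes plain ordinary least squares with hypothetical effect sizes. It is not the authors' model, which also adjusts for yob and principal components. It shows that when attained education only partly mediates the score's effect, the score coefficient shrinks in the joint model but does not vanish.

```python
import numpy as np

def ols_coefficients(y, *predictors):
    """OLS slopes (intercept dropped) of y on the given predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1:]

# Simulated data (assumption: effect sizes chosen for illustration only).
rng = np.random.default_rng(1)
n = 20000
score = rng.normal(size=n)                              # polygenic score in SU
edu = 0.2 * score + rng.normal(size=n)                  # attained education, partly driven by score
nc = -0.07 * score - 0.05 * edu + rng.normal(size=n)    # number of children (centered)

separate = ols_coefficients(nc, score)[0]               # score alone
joint_score, joint_edu = ols_coefficients(nc, score, edu)

print(f"score alone: {separate:.3f}, score adjusted for EDU: {joint_score:.3f}")
```

The gap between the two score coefficients equals the EDU coefficient times the slope of EDU on the score, which is why an incomplete EDU measure can leave part of the mediated effect attributed to the score.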

Table 2. Estimated effects of POLY_EDU and EDU on fertility traits

Fig. S6. Distributions of educational attainment for males and females. The first panel includes all samples studied. The second and third panels show, for males and females, respectively, how distributions of educational attainment change over time.

Table S2. Associations between POLY_EDU and three reproductive traits stratified by four EDU categories

For 129,808 genotyped individuals born between 1910 and 1990, POLY_EDU shows a notable and highly significant decline with yob (−0.0182 SU per decade, P = 5.8 × 10^−35). Average polygenic scores calculated for 10-year bins are displayed in Fig. 2. The relationship between POLY_EDU and yob exhibits nonlinear behavior (i.e., the downward slope seems to be steeper in the earlier years). When a quadratic fit was performed (blue line), the quadratic term of yob is significant (P = 1.7 × 10^−3). A closer examination suggests that the nonlinear behavior mainly reflects a survival effect rather than a birth cohort effect. The samples studied here were collected between 1998 and 2014, with a majority (68%) ascertained before 2006. For 85,520 of the latter, survival data at 2016 are available. The death rate overall is 19.4% (16,610/85,520) and is 54.5% (13,954/25,610) for those with yob before 1940, compared with 4.4% (2,656/59,910) for those with yob ≥ 1940. After adjustment for sex, yob, and age at ascertainment, each SU of POLY_EDU is estimated to increase the odds of survival by a factor of 1.083 (P = 2.5 × 10^−11). The positive effect of POLY_EDU on survival is not surprising because it is significantly associated with many other behavioral and health-related traits in Iceland. For example, POLY_EDU is positively correlated with high-density lipoprotein levels, and negatively correlated with triglyceride levels, body mass index, fasting glucose levels, and amount of smoking (P < 1 × 10^−30 for each of these five quantitative traits; Table S3). Because POLY_EDU has a substantial impact on lifespan, individuals with high polygenic scores were more likely than those with low polygenic scores to be alive when the samples were ascertained. This creates a positive ascertainment bias toward high polygenic scores, particularly among those born before 1940.
This survival effect has a real impact on the difference in POLY_EDU between the young and the old in the population at any given time. However, for the purpose of estimating the change of the average polygenic score over time with respect to birth cohorts, this can be a source of bias. This bias is expected to be small for individuals with yob ≥ 1940. Using the latter, the estimated rate of change of the average polygenic score is −0.0122 SU per decade (P = 2.4 × 10^−7, SE = 0.0024) (red line in Fig. 2). For comparison, we computed two other polygenic scores based on meta-analyses for height and schizophrenia. The polygenic score for height is not significantly associated with yob (P ≥ 0.5). The polygenic score for schizophrenia is estimated to change at a rate of −0.0078 SU per decade (P = 1.1 × 10^−3, SE = 0.0024) for individuals with yob ≥ 1940.
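To make the cohort restriction concrete, here is a minimal sketch that fits a straight line of score against year of birth using only individuals with yob at or above a cutoff and reports the slope per decade. The function name and the noise-free toy values are assumptions for illustration. The toy series is simply constructed to decline faster before 1940, imitating the survival effect described above.

```python
import numpy as np

def trend_per_decade(yob, score, min_yob=1940):
    """Linear slope of score on year of birth, in SU per decade, for yob >= min_yob."""
    yob = np.asarray(yob, dtype=float)
    score = np.asarray(score, dtype=float)
    keep = yob >= min_yob
    centered = yob[keep] - yob[keep].mean()   # centering improves numerical conditioning
    slope_per_year, _intercept = np.polyfit(centered, score[keep], 1)
    return 10.0 * slope_per_year

# Toy birth-cohort averages (assumption: invented, noise-free values).
yob = np.arange(1910, 1991)
score = -0.00122 * (yob - 1940) + np.where(yob < 1940, 0.002 * (1940 - yob), 0.0)

restricted = trend_per_decade(yob, score)                 # yob >= 1940 only
full_range = trend_per_decade(yob, score, min_yob=1910)   # all cohorts
print(f"restricted: {restricted:.4f}, full range: {full_range:.4f}")
```

Including the older cohorts yields a steeper apparent decline than the restricted fit, which is the direction of bias the restriction is meant to avoid.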

Fig. 2. Average educational attainment polygenic score and year of birth (yob). Results for 10-year bins are presented. Error bars indicate plus/minus 1 SE. The blue line is a quadratic fit for the full yob range indicated. The red line is a linear fit applied to individuals with yob ≥1940.

Table S3. Association between POLY_EDU and five quantitative traits

An alternative way to estimate the rate of decline of POLY_EDU is to perform calculations based on the information about reproductive history. If generations were discrete, then the contribution from each parent type (mother/father) to the change of the average polygenic score for the next generation is (eff/2)/(ANC), where eff is the effect of POLY_EDU on number of children and ANC is the average number of children. For the females in Table 1, eff = −0.084 and ANC is 2.84, and the estimated contribution to the change per generation is (−0.084/2)/2.84 = −0.015 SU. Given that the average AACB for these females is 27.5 years, this translates to −0.015/27.5 = −0.00054 SU per year, or −0.0054 SU per decade. For the males in Table 1, eff = −0.054, ANC = 2.73, and average AACB = 30.0, translating to an effect of −0.0033 SU per decade. Combining the contributions from females and males gives a change of −0.0087 SU per decade. This estimate, however, does not take into account that individuals with high POLY_EDU tend to have their children later (Table 1), leading to a slower contribution to the generations that follow. After applying equations derived for incorporating the generation time effect (28, 29) (Materials and Methods), the female and male contributions are estimated to be −0.0065 and −0.0039 SU per decade, respectively, with the sum equal to −0.0104 SU per decade. This estimate is smaller in magnitude than the −0.0122 SU per decade estimate based on the observed decline. However, because the difference is within 1 SE, the two estimates can be considered consistent.
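The discrete-generation arithmetic above can be reproduced directly. The sketch below implements only the unadjusted formula (eff/2)/ANC divided by the average AACB. The generation-time correction from refs. 28 and 29 is not implemented, and the function name is hypothetical.

```python
def contribution_per_decade(eff, anc, aacb):
    """Unadjusted change in mean score (SU per decade) contributed by one parent type.

    eff  : effect of the score on number of children (children per SU)
    anc  : average number of children
    aacb : average age at child birth (years), used as the generation length
    """
    per_generation = (eff / 2) / anc
    return per_generation / aacb * 10

# Inputs quoted in the text for the probands in Table 1.
female = contribution_per_decade(eff=-0.084, anc=2.84, aacb=27.5)
male = contribution_per_decade(eff=-0.054, anc=2.73, aacb=30.0)
total = female + male

print(f"female: {female:.4f}, male: {male:.4f}, total: {total:.4f}")
# prints: female: -0.0054, male: -0.0033, total: -0.0087
```

These figures match the unadjusted values in the text. The adjusted values (−0.0065 and −0.0039 SU per decade) additionally account for later childbearing among high-score individuals and cannot be derived from this formula alone.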

Although there are challenges to getting a precise estimate of the rate of change of the average POLY_EDU value due to nonsampling errors that could be difficult to gauge, with the analyses taken together we consider −0.010 SU per decade to be a reasonable estimate for the period from 1910 to 1990 that is more likely to underestimate than overestimate the true decline. Most importantly, POLY_EDU is just a fraction of the full genetic component of educational attainment, which we denote by POLY_FULL. It is the rate of change of POLY_FULL that is of ultimate interest. Under an assumption that the part of POLY_FULL that is not captured by POLY_EDU behaves in a similar fashion in its impact on reproduction, the rate of change is proportional to the square root of the variance explained (SI Text). Thus, if POLY_FULL is assumed to account for 30% of the variance of EDU, then its estimated rate of change, by extrapolation, is −0.010 × (30/3.74)^(1/2) = −0.028 SU per decade. To test the validity of this method of extrapolation we computed a separate polygenic score for educational attainment, denoted by POLY_−UKB, which was based on the same GWAS results used to construct POLY_EDU, except that the contribution from 111,349 UK Biobank samples was removed (Materials and Methods). When we applied POLY_−UKB to the Icelandic data, it explained 2.52% of the variance of EDU, and the rate of decline estimated based on its effects on reproduction is −0.0085 SU per decade (Materials and Methods). Hence, with the polygenic score strengthening from POLY_−UKB to POLY_EDU, the estimated rate of decline increased by a factor of (0.0104/0.0085) = 1.22, nearly identical to (3.74/2.52)^(1/2) = 1.22, the square root of the variance explained ratio.
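The square-root extrapolation and its consistency check can be verified numerically. This is a minimal sketch using only the figures quoted in the text. The function name is hypothetical, and the 30% figure is the assumption stated above rather than an estimate.

```python
import math

def extrapolate_rate(rate, r2_score, r2_full):
    """Scale a rate of change by the square root of the ratio of variances explained."""
    return rate * math.sqrt(r2_full / r2_score)

# Extrapolation from the measured score (3.74% of variance) to an assumed 30%.
full_rate = extrapolate_rate(rate=-0.010, r2_score=3.74, r2_full=30.0)

# Consistency check: the weaker score without UK Biobank (2.52%) versus the full score (3.74%).
ratio_of_rates = 0.0104 / 0.0085
ratio_of_sqrt_r2 = math.sqrt(3.74 / 2.52)

print(f"extrapolated rate: {full_rate:.3f} SU per decade")
print(f"ratio of rates: {ratio_of_rates:.2f}, sqrt of variance ratio: {ratio_of_sqrt_r2:.2f}")
```

Both ratios round to 1.22, which is the agreement the text cites in support of the extrapolation.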