Heritability is a population parameter that can be estimated using different experimental designs. Until technological advances facilitated the direct measurement of DNA variants, the heritabilities of various human traits were mostly estimated by comparing the resemblance of twins, adoptees, and other pairs of relatives.

To explain heritability, a simple and highly stylized causal model is useful; our treatment follows Benjamin et al.6 We assume the genetic variants influencing the outcome in question are located at J separate locations (“loci”) in the genome. At each locus, individuals are endowed with two alleles, one inherited from the mother and one from the father. We can arbitrarily designate one of the alleles to be the reference allele, and define person i’s genotype at locus j, \(x_{ij}\), as the number of reference alleles person i is endowed with (because we make the simplifying assumption of only two possible alleles at each locus, this number is always equal to 0, 1, or 2).

We assume the following causal model for person i’s outcome:

$${Y_i}=\underbrace{\sum\nolimits_{j=1}^{J}{x_{ij}}{\beta_j}}_{\equiv G_i}+{U_i},$$ (1)

where j indexes the loci, \(\beta_j\) is the causal impact of an additional copy of the reference allele at locus j (this impact need not be the same across time and space), and \(U_i\) is an environmental variable. If \(Y_i\) is normalized so its standard deviation is 1, and \(G_i\) and \(U_i\) are uncorrelated, the total variance in \(Y_i\) is the sum of two components: a genetic factor (\(h^2\equiv {\rm var}(G_i)\)) and a non-genetic factor (\(u^2\equiv {\rm var}(U_i)\)). The parameter \(h^2\), known as (narrow sense) heritability, is simply the \(R^2\) from the population regression of the outcome on the J genotypes.
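A small simulation can make this variance decomposition concrete. The sketch below is illustrative and not taken from the source: the effect sizes, allele frequencies, and environmental standard deviation are made-up values, the function names are hypothetical, and it assumes that loci are independent of one another and that the two alleles at each locus are drawn independently. Under those assumptions \({\rm var}(G_i)=\sum_j \beta_j^2\, 2p_j(1-p_j)\), where \(p_j\) is the reference-allele frequency at locus j. Dividing by \({\rm var}(Y_i)\) plays the role of normalizing \(Y_i\) to unit standard deviation.

```python
import random
import statistics


def simulate_outcomes(n_people, betas, freqs, sd_u, seed=0):
    """Draw (G_i, U_i, Y_i) from the stylized model Y_i = sum_j x_ij * beta_j + U_i."""
    rng = random.Random(seed)
    G, U = [], []
    for _ in range(n_people):
        g = 0.0
        for beta, p in zip(betas, freqs):
            # Genotype x_ij: number of reference alleles (0, 1 or 2), drawn as
            # two independent alleles with frequency p (an assumption here).
            x = (rng.random() < p) + (rng.random() < p)
            g += x * beta
        G.append(g)
        # U_i is drawn independently of G_i, so the two are uncorrelated.
        U.append(rng.gauss(0.0, sd_u))
    Y = [g + u for g, u in zip(G, U)]
    return G, U, Y


def variance_shares(G, U, Y):
    """Return (var(G)/var(Y), var(U)/var(Y)), i.e. sample analogs of h^2 and u^2."""
    var_y = statistics.pvariance(Y)
    return statistics.pvariance(G) / var_y, statistics.pvariance(U) / var_y


# Made-up parameters for illustration only.
betas = [0.3, -0.2, 0.25, 0.1, -0.15]
freqs = [0.5, 0.3, 0.2, 0.4, 0.1]
sd_u = 0.3

G, U, Y = simulate_outcomes(20000, betas, freqs, sd_u, seed=1)
h2_hat, u2_hat = variance_shares(G, U, Y)

# Population value implied by the simulation's assumptions.
var_g = sum(b * b * 2 * p * (1 - p) for b, p in zip(betas, freqs))
h2_theory = var_g / (var_g + sd_u ** 2)

print(f"simulated h2 = {h2_hat:.3f}, implied h2 = {h2_theory:.3f}")
print(f"h2 + u2 = {h2_hat + u2_hat:.3f}")
```

The two shares sum to approximately (not exactly) one in any finite sample, because the sample covariance between \(G_i\) and \(U_i\) is only approximately zero.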

One can make inferences about the \(h^2\) of a trait without knowing the specific genes responsible for the heritable variation. Prior to the widespread availability of molecular data, most such efforts relied on comparisons of the resemblance in the observable characteristics (“phenotypes”) of various pairings of relatives who vary in their degree of environmental and genetic resemblance. In these studies, a commonly made assumption is that the distributions of \(G_i\) and \(U_i\) are the same across all types of siblings. Randomly ordering the two members of a sibling pair and denoting the second member of a pair by a prime, most heritability estimates are based on identifying conditions that have the general form:

$${\rho _{\rm YY'}^{\rm k}}=E({Y_i}{Y_i'})={\rho _{\rm GG'}^{\rm k}}{h^2}+{\rho _{\rm UU'}^{\rm k}}{u^2}.$$ (2)

On the left-hand side of Eq. 2 is the phenotypic correlation, whose sample analog is easily estimated by drawing a random sample of sibling pairs of type k and calculating the pairwise correlation between their outcomes. The methodology of studies in “behavior genetics” is fundamentally about using estimated phenotypic correlations \({\hat{\rho} _{\rm YY'}}\) for various types of siblings to infer population parameters such as \(h^2\) and \(u^2\). In general, assumptions about \({\rho _{\rm GG'}}\) are often based on population genetic theory,7 whereas inferences about \({\rho _{\rm UU'}}\) are based on whether the siblings of a given type were raised in the same household.
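Eq. 2 can be checked numerically. The following sketch simulates sibling pairs whose genetic and environmental components have chosen correlations, then compares the sample phenotypic correlation with the value Eq. 2 implies. It is an illustration under stated assumptions, not the source's procedure: the components are drawn as jointly normal (Eq. 2 itself does not require normality), the parameter values are made up, and all function names are hypothetical.

```python
import math
import random


def correlated_pair(rng, rho):
    """Return two standard normal draws with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    return z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2


def simulate_pairs(n_pairs, h2, rho_g, rho_u, seed=0):
    """Simulate (Y, Y') for sibling pairs with var(G) = h2 and var(U) = 1 - h2."""
    rng = random.Random(seed)
    u2 = 1.0 - h2
    pairs = []
    for _ in range(n_pairs):
        g1, g2 = correlated_pair(rng, rho_g)  # genetic components
        e1, e2 = correlated_pair(rng, rho_u)  # environmental components
        y1 = math.sqrt(h2) * g1 + math.sqrt(u2) * e1
        y2 = math.sqrt(h2) * g2 + math.sqrt(u2) * e2
        pairs.append((y1, y2))
    return pairs


def pearson(pairs):
    """Sample Pearson correlation between the two members of each pair."""
    n = len(pairs)
    mean1 = sum(a for a, _ in pairs) / n
    mean2 = sum(b for _, b in pairs) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in pairs)
    ss1 = sum((a - mean1) ** 2 for a, _ in pairs)
    ss2 = sum((b - mean2) ** 2 for _, b in pairs)
    return cov / math.sqrt(ss1 * ss2)


# Made-up values: h2 = 0.6, genetic correlation 0.5, environmental correlation 0.3.
h2, rho_g, rho_u = 0.6, 0.5, 0.3
r_implied = rho_g * h2 + rho_u * (1.0 - h2)  # right-hand side of Eq. 2
r_hat = pearson(simulate_pairs(50000, h2, rho_g, rho_u, seed=2))

print(f"implied by Eq. 2: {r_implied:.3f}, simulated: {r_hat:.3f}")
```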

For example, most twin studies compare the resemblance of monozygotic twins reared together (k = MZT) with the resemblance of dizygotic twins reared together (k = DZT). In studies attempting to estimate heritability from twin data, a commonly made assumption is that Eq. 2 holds for both types of twins, with \({\rho _{\rm GG'}^{\rm MZT}}=1\), \({\rho _{\rm GG'}^{\rm DZT}}=0.5\) and \({\rho _{\rm UU'}^{\rm DZT}}={\rho _{\rm UU'}^{\rm MZT}}\equiv {\rho _{\rm UU'}^{\rm T}}\). If these conditions hold, it is straightforward to verify that \(2({\rho _{\rm YY'}^{\rm MZT}}-{\rho _{\rm YY'}^{\rm DZT}})={h^2}\). An estimator of heritability is the sample analog of this moment condition, also known as Falconer’s estimator: \({\hat{h}^2}=2({\hat{\rho}_{\rm YY'}^{\rm MZT}}-{\hat{\rho}_{\rm YY'}^{\rm DZT}}).\)
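For readers who want the verification spelled out, substituting the three twin assumptions into Eq. 2 gives one equation per twin type, and subtracting the second from the first eliminates the common environmental term:

$$\begin{aligned}
{\rho _{\rm YY'}^{\rm MZT}} &= {h^2}+{\rho _{\rm UU'}^{\rm T}}{u^2},\\
{\rho _{\rm YY'}^{\rm DZT}} &= 0.5\,{h^2}+{\rho _{\rm UU'}^{\rm T}}{u^2},\\
{\rho _{\rm YY'}^{\rm MZT}}-{\rho _{\rm YY'}^{\rm DZT}} &= 0.5\,{h^2}
\quad\Longrightarrow\quad {h^2}=2({\rho _{\rm YY'}^{\rm MZT}}-{\rho _{\rm YY'}^{\rm DZT}}).
\end{aligned}$$

The same two equations also pin down the shared environmental term, since \(2{\rho _{\rm YY'}^{\rm DZT}}-{\rho _{\rm YY'}^{\rm MZT}}={\rho _{\rm UU'}^{\rm T}}{u^2}\).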

To illustrate some canonical findings from the literature, Fig. 1 displays phenotypic correlations for three education phenotypes (years of schooling, cognitive skills, and socioemotional skills) and two anthropometric phenotypes (height and body mass index) in seven types of Swedish sibling pairs who differ in their genetic and environmental relatedness. The correlations are computed using administrative data covering all Swedish brother pairs born between 1951 and 1970, and have been previously reported.8,9,10 With the exception of years of schooling, all outcomes are drawn from Swedish conscription records and were measured around age 18 (the age of enlistment).

Fig. 1 Sibling Correlations for Behavioral Traits. This figure displays sibling correlations for five traits measured in a large sample of Swedish brother pairs born 1951–1970. All outcomes except years of schooling are measured at conscription, around the age of 18. For details on sample construction and variable definitions, see chapter 3 in Cesarini.10 The sample sizes vary by outcome, but the minimum number of pairs per sibling type is MZT = 1,154; DZT = 1,601; FST = 151,789; FSA = 1,033; HST = 4,880; HSA = 11,566; ADO = 643. Because the sample sizes are large, all correlation coefficients are precisely estimated. The standard error of a correlation coefficient estimated with N pairs of siblings is approximately \((1-{\hat{\rho }}^{2})/\sqrt{N}\). For example, the approximate standard error of the 72% estimate reported in the full text (full siblings reared apart) is \(2\times (1-{\hat{\rho }}_{{\rm{FSA}}}^{2})/\sqrt{N}=2\times (1-{0.359}^{2})/\sqrt{1033}\approx 0.054\).

Our measure of cognitive skills is derived from the conscript’s score on four cognitive tests (synonyms, spatial skills, inductions and technical comprehension) and is highly correlated with what is sometimes referred to as general intelligence.11 Our measure of socioemotional skills is based on a professional military psychologist’s assessment of the conscript’s ability to function in the military, with higher scores assigned to recruits whom the psychologist perceives as independent, emotionally stable, able to function in a group and willing to take on responsibility.12 Lindqvist and Vestman12 document that the variable (which they call noncognitive ability) is a much stronger predictor of labor market outcomes than the personality dimensions measured by standard personality scales.

In these analyses, two brothers are classified as “reared apart” if they lived in separate households during every census undertaken before the age of 18, and “reared together” otherwise. We use information about biological parents to classify sibling pairs as full brothers (same biological parents), half-brothers (share one biological parent), or adoptees (share no biological parents but reared in the same household). Two broad patterns are evident from Fig. 1. First, for all five traits, the phenotypic resemblance of pairs of siblings reared together increases with genetic relatedness. Second, holding constant genetic relatedness, siblings reared in the same household are usually more similar than siblings reared in separate households.

Until recently, data such as those in Fig. 1 were the primary source of information about the heritability of various traits. To illustrate how sibling correlations can be used to decompose phenotypic variation, applying Falconer’s formula to the correlations gives \({\hat{h}^2}=2({\hat{\rho}_{\rm YY'}^{\rm MZT}}-{\hat{\rho}_{\rm YY'}^{\rm DZT}})=2(0.822-0.534)= 58 \%\) for cognitive skills and \({\hat{h}^2}=2({\hat{\rho}_{\rm YY'}^{\rm MZT}}-{\hat{\rho}_{\rm YY'}^{\rm DZT}})=2(0.928-0.521)=81 \%\) for height. With seven sibling types, many other estimators are available. For example, if we assume that in full siblings reared apart, \({\rho _{\rm GG'}^{\rm FSA}}=0.5\) and \({\rho _{\rm UU'}^{\rm FSA}}=0,\) then Eq. 2 implies \(2{\rho_{\rm YY'}^{\rm FSA}}={h^2}\). The analogy principle then suggests the estimator \({\hat{h}^2}=2{\hat{\rho }_{\rm YY'}^{\rm FSA}}\), giving us \({\hat{h}}^{2}=2\times 0.359=72 \%\) for cognitive skills and \({\hat{h}}^{2}=2\times 0.468=94 \%\) for height. In practice, if feasible, it is almost always advisable to use an estimate that incorporates information from as many different sibling types as possible (not just twins). Most importantly, information about additional sibling types provides identifying variation that can be used to estimate richer models that relax (or test) some of the potentially problematic assumptions underlying Falconer’s formula.
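The arithmetic in this paragraph is easy to reproduce. The sketch below is illustrative only: the function names are hypothetical, the correlations are the ones quoted in the text, the standard-error approximation for a correlation is the one stated in the caption of Fig. 1, and the pair counts are the minimum counts listed there (the actual counts for a given outcome may be larger). The standard error attached to Falconer’s estimator additionally assumes that the MZT and DZT samples are independent, which is an assumption of this sketch rather than a claim from the source.

```python
import math


def falconer_h2(r_mzt, r_dzt):
    """Falconer's estimator: twice the MZT minus DZT difference in correlations."""
    return 2.0 * (r_mzt - r_dzt)


def fsa_h2(r_fsa):
    """Estimator based on full siblings reared apart: twice their correlation."""
    return 2.0 * r_fsa


def se_corr(r, n_pairs):
    """Approximate standard error of a correlation estimated from n_pairs pairs."""
    return (1.0 - r * r) / math.sqrt(n_pairs)


# Minimum pair counts per sibling type, as listed in the caption of Fig. 1.
N_MZT, N_DZT, N_FSA = 1154, 1601, 1033

# Phenotypic correlations quoted in the text.
traits = {
    "cognitive skills": {"MZT": 0.822, "DZT": 0.534, "FSA": 0.359},
    "height": {"MZT": 0.928, "DZT": 0.521, "FSA": 0.468},
}

results = {}
for trait, r in traits.items():
    h2_twin = falconer_h2(r["MZT"], r["DZT"])
    # Assumes independent MZT and DZT samples (an assumption of this sketch).
    se_twin = 2.0 * math.hypot(se_corr(r["MZT"], N_MZT), se_corr(r["DZT"], N_DZT))
    h2_fsa = fsa_h2(r["FSA"])
    se_fsa = 2.0 * se_corr(r["FSA"], N_FSA)
    results[trait] = (h2_twin, se_twin, h2_fsa, se_fsa)
    print(f"{trait}: Falconer {h2_twin:.3f} (SE {se_twin:.3f}), "
          f"FSA {h2_fsa:.3f} (SE {se_fsa:.3f})")
```

The FSA standard error for cognitive skills reproduces the value of roughly 0.054 given in the caption of Fig. 1.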

Though the data in Fig. 1 are quite representative of findings in the behavior–genetic literature, it bears emphasizing that heritabilities are population-specific parameters (not universal constants). Heritabilities can (and do) vary across time and space, and for some traits they can also vary in interesting ways over the lifecycle. For example, one of the most robustly replicated findings from the behavior–genetic literature is that the heritability of cognitive skills rises gradually through childhood and adolescence.13