Sex-typed toy preference is one of the earliest observed sex differences in behavior, becoming apparent in children as young as 12 months (Servin, Bohlin, & Berlin, 1999; Todd, Barry, & Thommessen, 2017; Todd et al., 2018; van de Beek, van Goozen, Buitelaar, & Cohen-Kettenis, 2009). Whereas such sex differences in toy preference are only modest in size around one year of age (for a review, see Zosuls & Ruble, 2018), they increase with age (Golombok et al., 2008; for a review, see Todd et al., 2018), reaching very large effect sizes of about Cohen’s d = 3 for preschool and primary school children (for a review and meta-analysis, see Davis & Hines, 2020; Hines, 2010). However, the size of these sex differences depends on the method used to determine sex-typed play preference. Observational studies typically find sex differences for playing with cars/trucks and dolls (for reviews, see Davis & Hines, 2020; Zosuls & Ruble, 2018) that are smaller than those found in parents’ questionnaire reports on sex-typed play preferences (for a review, see Hines, 2010). Not only children but also non-human primates seem to prefer to play with sex-specific toys (Alexander & Hines, 2002; Hassett, Siebert, & Wallen, 2008). This is notable because non-human primates are obviously less subject to social influences. Both the early appearance of sex-specific play preferences in children and the evidence from non-human primates suggest a biological component in the emergence of sex-typed play preferences in addition to socialization influences.

Sex hormones that are involved in the development of primary and secondary sexual characteristics in ontogenesis are a likely candidate for such a biological component. However, sex-typed play behavior appears long before puberty in a time frame in which there are no differences in sex hormone levels between boys and girls (Hines, 2010). Following from this, early prenatal effects of sex hormones have been suggested to impact sex-typed preferences in children (for a review, see Berenbaum & Beltz, 2016). For example, girls with congenital adrenal hyperplasia (CAH), who are exposed to high levels of androgens prenatally and in the early postnatal period, show more male-typical play behavior than unaffected controls (Berenbaum & Hines, 1992; Nordenström, Servin, Bohlin, Larsson, & Wedell, 2002; Pasterski et al., 2005), and parents describe their daughters’ behavior as more masculine than that of unaffected female relatives (Hines, 2003). Additionally, women with CAH retrospectively describe their play behavior as more masculine (Hines, Brook, & Conway, 2004). This is further corroborated by experimental studies on female rhesus macaques that displayed a strong increase in rough-and-tumble play after they had been prenatally treated with androgens (for a review, see Thornton, Zehr, & Loose, 2009).

Notably, testosterone levels differ between boys and girls from around Week 8 of gestation (Judd, Robinson, Young, & Jones, 1976) and are presumed to have organizational effects on the brain. In turn, they should affect sex differences in behavior later in life (for a review, see Cohen-Bendahan, van de Beek, & Berenbaum, 2005). In this context, one study indicated that girls and boys exposed to higher levels of testosterone in amniotic fluid showed more male-typical play behavior later in life (Auyeung et al., 2009). By contrast, two other amniocentesis studies were not able to find this relationship (Knickmeyer et al., 2005; van de Beek et al., 2009). However, all three studies used different measures to quantify sex-typed play preferences and the children differed in age.

Thus, amniocentesis studies, which are very rare due to the high effort involved, show inconsistent results. However, studies on the relationship between sex-typed play behavior and the ratio between the second and fourth digit length (a presumed marker for the prenatal testosterone level) show more consistent results (Hönekopp & Thierfelder, 2009; Mitsui et al., 2016; Wong & Hines, 2016). Following the observation that females have larger ratios between the second and fourth digit (2D:4D) than males, Manning, Scutt, Wilson, and Lewis-Jones (1998) suggested the 2D:4D as an easily accessible marker for prenatal testosterone exposure.
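As a minimal sketch of the quantities discussed here, the 2D:4D is the length of the second digit divided by the length of the fourth digit, and a sex difference in it can be expressed as a standardized mean difference (Cohen's d with a pooled standard deviation). The function names and the finger lengths below are hypothetical and purely illustrative; they are not data from any cited study, and the resulting effect size is not calibrated to the published sex difference.

```python
from math import sqrt
from statistics import mean, stdev


def digit_ratio(len_2d_mm, len_4d_mm):
    """2D:4D = index-finger length divided by ring-finger length."""
    return len_2d_mm / len_4d_mm


def cohens_d(group_a, group_b):
    """Standardized mean difference (a minus b) using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical (2D, 4D) lengths in mm; illustrative values only.
girls = [(40.0, 41.0), (42.0, 42.5), (39.0, 40.5), (41.0, 41.5)]
boys = [(41.0, 43.0), (43.0, 44.5), (40.0, 42.5), (42.0, 43.5)]

girls_ratio = [digit_ratio(d2, d4) for d2, d4 in girls]
boys_ratio = [digit_ratio(d2, d4) for d2, d4 in boys]

# A positive d means a larger (more female-typical) ratio in girls.
d = cohens_d(girls_ratio, boys_ratio)
print(f"girls: {mean(girls_ratio):.3f}, boys: {mean(boys_ratio):.3f}, d = {d:.2f}")
```

A smaller ratio corresponds to a relatively longer fourth digit, which is the direction described as masculinized in this literature.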

Two meta-analyses confirmed a sex difference in 2D:4D (Grimbos, Dawood, Burriss, Zucker, & Puts, 2010; Hönekopp & Watson, 2010). This sex difference is already established prenatally as suggested by two studies on aborted fetuses (Galis, Ten Broek, Van Dongen, & Wijnaendts, 2010; Malas, Dogan, Hilal Evcil, & Desdicioglu, 2006). In favor of 2D:4D as a marker for prenatal testosterone exposure, females with CAH tend to have masculinized (i.e., smaller) 2D:4D (Brown, Hines, Fane, & Breedlove, 2002; Ökten, Kalyoncu, & Yaris, 2002). In mice, increasing androgens and reducing estrogens in utero decreased 2D:4D (Zheng & Cohn, 2011).

Even though there is convincing evidence that 2D:4D is influenced prenatally by sex hormones, the specific relationship with prenatal sex hormone levels from amniotic fluid and umbilical cord blood in humans is still not fully understood. There are only two published studies that link 2D:4D to prenatal sex hormones from amniotic fluid, and their findings are inconsistent. One study found that a more masculine 2D:4D on the right hand of 29 two-year-olds (not separated by sex) was associated with a higher ratio of testosterone to estradiol levels (Lutchmaya, Baron-Cohen, Raggatt, Knickmeyer, & Manning, 2004), while another study found the right and left 2D:4D in newborn girls (but not boys) to be correlated with amniotic fluid testosterone (Ventura, Gomes, Pita, Neto, & Taylor, 2013). Only one study using umbilical cord blood (sampled at birth) to measure sex hormones found the expected negative relationship between testosterone and left 2D:4D in girls (Whitehouse et al., 2015), while others failed to show this negative relationship (Çetin, Can, & Özcan, 2016; Hickey et al., 2010; Hollier et al., 2015). Additionally, if testosterone effects on 2D:4D are mediated by androgen receptor type, there should be a positive correlation between functional androgen receptor gene variation (CAG stretches) and 2D:4D. However, two meta-analyses have failed to show a relationship between 2D:4D and CAG stretches (Hönekopp, 2013; Voracek, 2014). Therefore, 2D:4D as a marker for prenatal hormone exposure should be interpreted with caution (for an overview of existing evidence, see Richards, 2017).

Assuming that 2D:4D is determined (at least partially) by prenatal hormones, the temporal stability of the sex difference in 2D:4D should be high, and sex differences should be present early in life. While a cross-sectional study on 2- to 25-year-old individuals found no age differences in 2D:4D (Manning et al., 1998), another cross-sectional study showed an increase in 2D:4D with age in 2- to 5-year-olds (Williams, Greenhalgh, & Manning, 2003). Rare longitudinal studies, however, are better suited to evaluate the stability of 2D:4D controlling for inter-individual differences. Such studies have shown a slight increase in 2D:4D with age (McIntyre, Cohn, & Ellison, 2006: T1: 6–7 years, T2: 8–9 years; McIntyre, Ellison, Lieberman, Demerath, & Towne, 2005: T1: 1 year, T2: 5 years, T3: 9 years, T4: 13 years, T5: 17 years; Trivers, Manning, & Jacobson, 2006: T1: 7–13 years, T2: 11–17 years; Wong & Hines, 2016: T1: 20–40 months, T2: 26–47 months). However, high correlations between the measurements (Pearson’s r = .71–.88) suggest a high temporal stability of 2D:4D. By contrast, one longitudinal study on 0- to 2-year-olds showed a decrease of 2D:4D in the first year and an increase in the second year of life and low correlations (Pearson’s r = .35–.53) between measurements (Knickmeyer, Woolson, Hamer, Konneker, & Gilmore, 2011: T1: 2 weeks, T2: 12 months, T3: 24 months).
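The distinction between a mean-level increase and high temporal stability can be illustrated with a short sketch: Pearson's r between two measurement occasions captures whether children keep their relative position, independent of whether the group mean shifts. The helper `pearson_r` and the ratios below are hypothetical and serve only as an illustration; they are not values from the cited longitudinal studies.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between paired measurements."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical right-hand 2D:4D of the same five children at two ages.
t1 = [0.93, 0.95, 0.96, 0.98, 0.99]
t2 = [0.94, 0.95, 0.98, 0.98, 1.01]

r = pearson_r(t1, t2)  # rank-order (test-retest) stability
mean_change = sum(b - a for a, b in zip(t1, t2)) / len(t1)  # mean-level change
print(f"r = {r:.2f}, mean change = {mean_change:+.3f}")
```

In this constructed example the mean ratio rises slightly while the correlation stays high, which is the pattern the longitudinal studies above describe for children older than two years.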

Supporting the validity of 2D:4D, three studies (Hönekopp & Thierfelder, 2009; Mitsui et al., 2016; Wong & Hines, 2016) have shown that children with lower 2D:4D (and, thus, supposedly higher testosterone exposure in utero) display more masculine play behavior (as described by the parents’ answers on the Preschool Activities Inventory; PSAI; Golombok & Rust, 1993). Nevertheless, there is considerable discrepancy concerning the side of the hand (right/left) and the sex of the children in which the correlations were found (Hönekopp & Thierfelder, 2009: left 2D:4D of boys; Mitsui et al., 2016: right and left 2D:4D of boys; Wong & Hines, 2016: right 2D:4D of boys and right and left 2D:4D of girls).

In addition to a likely biological effect on play behavior, socialization plays an important role. It is well known that children’s (sex-specific) toy preferences are influenced by parents, teachers, (older) siblings, and peers. For example, children are often reinforced for sex-congruent behavior (for reviews, see Berenbaum, Blakemore, & Beltz, 2011; Hines, 2010). Whereas the reinforcement of sex-congruent behavior by parents and teachers and the influence of peers are difficult to quantify, studies that have recorded the number of older brothers and sisters have shown that both girls and boys with more older brothers displayed more male-typical and less female-typical behavior and that those with more older sisters displayed more female-typical and less male-typical play behavior (Hines et al., 2002b; Mitsui et al., 2016; Rust et al., 2000). According to social cognitive theory, this effect of older siblings is explained by observational learning and thus as a socialization factor (Berenbaum et al., 2011; Rust et al., 2000), which seems plausible because children may play with older siblings and their toys. However, it cannot be ruled out that the influence of older siblings on the behavior of younger siblings may also (partly) be based on a genetic or hormonal and, thus, a biological component (Berenbaum et al., 2011). In this context, the fraternal birth-order effect is worth mentioning as it describes the increased probability for a younger brother to be gay with an increasing number of older brothers. This effect is explained as a consequence of a progressive immunization of some mothers against male antigens with each pregnancy with a male fetus and a simultaneous increase in antibodies that affect the sexual differentiation of the brain (for a meta-analysis, see Blanchard, 2018). Bogaert et al. (2018) found antibody levels against the Y-linked protein NLGN4Y, which is important in brain development, to be higher in mothers of gay sons than in the control samples. Additionally, in a large community sample, later fraternal birth order was related to elevated gender variance in boys (Coome, Skorska, van der Miesen, Peragine, & VanderLaan, 2018), which indicates that the progressive immunization hypothesis is not only valid for homosexuality but also for sex-typed behavior in childhood. However, this effect has only been shown for boys with older brothers, so that the observed relationship between the sex-typed play behavior of boys and the number of older sisters as well as the relationship between the sex-typed play behavior of girls and the number of older brothers and sisters (Hines et al., 2002b; Mitsui et al., 2016; Rust et al., 2000) cannot be attributed to the birth-order effect and therefore most likely indicates socializing effects. To date, there are no studies that have investigated both the influence of older siblings and prenatal hormonal effects to determine whether the effects are additive or interactive (Berenbaum et al., 2011).
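The difference between additive and interactive effects can be made concrete with a difference-of-differences contrast in a simplified 2 x 2 layout (lower vs. higher 2D:4D, with vs. without an older brother). This is only a sketch under stated assumptions: the dichotomization, the cell labels, the function name `interaction_contrast`, and the cell means are hypothetical, and a continuous analysis would typically use regression with a product term instead.

```python
def interaction_contrast(cell_means):
    """Difference of differences in a 2x2 layout.

    Returns the sibling effect among low-ratio children minus the
    sibling effect among high-ratio children. Zero means the two
    factors combine additively; non-zero indicates an interaction.
    """
    sibling_effect_low = (
        cell_means[("low_ratio", True)] - cell_means[("low_ratio", False)]
    )
    sibling_effect_high = (
        cell_means[("high_ratio", True)] - cell_means[("high_ratio", False)]
    )
    return sibling_effect_low - sibling_effect_high


# Hypothetical mean play-behavior scores; keys are
# (digit-ratio group, has an older brother). Illustrative only.
additive = {
    ("high_ratio", False): 50.0,
    ("high_ratio", True): 55.0,
    ("low_ratio", False): 54.0,
    ("low_ratio", True): 59.0,
}
interactive = dict(additive)
interactive[("low_ratio", True)] = 65.0  # sibling effect larger at low ratio

print(interaction_contrast(additive))     # same sibling effect in both groups
print(interaction_contrast(interactive))  # sibling effect depends on ratio group
```

In the additive pattern an older brother shifts the score by the same amount regardless of digit ratio; in the interactive pattern the size of that shift depends on the digit-ratio group.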

The current study aims at clarifying the relationship between sex-typed play behavior and 2D:4D. Only one of the three studies examining this relationship (Hönekopp & Thierfelder, 2009; Mitsui et al., 2016; Wong & Hines, 2016) assessed digit ratio in a longitudinal design (with only two measurements on 2- to 3-year-olds over a 6- to 8-month period; Wong & Hines, 2016). By contrast, the present longitudinal study consisted of four measurements of the digit ratios from both hands at different ages starting in early infancy (T1: 5 months, T2: 9 months, T3: 20 months, and T4: 40 months). The sex differences in 2D:4D should be stable if they are indeed influenced by prenatal testosterone. Therefore, we need multiple measurements at different ages during early infancy and childhood to draw conclusions about 2D:4D as a potential marker for prenatal testosterone. Our longitudinal design has a clear advantage over a cross-sectional design because it reduces inter-individual variance and increases the signal-to-noise ratio. At T4, parents completed the PSAI (Golombok & Rust, 1993) to record sex-typed play behavior of their children. The PSAI is a standardized, frequently used measure for play preferences that shows large sex differences (Hönekopp & Thierfelder, 2009; Mitsui et al., 2016; Wong & Hines, 2016) which are typically larger than in direct observations of toy preferences in a single, unnatural laboratory situation (Hines, 2010; Wong & Hines, 2016). It also has the advantage that it not only asks for toy preferences, but also for activity preferences and temperamental characteristics during the last month, giving a more comprehensive picture of sex-typed play behavior in comparison with observational studies of toy preferences.

Additionally, the number of older brothers and sisters (living in the same household) was assessed as an indicator for socialization effects on sex-typed play behavior (Mitsui et al., 2016; Rust et al., 2000). This allows us, for the first time, to assess the combination of potential biological (2D:4D) and socialization influences (older siblings) on sex-typed play behavior (Berenbaum et al., 2011).

Based on previous research, we predicted more male-typical play behavior in boys than in girls (operationalized by the PSAI score) and lower digit ratios in boys than in girls, independent of age and hand (right/left). For the temporal stability of 2D:4D, we expected a slight increase with age independent of the sex difference. Moreover, we predicted that boys and girls with lower digit ratios (supposedly higher prenatal testosterone exposure) should display more male-typical and less female-typical play behavior (higher PSAI scores). With respect to older siblings, more older brothers should lead to more masculine and less feminine play behavior (higher PSAI scores) and more older sisters to less masculine and more feminine behavior (lower PSAI scores) in both boys and girls.