A co-authored Sexual Personalities post by Marco Del Giudice, David A. Puts, David C. Geary, and David P. Schmitt

“Eight Things You Need to Know About Sex, Gender, Brains, and Behavior: A Guide for Academics, Journalists, Parents, Gender Diversity Advocates, Social Justice Warriors, Tweeters, Facebookers, and Everyone Else” is a new article by Cordelia Fine, Daphna Joel, and Gina Rippon (Fine et al., 2019). These authors worry that sex differences are often portrayed as all-or-none binaries (“females are like this and males are like that”) and that people tend to unthinkingly ascribe these differences to immutable biological factors (a form of psychological essentialism; Gelman & Rhodes, 2012). Such thinking is, indeed, both incorrect and a detriment to advancing a proper science of sex differences. In the article, Fine and colleagues set out to provide guidelines and thinking points for academics and the broader public, to help them critically evaluate reports of sex differences in brain and behavior. This is a laudable goal, and we fully support it.

(Note: As the focus of the original article is on group comparisons between males and females, here we use “sex” as a descriptive term. The distinction between biological “sex” and social “gender” is famously murky (Eagly, 1987), and we are skeptical that the hybrid “sex/gender” will do much to clarify the issue; see Del Giudice, 2019.)

A Counterbalanced Perspective

Over the years, we have studied many aspects of sex differences and their evolution, and we find several things to like in the article by Fine and colleagues. At the same time, their arguments seem based on an underlying assumption that most sex differences are small, inconsistent, highly malleable, and for the most part socially constructed. As a result, their prompts for critical thinking tend to be somewhat one-sided; if applied automatically, they lead to a biased assessment of research in this area.

As we see it, minimizing the magnitude of important sex differences and discounting their biological origins can be just as damaging (for science and society at large) as exaggerating them and accepting simplistic biological explanations of sex differences at face value. All too easily, one-sided criticism can devolve into mental simplicity, overconfidence, intolerance, and evidentiary double standards (van Prooijen & Krouwel, 2019). An honest, sophisticated public debate on sex differences demands a broad perspective with an appreciation for nuance and full engagement with all sides of the question.

In the interest of such a balanced approach, then, we propose eight “Counterpoints” to Fine and colleagues’ original “things you need to know” about sex differences. In each, we note where we agree with the authors, point out any disagreements, and suggest other important things to know that were not mentioned in the original article. To keep our eight Counterpoints brief and readable, we do not develop our arguments at length but provide links to examples, key studies, and in-depth discussions.

Counterpoint 1: False Positives and False Impressions

We second the authors’ warnings about sexual science being prone to false positives and publication bias. These problems are pervasive in all of science, and sex difference research is certainly not immune. Neuroimaging studies of sex differences may be particularly vulnerable because of their small sample sizes (but see Counterpoint 6 below). That said, key sex differences in behavior have been replicated multiple times in very large samples (thousands to hundreds of thousands), using consistent methods and measures. Arguably, some of these findings (for example, on interest in things vs. people and on mate preferences) are among the most robust and replicable effects in psychology. Ironically, the decades-long skepticism regarding the existence or importance of sex differences has resulted in more attempts to replicate and thus a deeper knowledge base than is the case in other areas of the behavioral and brain sciences (see Archer, 2019; Geary, 2010).

The mirror image of standard publication bias is so-called “reverse bias”: Findings that are unwelcome for theoretical or ideological reasons are often hard to publish, and end up being suppressed (e.g., not sent for peer review by editors) or downplayed by researchers (for a compelling example, see Clark & Hatfield, 2003). A theme of Fine and colleagues’ essay is that bias always works to amplify sex differences, but we strongly suspect that this is not the case.

At least since Maccoby and Jacklin's 1974 classic book, The Psychology of Sex Differences, the prevailing zeitgeist in psychology has held that there are few if any substantive differences across the sexes (critical takes on that book, such as the detailed reanalysis by Block, 1976, are largely forgotten). Even a cursory look at introductory psychology textbooks will show that this is still the case fifty years later (Winegard et al., 2014). On top of that, many psychologists and neuroscientists, including Fine and colleagues themselves, have argued that claims of biological sex differences can be dangerous and socially harmful. To the extent that these ideas are widely shared (Buss & von Hippel, 2018; Horowitz et al., 2014; Jonason & Schmitt, 2016), they create obvious incentives for powerful reverse bias effects. Note that reverse bias can be particularly insidious: If a certain effect is never published or discussed in the literature, it may go completely unrecognized for some time.

One of us encountered a possible case of reverse bias while running a meta-analysis of sex differences in romantic attachment. Of the eligible studies, only about 20% included enough information to calculate sex differences, and even fewer actually calculated and discussed them in the paper (Del Giudice, 2011). In the writing of a book on sex-specific vulnerabilities (Geary, 2015), another one of us found the same bias (i.e., no reporting of sex differences or statistically removing them as a “nuisance”) across a wide range of fields, including toxicology, parasitology, and oncology. While bias in certain fields may favor the reporting of sex differences, there are many fields in which researchers tend to turn a blind eye to them. We agree with feminist Alice Eagly (1987), the National Institutes of Health Office for Research on Women’s Health, and others who argue that sex differences should always be accurately recorded and reported, regardless of their size or significance.

Was the study pre-registered? What is the sample size? How consistent is the finding with other similar studies? How are variables defined and measured? We would also point out that these questions listed by Fine and colleagues apply just as well to studies that do not find evidence of sex differences (or find differences that are unusually small). Small samples increase the risk of false negatives in addition to that of false positives, and researchers may use hidden degrees of freedom to deflate an effect instead of inflating it.
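To make the point about false negatives concrete, here is a minimal sketch of how statistical power depends on sample size. It is an illustration rather than an analysis from any of the studies discussed: the function name approx_power, the effect size of d = 0.4, and the group sizes are assumptions chosen for the example, and the calculation uses a normal approximation instead of the exact noncentral t distribution.

```python
from math import sqrt
from statistics import NormalDist

_NORMAL = NormalDist()  # standard normal distribution


def approx_power(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample comparison of means.

    Assumes equal group sizes and equal variances, and uses a normal
    approximation (an exact calculation would use the noncentral t).
    """
    z_crit = _NORMAL.inv_cdf(1 - alpha / 2)
    shift = abs(d) * sqrt(n_per_group / 2)  # expected value of the test statistic
    # Probability that the statistic lands in either rejection region
    return _NORMAL.cdf(shift - z_crit) + _NORMAL.cdf(-shift - z_crit)


# Hypothetical scenario: a true difference of d = 0.4 studied at several sample sizes
for n in (20, 50, 200):
    print(n, round(approx_power(0.4, n), 2))
```

Under these assumptions, a study with 20 participants per group would miss a true difference of d = 0.4 more often than it would detect it, whereas a study with 200 per group would almost always detect it.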

Summary of Counterpoint 1: False positives and publication bias are important concerns for all of science. In the case of human sex differences, it is also plausible that many effects have remained unpublished or not discussed in the literature because of reverse bias. Methodological issues and biases cut both ways.

Counterpoint 2: Size Matters

Fine and colleagues rightly point out that statistical significance says virtually nothing about the theoretical or practical significance of an average difference. To make sense of standard effect sizes (Cohen’s d), they recommend converting them into proportions of distribution overlap (OVL). While this is correct, an overlap is only slightly more intuitive than d as a description of effect size. There are a number of alternative indices, which—depending on context—can be more informative and easier to interpret than overlap (see Del Giudice, 2019 for a detailed overview). The “common language effect size” (CL) is especially intuitive: It is the probability that a female picked at random will score higher on a trait than a male picked at random (or vice versa, depending on which sex has the higher mean). When Cohen’s d is 0.4, this probability is ~61%; when d is 1.0, the probability is ~76%.

Another useful index is the probability of correct classification (PCC)—that is, the probability of correctly guessing the sex of an individual based on his/her score. The PCC is ~58% when d is 0.4 and ~69% when d is 1.0. This chart can be used to easily convert d into various other indices:

Source: Del Giudice (2019)

Figure 1. Conversion chart between Cohen’s d (or Mahalanobis’ D) and other effect sizes, assuming multivariate normality and equal covariances between groups. OVL = overlapping coefficient; OVL2 = Cohen’s coefficient of overlap; U3 = proportion of one group above the median of the other group; CL = common language effect size; PCC = probability of correct classification (with equal group size); η2 = proportion of variance explained. (Del Giudice, 2019).
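For readers who prefer formulas to a chart, the conversions can be sketched in a few lines of Python. This is a minimal illustration under the same assumptions as the figure (normal distributions, equal variances, equal group sizes); the function name effect_size_indices is ours, and the formulas are the standard normal-theory expressions rather than code from Del Giudice (2019).

```python
from math import sqrt
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal cumulative distribution function


def effect_size_indices(d: float) -> dict:
    """Convert Cohen's d into other effect size indices.

    Assumes two normal distributions with equal variances and
    equal group sizes.
    """
    d = abs(d)
    ovl = 2 * _PHI(-d / 2)
    return {
        "OVL": ovl,                   # overlapping coefficient
        "OVL2": ovl / (2 - ovl),      # Cohen's coefficient of overlap
        "U3": _PHI(d),                # share of one group above the other's median
        "CL": _PHI(d / sqrt(2)),      # common language effect size
        "PCC": _PHI(d / 2),           # probability of correct classification
        "eta2": d**2 / (d**2 + 4),    # proportion of variance explained
    }


for d in (0.4, 1.0):
    print(d, {name: round(value, 2) for name, value in effect_size_indices(d).items()})
```

For d = 0.4 and d = 1.0, the CL and PCC values produced by this sketch agree with the percentages quoted in the text (about 61% and 76% for CL, about 58% and 69% for PCC).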

An important issue that Fine and colleagues do not address is measurement error. To the extent that measurement is not perfectly accurate, the size of sex differences (as measured by d and other indices) will appear smaller than it is in reality. When measures are noisy—as is the case with many psychological and neurobiological measures—sex differences can look much smaller than they really are.

There are several statistical methods for correcting measurement error, from simple (correction for unreliability) to sophisticated (latent variable models; for an overview see Del Giudice, 2019). In our experience with questionnaire-based research, the size of sex differences typically increases by 10-20% after simple corrections, and may almost double when using more sophisticated methods. Note that, in the famous literature syntheses by Hyde et al. (2005) and Zell et al. (2015), effect sizes were not corrected for measurement error. Likewise, studies using brain imaging measures rarely attempt to correct for the distorting effects of noise.
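The simplest of these corrections can be sketched as follows. This is the classical correction for unreliability, in which the observed d is divided by the square root of the measure's reliability; the function name disattenuate_d and the reliability values are illustrative assumptions, not figures taken from the studies cited here.

```python
from math import sqrt


def disattenuate_d(observed_d: float, reliability: float) -> float:
    """Classical correction of a standardized difference for unreliability.

    Divides the observed d by the square root of the measure's
    reliability (e.g., Cronbach's alpha or test-retest reliability).
    """
    if not 0 < reliability <= 1:
        raise ValueError("reliability must be in (0, 1]")
    return observed_d / sqrt(reliability)


# Hypothetical example: an observed d of 0.50 at three reliability levels
for reliability in (0.7, 0.8, 0.9):
    corrected = disattenuate_d(0.50, reliability)
    print(reliability, round(corrected, 3), f"{corrected / 0.50 - 1:.0%} larger")
```

With reliabilities between 0.7 and 0.8, which are common for questionnaire scales, this correction raises d by roughly 12 to 20 percent, in line with the range mentioned above.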

Even without accounting for measurement error, some behavioral differences are considerably larger than Fine and colleagues suggest in their summary. For example, large studies of sex differences in interest in people vs. things show a d of about 1.0 to 1.4 (Lippa, 2010; Morris, 2016). Accordingly, the probability that a randomly picked female will be more interested in people (and less in things) than a randomly picked male is about 75-85%. Some sex differences in mate preferences are comparable in size, and many physical sex differences are even larger (Conroy-Beam et al., 2015; Puts, 2016). The effect size for sexual attraction to males vs. females, one of the largest psychological differences between the sexes, is in the range of d = 6.0 (see Puts, 2016). This raises the question of which traits are theoretically expected to be strongly dimorphic, and which ones are not. Clearly, psychologists should not (and do not) expect all possible traits to show large sex differences. This is why studies that summarize differences across every possible measure (e.g., Zell et al., 2015) are not very informative, and why evolutionary theory can be invaluable in guiding research on sex differences, helping researchers know where to look and understand how any such differences fit into the broader patterns found across species (Archer, 2019; Buss, 1995; see Counterpoint 8 and this Sexual Personalities post).

All of this is just a preliminary to what we believe is our most important take-home (counter)point: The importance of a difference cannot be judged exclusively by the size of the effect. It does not matter if one is using d, OVL, or any other statistical index—a difference that is “large” for one purpose can be “small” for another, and there is no way to tell without knowing the context (see Del Giudice, 2019; Funder & Ozer, 2019; Prentice & Miller, 1992). Even a minuscule effect can be theoretically crucial if it decides between two rival hypotheses. On the practical side, “small” average differences can have dramatic consequences, especially at the extremes of a trait or behavior (distribution tails). For example, there is a lot of overlap between the sexes in physical aggression (d is about 0.6), and yet males commit almost 90% of all homicides worldwide (Archer, 2019; Duntley & Buss, 2011). For an in-depth discussion of the relations between average differences and tail differences, see Voracek et al. (2013) and Del Giudice (2019). Conversely, even nominally “large” differences can be inadequate if the goal is, say, accurate classification: when d is 1.0, the PCC is about 69%, higher than chance but far from ideal.
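The link between average differences and tail differences can be illustrated with a short calculation. The sketch below is a deliberately simplified model, not an analysis of homicide data: it assumes two equally sized groups with normal distributions and equal variances, and the function name share_above_cutoff and the cutoffs are chosen for illustration. Real distributions may also differ in variance and tail shape.

```python
from statistics import NormalDist

_NORMAL = NormalDist()  # standard normal distribution


def share_above_cutoff(d: float, cutoff: float) -> float:
    """Share of the higher-scoring group among everyone above a cutoff.

    Assumes two equally sized groups with normal distributions and unit
    variances; the lower group has mean 0 and the higher group mean d.
    The cutoff is in standard deviations from the lower group's mean.
    """
    p_high = 1 - _NORMAL.cdf(cutoff - d)  # proportion of higher group above cutoff
    p_low = 1 - _NORMAL.cdf(cutoff)       # proportion of lower group above cutoff
    return p_high / (p_high + p_low)


# Hypothetical example: d = 0.6 evaluated at increasingly extreme cutoffs
for cutoff in (0, 1, 2, 3):
    print(cutoff, round(share_above_cutoff(0.6, cutoff), 2))
```

In this simplified model, the same d of 0.6 produces a modest imbalance near the middle of the distribution and a much stronger one at extreme cutoffs, which is the general pattern described above.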

A note on terminology: We are puzzled by Fine and colleagues’ claim that the phrase “sexual dimorphism” can only be used when the male and female distributions are “distinct populations” (i.e., they show virtually no overlap). We are not sure about the source of this claim (which was also made by Hyde et al., 2019), but in biology, the term is routinely used to describe any systematic difference between the sexes, regardless of the amount of overlap. To quote from a recent encyclopedia of evolutionary biology: “Any trait that differs on average between sexes is considered sexually dimorphic, even if the trait distributions overlap considerably between sexes. Height in humans provides a familiar example of this type of sexual dimorphism” (Fairbairn, 2016, p. 105). This conception of sexual dimorphism being variable and one of degree is fundamental to evolutionary biology, and the source of this variation across species is a hotly contested area of research (e.g., Colwell, 2000; Stephens et al., 2009).

Summary of Counterpoint 2: The size of sex differences can be summarized with a number of different indices, which may be more useful and informative than overlap. Unless corrected, measurement error can substantially reduce the apparent size of differences. Crucially, the importance of a sex difference cannot be judged exclusively by the effect size. “Sexual dimorphism” doesn’t mean what some scientists think it means.

Counterpoint 3: What Type of Difference are We Talking About?

Fine and colleagues correctly note that sex differences found at one age might not be found at another, but they seem to imply this means that even when differences are found they are ephemeral. It seems only universal, stable, and constant sex differences would be “real” in their view. We would counter that evolutionary biologists have known about the age-dependent emergence of many sex differences since before Darwin. As Darwin described 150 years ago:

“There is ... striking parallelism between mammals and birds in all their secondary sexual characteristics, namely in their weapons for fighting with rival males, in their ornamental appendages, and in their colors. In both classes, when the male differs from the female, the young of both sexes almost always resemble each other, and in a large majority of cases resemble the adult female. In both classes, the male assumes the characters proper to his sex shortly before the age for reproduction” (Darwin, 1871, Vol. II, p. 297)

The early similarities result from natural selection, in many cases as related to predator avoidance (e.g., camouflage plumage) and other challenges to survival. The sex differences that emerge just before the age of reproduction are typically related to sexual selection (see Counterpoints 7 and 8 below). Human sex differences also wax and wane across other life stages, including middle childhood (Del Giudice, 2014) and post-reproduction (e.g., menopause; Cant & Croft, 2019). These age-sensitive shifts do not prove the ephemeral nature of human sex differences. Evolutionary theories expect that sex differences will not be stable across the life course but will change depending on the biological tasks associated with each stage (e.g., play learning, mating, mate retention, parenting, grandparenting, and so forth; Geary, 2010).

Fine and colleagues also highlight that many sex differences appear dependent on cultural or ecological factors. Readers might infer that cultural variability indicates that a particular sex difference is “transient” and so not the product of adaptive mechanisms. This is a very common misconception about evolved sex differences. We would counter that there are many reasons why adaptive sex differences can be culturally variable (Schmitt, 2015). Indeed, even truly obligate sex differences (i.e., sex differences that similarly emerge across cultures; Lippa, 2009) are never immutable. A continuous interplay of biological and cultural factors almost always underlies the development of sex differences, so environments frequently mediate the degree to which evolved sex differences emerge.

More importantly, in many cases, the size of psychological sex differences varies across cultures as a direct result of psychological adaptations that are designed in ways that are sensitive to social and ecological factors (Schmitt, 2015). For instance, although sex differences in height are largely obligate across cultures, they can be biologically suppressed in cultures with especially poor nutrition and extremely stressful ecological conditions. As Gaulin and Boster (1992) noted in a review of sex differences in stature across 155 human societies, “substandard nutrition could cause individuals to fall short of their genetically set growth potential, and, importantly, males seem to be more sensitive to such developmental perturbations than females” (p. 474; see also Geary, 2015). Hence, in high-stress ecologies, sex differences in height can be greatly reduced. To infer that sex differences in height are merely “transient” and unrelated to our biological evolution, however, would be a serious mistake.

We would also point out that cultural variability (or lack thereof) in human sex differences can be theoretically informative for the non-evolutionary theories favored by Fine and colleagues. Social role theory, for instance, expects that perceived gender roles, gender socialization processes, and socio-structural power differentials act on the androgynous, blank-slate minds of boys and girls (Eagly & Wood, 1999). Eagly and her colleagues have argued that “men and women have inherited the same evolved psychological dispositions” (Eagly & Wood, 1999, p. 224), that “it is likely that extensive socialization is required to orient boys and girls to function differently” (Wood & Eagly, 2002, p. 705), and that “[the] demise of many gender differences with increasing gender equality is a prediction of social role theory” (Eagly et al., 2004, p. 289).

Several cross-cultural research findings are relevant for evaluating this prediction of social role theory, including patterns of personality sex differences. In almost every case examined, including subjective well-being, sex differences are conspicuously larger in cultures with more egalitarian gender roles, gender socialization, and sociopolitical gender equity (Schmitt et al., 2017). Strikingly similar results are found for personal values such as benevolence and power (Schwartz & Rubel-Lifschitz, 2009), personal preferences (Falk & Hermle, 2018), and more objectively measured sex differences in cognitive abilities (e.g., mental rotation, mental location) and physical traits (e.g., height, blood pressure; see Schmitt, 2015). As an explanation of how the size of psychological sex differences varies across cultures, social role theory seems woefully inadequate. In contrast, several evolutionary theories (sexual selection theory, life history theory, sex ratio theory, strategic pluralism theory, psychosocial acceleration theory, mismatch theory, and others that combine features of facultative adaptation, evoked culture, phenotypic plasticity, and reaction norms) have proven useful at predicting and explaining cross-cultural variations in the size of psychological sex differences (see Schmitt, 2015).

Summary of Counterpoint 3: Age-related, cross-cultural, and contextual variability are real and important. However, evolved sex differences need not be fixed and context-independent. Many evolved sex differences are expected to be expressed only at some life stages or to vary in systematic ways according to social and ecological factors.

Counterpoint 4: Where Do Differences Come From?

Fine and colleagues consider how sex differences in the brain and behavior develop. They state that “a common assumption is that the differences are caused at least in part by genetic and hormonal differences between the sexes.” In fact, it is always the case that sex differences in the brain and behavior are due in part to genetic and hormonal differences, but this can happen in several ways.

One way is via the direct influence of sex hormones on brain development. Androgens (including testosterone) and estrogens circulate in the bloodstream to their target tissues. Target tissues are simply those in which cells produce receptors for the hormone, and animal brains are full of sex hormone receptors. Sex hormones enter cells where they bind to their receptors, and the hormone-receptor complex then binds to DNA to regulate gene expression and direct development along sexually differentiated patterns.

Fine and colleagues rightly point out that sex hormones (and sex chromosome complements) can also engender sex differences through less direct routes. One such route is through affecting other traits or processes that themselves directly influence the brain and behavior. Alternatively, genetic and hormonal factors may affect a person’s sensitivity to aspects of the environment, ranging from sensitivity to painful stimuli (which shows robust sex differences; Dance, 2019) to preferences for stereotypically male or female toys and activities, male or female playmates, and even role models (e.g., Hines et al., 2016; Pasterski et al., 2011). Thus, even when boys and girls have the same toys, activities, and people available in their environment, they tend to choose different ones to focus on and interact with (Todd et al., 2018). And of course, the different experiences that boys and girls have as a result can produce or amplify other sex differences (for example in motor and cognitive abilities such as location memory, mental rotation, or manual dexterity).

Perhaps more obviously, sex hormones and sex chromosomes influence outward appearance, and this affects treatment by parents and others. Naturally, this includes social expectations about gender roles and behaviors. Fine and colleagues seem to assume that those expectations are “socially constructed,” but instead they may derive—at least in part—from a long history of correct observations about the typical behavior of males and females (for discussion of how “gender stereotypes” tend to be largely accurate, see Jussim et al., 2015; Löckenhoff et al., 2014).

The effects of gendered socialization are likely to vary considerably across behavioral and psychological dimensions, but the evidence indicates that those effects are often weaker than one might suppose (Udry, 2000). For instance, although parents’ stereotyped beliefs (e.g., boys are more agentic, girls more nurturing) seemed to modestly influence their children’s own stereotyped beliefs, these effects did not extend to the children’s sex-typed interests, such as toy preferences (Tenenbaum & Leaper, 2002). Of note, meta-analyses of the literature do not find much evidence that parents treat boys and girls in markedly different ways (Endendijk et al., 2016) or that gender equality at the level of countries impacts sex differences in toy preferences (Todd et al., 2018).

Moreover, one cannot simply assume that differences in variables such as “parental socialization; hobbies; educational background; or stereotypical beliefs about fixed male or female aptitudes, abilities, or roles” are entirely determined by environmental forces (Pirlott & Schmitt, 2014). For example, gendered interests and hobbies are themselves influenced by genetic factors and early exposure to sex hormones (e.g., Loehlin et al., 2005; Manning et al., 2017).

Fine and colleagues caution against the over-generalization of animal models to humans, which is a fair point, as we discuss below (Counterpoint 7). But a bedrock concept of modern science is parsimony. It is critical to ask why many behavioral sex differences in humans are so similar to those of many other species, none of which has gender socialization regimes like ours. If humans share with other mammals many sex differences in behavior, and if sex hormones play critical roles in organizing these sex differences and their underlying neural architecture in every mammal in which experiments have so far been conducted, then is it not likely that sex hormones do so in humans as well? Although the types of experiments conducted in nonhuman mammals would be unethical or infeasible in humans, we do have access to many so-called “natural experiments” or conditions in which sex hormone action is atypical. These data overwhelmingly point to sex hormones playing largely similar roles in humans and other mammals.

Consider the example of sexual orientation. In nonhuman animals, same-sex sexual behavior can be induced by the experimental manipulation of sex hormones. In humans, one can wonder whether sex hormones have their effects directly by influencing parts of the brain related to partner preferences, or indirectly by producing an outward appearance that contributes to specific patterns of socialization. For instance, women with complete androgen insensitivity syndrome (CAIS) have XY chromosomes and are born with undescended testes, which, in utero, produced androgen levels in the normal-high male range. However, their androgen receptors are nonfunctional, and both outward appearance and psychology, including sexual orientation, are female-typical. Are women with CAIS generally attracted to men because of the absence of androgen action on the brain, or because during their development they looked like, and so were socialized as, girls?

To help answer this question, one can compare CAIS women with other people who differed in their androgenization during development but were raised as females. This includes unaffected “control” women, as well as women who were exposed to elevated androgen levels during development due to a condition called congenital adrenal hyperplasia (CAH). And we can include natal males with male-typical prenatal sex hormones whose gender was reassigned as female in infancy (due to a circumcision accident, or to a congenital abdominal malformation called cloacal exstrophy). Of course, in a perfect experiment, assignment to hormonal “treatment” groups would be random, and neither the subjects nor those around them would know these group assignments. But the results of such a comparison are nonetheless illuminating: Among people raised as girls, the probability of attraction to women (gynephilia) in adulthood is directly proportional to the degree of androgenization during early development (Motta-Mena & Puts, 2017). Those whose early androgenization was female-typical, or lower due to CAIS, tend overwhelmingly to be attracted to men. Those whose early androgenization was male-typical but were nevertheless raised as females tend overwhelmingly to be attracted to women. And those whose early androgenization was intermediate due to CAH are intermediate in their partner preferences—in proportion to the severity of their condition:

Source: Motta-Mena & Puts (2017)

Figure 2. The proportion of individuals experiencing gynephilia (sexual attraction to women) as a function of early androgen signaling in individuals raised as females. XY = genetic male, XX = genetic female. CAIS = complete androgen insensitivity syndrome, Ctrl = women recruited without regard to diagnosis or sexual orientation, NC = non-classical CAH, SV = simple virilizing CAH, SW = salt-wasting CAH, GR = gender-reassigned natal males. Redrawn from Motta-Mena and Puts (2017).

These data suggest virtually no effect of rearing experiences on sexual orientation. This is not to say that androgen action is the only influence on sexual orientation—there is good evidence that it is not (e.g., Bogaert et al., 2018). But the fact that most males are sexually attracted primarily to females and vice versa is almost certainly due to early sex differences in androgens, and probably has very little to do with gender socialization.

To conclude this section, we briefly address Fine and colleagues’ claim that sex differences in brain anatomy “disappear or become trivial” once sex differences in total brain size are accounted for. First, the fact that brain size correlates with body size does not make sex differences in brain size any less real or potentially meaningful. Second, we believe that Fine and colleagues overstate the impact of overall brain size on patterns of sex differences. Many anatomical sex differences are still found once overall brain size is taken into account; moreover, some areas are proportionally larger in females, a fact that becomes apparent after correction (Lotze et al., 2019; Ritchie et al., 2018). Anderson et al. (2019) employed a complex analytic approach to separate anatomical variation into independent components, some of which included areas proportionally larger or denser in females. While this method does not control for differences in brain volume, it supplements them with information about relative variation; the study achieved a PCC of about 93%. In other words, the sex of 93% of the people in this study could be determined based on information on brain structure. In another study, Phillips et al. (2018) calculated a global index of sexual differentiation in the brain that explicitly corrected for total volume; the difference between males and females on that index was d = 1.8 (estimated PCC = 82%), hardly a vanishing effect (more on this in Counterpoint 5).

Summary of Counterpoint 4: The impact of gender socialization is often weaker than many psychologists assume. Moreover, socialization processes can be influenced by genetic and hormonal factors. For these reasons, socialization is not a good default explanation for sex differences. Even accounting for the complexity of hormonal and developmental mechanisms, sex hormones seem to play largely similar roles in humans and other mammals.

Counterpoint 5: How Does It All Add Up?

This section of Fine and colleagues’ article makes two points on which we agree: First, it is important to ask how differences across multiple traits “add up”; and second, sometimes the effects of two or more differences can compensate one another (at least in some contexts). One way to integrate sex differences across traits is to use multivariate effect sizes, to complement the usual focus on univariate differences (Del Giudice, 2009, 2019; see Del Giudice, 2013 for detailed answers to criticism). For example, Mahalanobis’ D is the equivalent of Cohen’s d in more than one dimension and has the same interpretation when it comes to overlap, classification accuracy, and so on (see Figure 1 above).

Multivariate indices like D show that “small” differences in many correlated traits (for example different aspects of personality, or preferences for various desirable characteristics of sexual partners) can add up to a large statistical separation between the sexes within a given psychological space. This is the case of occupational interests (D = 1.6 in Morris, 2016), mate preferences (average D = 2.4 in Conroy-Beam et al., 2015), and personality (D = 2.7 in Del Giudice et al., 2012). (Note: these studies use different techniques to correct for measurement error; see Counterpoint 2). As far as we know, these indices have yet to be used in brain research. A recent study (Phillips et al., 2018) estimated a global index of sex differentiation in brain anatomy, and the uncorrected d was about 1.8 (calculated in Del Giudice, 2019). This implies an overlap of ~37% in brain anatomy or a probability of 90% that a randomly picked male will show a more male-typical anatomical pattern than a randomly picked female.
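To show how univariate differences combine, here is a minimal sketch of Mahalanobis' D computed from a vector of standardized differences d and a pooled within-sex correlation matrix R, using D = sqrt(d' R^-1 d). The function names and all input numbers are hypothetical and chosen for illustration; they are not values from the studies cited above, which also apply corrections that this sketch omits.

```python
from math import sqrt


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):  # back substitution
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def mahalanobis_d(d_values, corr):
    """D = sqrt(d' R^-1 d) for standardized differences d and correlations R."""
    weights = solve_linear(corr, d_values)  # R^-1 d without forming the inverse
    return sqrt(sum(d * w for d, w in zip(d_values, weights)))


# Hypothetical example: five uncorrelated traits, each with d = 0.3
identity = [[1.0 if i == j else 0.0 for j in range(5)] for i in range(5)]
print(round(mahalanobis_d([0.3] * 5, identity), 2))

# Hypothetical example: two traits correlated at r = 0.5
corr = [[1.0, 0.5], [0.5, 1.0]]
print(round(mahalanobis_d([0.5, 0.5], corr), 2))   # differences in the same direction
print(round(mahalanobis_d([0.5, -0.5], corr), 2))  # differences in opposite directions
```

The two-trait examples illustrate that D depends on the pattern of correlations as well as on the size of the univariate differences: with positively correlated traits, differences in opposite directions yield a larger D than differences in the same direction.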

As a useful analogy to understand how this works, consider sex differences in facial appearance. Just as with behavior, males and females are not very different if one examines one trait at a time: there is a lot of overlap in the size of the eyes, the thickness of the eyebrows, or the width of the nose. But the total effect of many such differences is to make male and female faces so distinct that, despite a lot of individual variation within each sex, observers can correctly guess the sex of a person from his or her face more than 95% of the time (Bruce et al., 1993).

Fine and colleagues are not particularly interested in this approach to aggregation. Their focus is not on general patterns but on variability and exceptions; the view they advocate is that, when it comes to differences in brain and behavior, most people are male-female “mosaics”. The term “mosaic” is memorable but vague and can be used to imply a number of different things, some correct, others less so. If it is used as a reminder that distributions overlap and there are individual differences within each sex, then we have nothing to object to. In another context, one of us has argued that the profile of behavioral and cognitive traits that characterizes the autism spectrum (a strongly male-biased collection of disorders) can be usefully viewed as a mosaic of male- and female-typical features (Del Giudice, 2018).

However, we disagree when Fine and colleagues stretch this idea to imply that one cannot speak of male and female “natures” (or brains) in a statistical sense. The studies they cite to support their mosaic view are based on a crude statistical procedure (“internal consistency analysis”) that simply doesn’t work as advertised, as some of us have shown in detail (Del Giudice et al., 2015, 2016; Del Giudice, 2019; see also this Sexual Personalities post). Researchers are well advised to avoid this procedure, and readers should be skeptical of studies that employ it. Faces are also “mosaics” of features with lots of individual variation: if Fine and colleagues were right, it wouldn’t make sense to talk about male and female faces, despite the obvious fact that they are readily distinguished (and can be easily sorted from more masculine to more feminine within each sex).

Notably, it is now possible to correctly identify a person’s sex from brain scans from 80% to more than 90% of the time, depending on the methods used and the resolution of the data (see Del Giudice, 2019). To describe these figures as “accuracy above chance”, as Fine and colleagues do in the article, is quite an understatement. In fact, these findings suggest that male and female brains may be almost as anatomically distinct as male and female faces; classification this accurate would not be possible if the “brain mosaic” hypothesis were correct. Of course, the functional differences between male and female brains need not be mirrored by anatomical differences; regions that look the same in brain scans can still show different patterns of activity and gene expression. All sex differences in behavior, regardless of their origin, ultimately must correspond to some differences in the brain, either at the macroscopic or microscopic level.

Summary of Counterpoint 5: A multivariate approach to sex differences complements and extends the standard focus on univariate effect sizes, and reveals how many smaller differences can add up to substantial effects. Each sex contains a lot of variation and meaningful exceptions to general patterns, but the strong version of the “mosaic” hypothesis is based on questionable methodology and almost certainly false.

Counterpoint 6: What Does a Brain Difference Mean?

We fully agree with Fine and colleagues: Interpreting brain data is hard, and one should be very careful in jumping to conclusions about behavior from differences in brain anatomy or activation patterns. That does not mean that we should not do our best to integrate results from neuroscience with those from behavioral studies. Also, we urge caution in attributing differences in scientific views and hypotheses to the researchers’ presumed “stereotypes”.

Counterpoint 7: Animal Comparisons

Studies of animal behavior and physiology can be invaluable, but (as rightly noted by Fine and colleagues) drawing comparisons and analogies between different species is far from a trivial endeavor. Many neurobiological mechanisms and behaviors are conserved among related species (or have evolved in the same direction because of similar selection pressures); and yet, the particular ecology and evolutionary history of each species make it distinctive, at least in some respects. In humans, some traits are unique (e.g., complex language), some are shared with a number of other animals, and some are widely conserved across species (e.g., aggression, pain). However, even the basic traits that we share with a large number of other species have been shaped and modified by our specific ecology.

In short, we fully agree that comparisons across species must be done with due care and sophistication. But the only way to know which comparisons make sense, which are more or less likely to be informative, and how to correctly interpret the empirical data is to use the tools and knowledge base of evolutionary biology. To take just one example, consider the evolution of human families and parenting. Clearly, it is important to study the social and parental behavior of other primates (see Chapais, 2009 for a detailed analysis). But a lot of valuable evidence comes from other group-living mammals and, less intuitively, from the many species of birds that have evolved systems of shared parental care (e.g., Burkart et al., 2017; shared parental care is rare in mammals but very common in birds). Similarly, birds can be just as useful as primates to investigate the factors that explain human brain evolution (e.g., Dunbar & Shultz, 2007; Emery et al., 2007; Isler & Van Schaik, 2014). Our point is that an evolutionary background is essential to draw meaningful cross-species comparisons; arguably, some limitations of research on “animal models” (e.g., mice) of human traits and diseases stem precisely from neglect of evolutionary considerations (e.g., Perlman, 2016). In light of this, we are puzzled by Fine and colleagues’ dismissive attitude toward evolutionarily informed work (see Counterpoint 8): they correctly diagnose the disease but turn away from the cure.

Fine and colleagues stress that sex-related behaviors in animals show “a wide diversity of patterns”. This is true as far as it goes but might seem to imply that variation across species is random and unpredictable (to the point that researchers can just pick whatever species will happen to reinforce their stereotypes). On the contrary, variation in sex-related behavior is systematic and predictable; there are powerful theoretical tools (most notably sexual selection and life history theory) that help explain the recurrent patterns as well as the exceptions. A great illustration of the lawful nature of this variability comes from a meta-analysis by Janicke et al. (2016):

Figure 3. Sex-biased sexual selection across species. The three columns show different indices of sexual selection; positive values (to the right of the vertical lines) indicate stronger selection in males. (Janicke et al. 2016).

This study synthesized evidence from dozens of species (including humans) and showed that—as predicted by theory—males tend to undergo stronger sexual selection than females (that is, their reproductive success is more variable, with a stronger relationship between mating and reproductive success). Moreover, sex differences in sexual selection predict which sex invests more in parenting (typically females) and which produces more elaborate sexual displays (typically males). Of course, there are exceptions to this pattern (as can be seen in the figure), but they are not random and can be explained by the particular ecologies and mating systems of “role-reversed” species.
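For readers who want to see what indices of this kind look like in practice, here is a minimal sketch of two quantities commonly used in this literature: the opportunity for selection (the variance in reproductive success divided by the squared mean) and the Bateman gradient (the slope of the regression of reproductive success on mating success). The numbers below are made up for illustration and are not data from Janicke et al. (2016); the function names are ours, and published analyses typically involve additional standardization steps that are not shown here.

```python
from statistics import fmean


def opportunity_for_selection(reproductive_success):
    """Variance in reproductive success divided by the squared mean."""
    mean = fmean(reproductive_success)
    variance = fmean((x - mean) ** 2 for x in reproductive_success)
    return variance / mean ** 2


def bateman_gradient(mating_success, reproductive_success):
    """Least-squares slope of reproductive success on mating success."""
    mean_m = fmean(mating_success)
    mean_r = fmean(reproductive_success)
    covariance = fmean(
        (m - mean_m) * (r - mean_r)
        for m, r in zip(mating_success, reproductive_success)
    )
    variance_m = fmean((m - mean_m) ** 2 for m in mating_success)
    return covariance / variance_m


# Hypothetical toy population (invented numbers, for illustration only).
male_mates, male_offspring = [0, 1, 1, 2, 4], [0, 1, 2, 3, 8]
female_mates, female_offspring = [1, 1, 2, 1, 2], [2, 3, 3, 2, 4]

if __name__ == "__main__":
    for label, mates, offspring in [
        ("males", male_mates, male_offspring),
        ("females", female_mates, female_offspring),
    ]:
        print(
            label,
            round(opportunity_for_selection(offspring), 2),
            round(bateman_gradient(mates, offspring), 2),
        )
```

In this invented example both sexes have the same average number of offspring, but male reproductive success is far more variable and rises more steeply with the number of mates. That is the pattern described in the text as stronger sexual selection in males; in a “role-reversed” species the two sexes would swap places.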

In sum, there is a powerful and extensively studied set of evolutionary processes that can explain patterns of sex differences (and similarities) in sexually reproducing species (see Geary, 2010). The differences themselves can vary from one species to the next, but the evolutionary pressures (e.g., competition for mates) and the proximate mechanisms (e.g., sex hormones) that contribute to their emergence are largely the same. A superficial emphasis on cross-species variability can easily miss the central point: Patterns of sex differences across species are not random and disjointed; they are systematic and unified by evolutionary theory.

Summary of Counterpoint 7: Sex-related behaviors are highly variable across species, but this variability is systematic and predictable. Evolutionary biology provides powerful tools (including sexual selection theory) that help researchers make sense of cross-species comparisons and interpret the empirical data in meaningful ways.

Counterpoint 8: Evolutionary Explanations

In the rest of their article, Fine and colleagues raise a number of sensible, if one-sided, points about sex differences research. Unfortunately, in the last section, they rehash some of the oldest and laziest arguments against evolutionary psychology. From their summary, one would believe that evolutionary work in psychology is entirely speculative, based on a rigid genetic determinism, and amounts to little more than a post-hoc rationalization of present-day stereotypes. In their view, the only kind of evidence that would support evolutionary hypotheses is more or less impossible to obtain, so case closed. Of course, if the human brain and behavior are the products of biological evolution, giving up on all evolutionary explanations because the evidence is less than perfect means that we will never reach an accurate understanding of sex differences (or, for that matter, most psychological traits).

Sexual selection is a powerful explanation of sex-related behavior across species (Counterpoint 7); in combination with other evolutionary concepts, it can help address many of the questions identified in the rest of Fine and colleagues’ article: When do we expect sex similarities versus differences? What species are useful candidates for comparison? Why are some sexually dimorphic traits expressed differently in different contexts and life stages? And so on. Brushing it off as “baseless speculation” may not be the best way to make progress.

To illustrate their criticism, the authors describe a study of sexual imprinting in lambs and goats. They seemingly imply that no evolutionary psychologist would ever consider this kind of non-genetic explanation for the development of sexual preferences. In reality, sexual imprinting has been investigated in several studies as a possible mechanism in humans; as it turns out, the evidence that it actually happens in our species is very weak (Rantala & Marcinkowska, 2011; Zietsch et al., 2011). Mathematical models show that imprinting is likely to evolve in some ecologies but not others, depending on its costs and benefits (e.g., Chaffee et al., 2013; Gómez-Llano et al., 2016). (Fine and colleagues also use terms like “heritable” and “inherited” in a somewhat confusing way, and appear to wrongly suggest that evidence of heritability—that is, genetic variance—is required to conclude that the development of a trait is under genetic control.)

A detailed defense of evolutionary psychology would take a much longer post, and the criticisms raised by Fine and colleagues have been addressed repeatedly over more than thirty years (see Andrews et al., 2002; Durrant & Haig, 2001; Ketelaar & Ellis, 2000). Our main suggestion to readers unfamiliar with evolutionary psychology is to take a look at some of the actual work in this area, and not simply rely on second-hand accounts by critics. For an engaging introduction, we recommend the recent book by Stewart-Williams (2018), The Ape that Understood the Universe: How the Mind and Culture Evolve. For a thorough analysis of human sex differences from the standpoint of sexual selection, see Geary (2010). For readers with an academic background, the state of the art is summarized in the handbook edited by Buss (2016). For a brief introduction to how hypotheses are formulated and tested in evolutionary psychology, see Lewis et al. (2017) and Machery (2011).

Evolutionary hypotheses inevitably refer to the past, and they are hard or impossible to test directly with a single type of study. For this reason, the ideal approach is to collect convergent evidence from a host of different sources—from mathematical models and cross-species comparisons to cross-cultural and physiological studies (Schmitt & Pilcher, 2004). The point is summarized in this figure:

Figure 4. Sources of evidence that can be used to evaluate adaptive hypotheses. (Schmitt & Pilcher 2004).

Like every other discipline, evolutionary psychology has given rise to a multitude of theories and hypotheses, spanning the whole range from solid and well supported, to tentative and speculative, to implausible or just plain wrong. We conclude our rejoinder by linking some synthesis papers on the evolution of sex differences in aggression: A classic review by Archer (2009); two papers by Puts (2010) and Hill et al. (2017) that focus on aggression in males; and three papers by Campbell (1999), Cross and Campbell (2011), and Campbell (2013) that deal specifically with female aggression. What all these papers show is how evolutionary hypotheses can be evaluated and refined by integrating multiple sources of evidence. These papers, and many others like them, also illustrate the clarifying light that a well-developed theory can shine on human sex differences, and in doing so provide a coherence to human development and behavior that otherwise could never be achieved.

Summary of Counterpoint 8: There is much work to be done on the evolution of human sex differences. This work will involve multiple sources of evidence including rigorous and appropriately situated cross-species comparisons, cross-cultural surveys, and lab-based neuroscience. Let’s get to it, let’s do it right, and let’s interpret future empirical findings with proper balance and nuance.