Neven Sesardic is professor of philosophy at Lingnan University, Tuen Mun, NT, Hong Kong. He is the author of Making Sense of Heritability (Cambridge University Press, 2005).

Rafael De Clercq is associate professor and head of the Department of Visual Studies, and adjunct associate professor of philosophy, Lingnan University, Tuen Mun, NT, Hong Kong.

The authors thank these individuals for useful comments on earlier drafts: Tomislav Bracanović, Stephen J. Ceci, Andrew Irvine, Paisley Livingston, Darrell Rowbottom, Nosson ben Ruvein, David N. Stamos, Matej Sušnik, Omri Tal, Daniel Wikler, Wendy M. Williams, and Jiji Zhang. None of these people, however, should be assumed to agree with the main claims of this paper.

Editor’s Note: This is the complete version of an article with the same title adapted for the Winter 2014 Academic Questions (vol. 27, no. 4).

A number of philosophers attribute the underrepresentation of women in philosophy largely to bias against women or some kind of wrongful discrimination. They cite six sources of evidence to support their contention: (1) gender disparities that increase along the path from undergraduate student to full-time faculty member; (2) anecdotal accounts of discrimination in philosophy; (3) research on gender bias in the evaluation of manuscripts, grants, and curricula vitae in other academic disciplines; (4) psychological research on implicit bias; (5) psychological research on stereotype threat; and (6) the relatively small number of articles written from a feminist perspective in leading philosophy journals.

In each case, we find that proponents of the discrimination hypothesis, who include distinguished philosophers in fields such as philosophy of science, metaphysics, and philosophy of language, have tended to present evidence selectively. Occasionally they have even presented as evidence what appears to be something more dubious—for example, studies supporting the discrimination hypothesis based on data that have been reported “lost” under suspicious circumstances.

It is not the aim of this paper to settle the question of the causes of female underrepresentation in philosophy. Rather, we argue that, contrary to what many philosophers claim, the overall information available does not support the discrimination hypothesis.

From Gender Disparity to Discrimination?

The Canadian Philosophical Association passed the following recommendation at its Annual General Meeting in 1992: “In any decade in any department, at least fifty percent of new permanent or probationary positions should be filled by women.”[1] In some situations, new hires were even supposed to exceed the prescribed minimum quota of 50 percent women; an additional proviso stated that “if achieving [the set goal of] twenty-seven percent female faculty by 2000 requires a hiring rate for women that is higher than fifty percent, the higher rate should be implemented.”[2] The department of philosophy at the University of Toronto went further, setting a goal of hiring at least two-thirds women, and it even managed to exceed that quota in carrying out its five-year plan.[3]

The report in which the Canadian Philosophical Association proposed these measures claims that “there is compelling evidence that philosophy’s gender imbalance is the source of bias and partiality in many of its theoretical products and that a better representation of women would help to rectify these shortcomings.”[4] However, the report gave no reference to any source of the alleged “compelling evidence.”

Obviously, in the absence of additional evidence, mere information about percentages is insufficient to prove “bias and partiality”: the percentage of women among philosophy professors (or among the recent hires) being much lower than 50 percent does not, in itself, imply that there is a bias against women in the hiring process. It may even be that the process is actually biased against men.

In her 2008 article “Changing the Ideology and Culture of Philosophy: Not by Reason (Alone),” MIT professor of linguistics and philosophy Sally Haslanger provides a table with percentages of women among the faculty of the top twenty graduate programs in philosophy in the U.S., ranging from 4 percent to 36 percent, and concludes that “the data mostly speak for themselves.”[5] But again, the data don’t speak for themselves at all. Without additional information, it is impossible to draw any conclusion.

To establish even a prima facie case for anti-woman hiring bias, it would be necessary to first compare the percentage of women among the job applicants with the percentage of women among job recipients. Only if the latter percentage is lower than the former is there prima facie evidence for hiring discrimination against women.

University of British Columbia professor of philosophy Andrew Irvine compared exactly these percentages in the Canadian academic job market over a twenty-year period ending in 1996.[6] According to his estimations, the percentage of female job recipients was on the whole higher than the percentage of female job applicants, which led him to conclude that “if systemic discrimination is occurring within contemporary university hiring, it is more likely to be occurring in favor of, rather than against, women.”[7]

In 2002, Doreen Kimura, a Canadian psychologist and professor at Simon Fraser University, surveyed thirty-six schools and departments in total at two major Canadian universities and confirmed Irvine’s finding: of the three most recent hires to fill positions in each school or department studied, women represented 29 percent of the total number of applicants, but 41 percent of all individuals hired.[8]
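To make the applicant-versus-hire comparison concrete, the two percentages Kimura reports can be turned into a relative per-applicant hiring rate. The sketch below is only an illustration: the function name is ours, and it assumes the figures describe a single pooled group of applicants and hires, which the summary above does not spell out.

```python
def relative_hiring_rate(share_of_applicants, share_of_hires):
    """Per-applicant hiring rate of a group divided by that of everyone else.

    Both arguments are fractions strictly between 0 and 1.
    A result above 1 means the group is hired at a higher rate per applicant.
    """
    # Each rate is known only up to the common factor (total hires / total
    # applicants), which cancels when the two rates are divided.
    group_rate = share_of_hires / share_of_applicants
    others_rate = (1 - share_of_hires) / (1 - share_of_applicants)
    return group_rate / others_rate

# Figures quoted above: women were 29 percent of applicants and 41 percent of hires.
kimura_ratio = relative_hiring_rate(0.29, 0.41)
print(f"women's per-applicant hiring rate relative to men's: {kimura_ratio:.2f}")
```

A value of 1 would mean equal rates for the two groups; with the quoted inputs the ratio comes out at roughly 1.7.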

Trying to resolve the same issue, psychology professor Clive Seligman went over the data on academic hiring at the University of Western Ontario from 1991–1992 to 1998–1999, and reached this conclusion:

Over the 8 years, on average: 5.4% of female applicants were appointed compared to 2.9% of male applicants; 21.7% of female applicants were interviewed compared to 15% of male applicants; and 24.9% of female applicants who were interviewed were hired whereas 19.2% of men who were interviewed were appointed. Again, the results in each of the years are remarkably consistent. Women had almost twice the chance of being hired as did men.[9]
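The figures in this passage can be checked for internal consistency: the share of applicants appointed should be close to the share interviewed multiplied by the share of interviewees hired. The following is a minimal sketch using only the percentages quoted above; the variable and function names are ours, and because the quoted numbers are eight-year averages, the product is only expected to match approximately.

```python
# Eight-year averages quoted from Seligman, in percent.
interviewed = {"women": 21.7, "men": 15.0}           # applicants who were interviewed
hired_if_interviewed = {"women": 24.9, "men": 19.2}  # interviewees who were appointed
appointed = {"women": 5.4, "men": 2.9}               # applicants who were appointed

def implied_appointment_rate(group):
    """Appointment rate (percent of applicants) implied by the two stage rates."""
    return interviewed[group] * hired_if_interviewed[group] / 100.0

for group in ("women", "men"):
    print(f"{group}: implied {implied_appointment_rate(group):.2f}%, "
          f"reported {appointed[group]:.1f}%")

# Ratio behind the statement that women had almost twice the chance of being hired.
appointment_ratio = appointed["women"] / appointed["men"]
print(f"women/men appointment ratio: {appointment_ratio:.2f}")
```

The implied rates land within a rounding error of the reported ones, and the ratio of the reported appointment rates is a little under 1.9.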

Similar results were obtained in a recent comprehensive study commissioned by the U.S. Congress to assess gender differences in the careers of science, engineering, and mathematics faculty—the area with the highest underrepresentation of women.[10] Conducted under the auspices of the National Research Council, Gender Differences at Critical Transitions in the Careers of Science, Engineering, and Mathematics Faculty included two surveys of major research universities, focusing on almost five hundred departments and more than eighteen hundred faculty members.

The authors reported that among those interviewed for tenure-track or tenured positions, the percentage of women interviewed was higher than the percentage of women who applied for those positions, and that tenure-track women in all disciplines received a percentage of first offers that was greater than their overall percentage in the interview pool.[11] The situation was the same with tenured positions in all disciplines except biology.

So we find a pattern according to which there are more women, percentage-wise, at a later stage than at an earlier stage throughout the hiring process—which is exactly the opposite of what one would expect if there were discrimination against women.

But why are there fewer women than men at the application stage in the first place? Could this be a result of discrimination? It could, but evidence is needed to support this hypothesis. Moreover, assuming that the situation is really so inhospitable for women in the academic job market, it would be odd if women were first discouraged from applying for academic jobs, only to be favored over men once they submit an application. There is room here for different explanations, including a theory that does not posit discrimination.

There are at least two different nondiscrimination scenarios in the literature that have been offered to account, at least in part, for the lower percentage of women in some academic fields: gender differences in abilities and gender differences in interests.

The first factor is a possible statistical difference between the sexes in the mental abilities that are of key importance for success in those fields. For instance, a set of findings that has been replicated a number of times shows that “males score higher on some tasks that require transformations in visual-spatial working memory…and fluid reasoning, especially in abstract mathematical and scientific domains.”[12] Several meta-analyses yield the conclusion that there are large gender differences in mental rotation and mechanical reasoning favoring males, “which some have suggested underlie sex differences in advanced math.”[13] Furthermore, due to greater male variability there are proportionately more males than females at either tail of the distribution of abilities, which mathematically entails (assuming normal distribution) that the male/female ratio will rise ever more steeply as we move to the pool of people with higher and higher abilities. For example, among seventh graders scoring a perfect 800 in the mathematics component of the SAT between 2006 and 2010, there were 6.58 males for every female.[14] The relevance of this statistical effect for our discussion is obvious, given that academics in exact sciences are recruited from those with exceptionally high mathematical abilities. It is also relevant for the situation in philosophy, because the underrepresentation of women seems to be most pronounced in the more technical areas of philosophy such as logic, decision theory, and philosophy of mathematics.[15]
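The variability point can be illustrated numerically. The sketch below assumes two normal distributions with the same mean and a spread that is 10 percent larger in one group; that 10 percent figure is purely hypothetical and is not taken from any of the studies cited here. It shows only the qualitative pattern: the ratio of the two groups above a cutoff grows as the cutoff moves further into the upper tail.

```python
import math

def upper_tail(threshold, mean, sd):
    """Fraction of a normal(mean, sd) population scoring above threshold."""
    return 0.5 * math.erfc((threshold - mean) / (sd * math.sqrt(2)))

def tail_ratio(threshold, sd_wide=1.1, sd_narrow=1.0, mean=0.0):
    """Ratio of the wider group's upper tail to the narrower group's.

    sd_wide=1.1 is a hypothetical value chosen for illustration only.
    """
    return upper_tail(threshold, mean, sd_wide) / upper_tail(threshold, mean, sd_narrow)

# Cutoffs expressed in standard deviations of the narrower group.
cutoffs = (1.0, 2.0, 3.0, 4.0)
ratios = [tail_ratio(t) for t in cutoffs]
for t, r in zip(cutoffs, ratios):
    print(f"cutoff at {t:.0f} SD above the mean: ratio {r:.2f}")
```

With equal means the ratio is exactly 1 at the mean itself and increases with every step toward the extreme, which is why a modest difference in variability can coexist with large disparities in highly selected groups. Different means or non-normal distributions would change the numbers.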

The second nondiscrimination factor that might be responsible for the gender disparity in some academic fields is the male-female difference in interests. According to a recent meta-analysis of the relevant research, there is indeed a large difference between men and women along the things-people dimension, with men more interested in things-oriented careers, in contrast to women, who tend to prefer people-oriented careers. Also, “men generally showed more Realistic and Investigative interests as well as stronger interests in the STEM areas; in comparison, women tend to have more Artistic, Social, and Conventional interests and to express less interest in the STEM fields.”[16] Other studies also report “very large differences in gender-related interests.”[17] And a similar claim about the male-female difference in interests is defended by Simon Baron-Cohen, professor of developmental psychopathology at the University of Cambridge and fellow of Trinity College, in The Essential Difference: The Truth about the Male and Female Brain.[18]

We don’t want to commit ourselves to defending any of the alternative potential explanations of the gender gap that are the subject of scholarly study at present. Our aim is solely to urge the reader to resist the call to accept the discrimination hypothesis without evaluating the totality of evidence bearing on this complex question.

To see how the mere numerical fact of gender disparity—unaccompanied by any understanding of the larger context—can move prominent philosophers to rush to a conclusion and galvanize them into urgent action unjustified by the facts, consider the following sequence of events. On June 19, 2013, Kieran Healy, a Duke University associate professor of sociology, published data on his blog showing that out of all recent citations in four prestigious philosophy journals, female authors comprise just 3.6 percent of the total.[19] Although Healy warned that “this is exploratory work” and that there are unanswered “questions about the underlying causes of any patterns that show up in the data” as well as “various comparisons that sound straightforward…but are actually quite complicated to answer properly, or imply a lot more data collection and analysis than I can do here,” Edward Zalta and Uri Nodelman, the primary editors of the Stanford Encyclopedia of Philosophy (SEP), soon learned about Healy’s data and decided the issue needed immediate attention. On July 12, Zalta and Nodelman sent an email with the subject “SEP request concerning citations” to all SEP authors, subject editors, and referees. The email included a link to Healy’s data and informed SEP collaborators that the editors take the issue of “undercitation” of women philosophers seriously. Although Zalta and Nodelman neither explained why the issue is so pressing nor clarified their objective (besides pushing some numbers up), they “encourage[d] our authors, subject editors, and referees to help ensure that SEP entries do not overlook the work of women or indeed of members of underrepresented groups more generally.” Furthermore, they urged collaborators to write to the editor “any time [they] notice a source missing from an SEP entry (whether or not it is [your] own entry).”

There are five problems here. One, Zalta and Nodelman seem to assume, without providing any evidence, that the “undercitation” of women is at least partly the result of bias, i.e., the tendency of philosophers to “overlook” women’s publications more often than men’s. Two, the way Zalta and Nodelman try to address this undercitation resembles an attempt to cure a disease without knowing its cause. Three, their action will have a perverse effect: de facto nudging many scholars to cite more female philosophers—and to report on those who fall behind in this task—may distort genuine citation patterns in the discipline and undermine the integrity of a bibliometric analysis of philosophical publications. Four, there might be another perverse effect: If SEP’s initiative to boost the citation of women’s publications becomes more widely adopted within philosophy, then philosophers who do not believe that the “undercitation” is due to sexist bias might react by correcting for what they perceive as citation inflation for a select group. As a consequence, these philosophers might start to interpret the number of citations of female philosophers as being, on average, a less reliable sign of scholarly quality than the number of citations of male philosophers. And five, it should be expected that other demographic groups would soon follow suit and demand that their “unfairly” low citation rate be similarly augmented.

Anecdotal Evidence

Sally Haslanger claims there was “a lot of outright discrimination” when she was a student, and that “blatant discrimination has not disappeared.”[20] As evidence of discrimination, Haslanger cites, for example, “occasions when a woman’s status in graduate school was questioned because she was married, or had a child (or had taken time off to have a child so was returning to philosophy as a ‘mature’ student), or was in a long-distance relationship”; and “many women who have interests and talents in metaphysics and epistemology who have been encouraged to do ethics or history of philosophy.”[21]

Needless to say, it is not our intention to deny that instances of discrimination against women in philosophy occur. For all we know, there is occasional discrimination against individuals belonging to all sorts of groups, including conservative philosophers, philosophers with a degree from less prestigious institutions, philosophers working in “marginal” areas, philosophers who are not native speakers of English, etc. After all, there is evidence pointing to discrimination against men as well (see above), although this phenomenon is rarely discussed.

What we want to question is whether this sort of anecdotal evidence supports the view that discrimination against women in philosophy is pervasive and systematic. Haslanger writes that philosophy departments “mostly” do not provide “a good working environment with mutual respect” and that “it is very hard to find a place in philosophy that isn’t actively hostile toward women and minorities, or at least assumes that a successful philosopher should look and act like a (traditional, white) man.”[22] Similarly, University of Massachusetts, Amherst, professor of philosophy Louise Antony asserts that “the discipline of philosophy marks the site of a unique convergence, intensification, and interaction of discriminatory forces.”[23]

In the evaluation of these claims, how much weight should be assigned to subjective impressions and charges, often anonymously made against unidentified male philosophers? For example, how much weight should be assigned to the reports about “men behaving badly” in philosophy found on the What Is It Like to Be a Woman in Philosophy? website,[24] which is cited as a source of evidence by Helen Beebee and Jenny Saul, of the University of Manchester and the University of Sheffield, respectively, among other scholars?[25]

First of all, there is an obvious danger of a self-selection effect here. The website declares its agenda “to do something about the situation of women in philosophy,” which obviously presupposes that the situation is bad and that a pressing need to change it exists. Therefore, people with negative experiences will be more likely to share their stories than those who see or encounter no major problems and who, consequently, have little or nothing of significance to report. This will tend to create a distorted picture of the position of women in philosophy, in the same way that a website entitled “What Is It Like to Be a Conservative in Philosophy?” probably would.

Another problematic aspect is the proclaimed policy of What Is It Like to Be a Woman in Philosophy? that “any negative stories should be told without any identifying information.”[26] Potential contributors are instructed: “Please anonymise your story as far as possible, especially if it is negative.”[27] This does not exactly inspire confidence in the truthfulness of the stories submitted, or the potential for verifying their accuracy. Therefore, increased caution is advised, especially because, thus far, those who run the website have said nothing about whether—or how—they try to corroborate the reports they receive from readers. An additional worry is that in one case (known to us), the submission of a completely fabricated story was promptly published on What Is It Like to Be a Woman in Philosophy?—apparently without any independent verification.

It is rarely mentioned that the goal of greater numbers of women in philosophy as a profession can be undermined if the situation is systematically represented as being far worse than it really is. Many of the women who have talent and a strong interest in philosophy are likely to get cold feet if they hear prominent philosophers expressing their rage about how poorly women have been treated in their discipline, which is alleged to be riddled with discrimination, sexism, and bigotry. If a potential philosophy scholar swallows these horror stories (and why shouldn’t she?), pursuing the love of wisdom would hardly remain her first career choice.

If we are right, however, that this dark picture of philosophy has not been confirmed by the facts, then exaggerated, repeated, and unchallenged claims about bias against women in philosophy will probably result in many intelligent and able young women avoiding any encounter with what they will see, unjustifiably, as an academic slum of irrationality and hatefulness.

Does Research Confirm Gender Bias in Academia?

Beebee and Saul refer to a study by Amber E. Budden et al. that reported a 7.9 percent increase in female first authors in Behavioral Ecology in the four years that followed the adoption of double-blind review by the journal, which is supposed to suggest that there is gender bias favoring male authors.[28] However, questions have been raised about this conclusion drawn from the Budden et al. study. For example, it has been observed that, in the same period, other ecology journals also published more papers by women without switching to double-blind review, which suggests that the increase might have been due to an increase in submissions by female authors.[29] In response, Budden et al. wrote that their “study was observational and that the changes occurring at the journal where double-blind review was introduced might be due to alternate variables.”[30] Moreover, other studies by Budden et al. did not confirm the existence of gender bias.[31] Significantly, Nature almost immediately retracted its earlier report on the 2008 Budden et al. study in the following way:

After re-examining the analyses, Nature has concluded that ref. 1 [Budden et al.] can no longer be said to offer compelling evidence of a role for gender bias in single-blind peer review. In addition, upon closer examination of the papers listed in PubMed on gender bias and peer review, we cannot find other strong studies that support this claim. Thus, we no longer stand by the statement in the fourth paragraph of the Editorial, that double-blind peer review reduces bias against authors with female first names.[32] (Italics added.)

In their review of the literature, Stephen J. Ceci and Wendy M. Williams conclude that “[t]he preponderance of evidence, including the best and largest studies, indicates no discrimination in reviewing women’s manuscripts.”[33] In the light of these facts, Beebee and Saul should not continue citing the Budden et al. study as authoritative.

How about grant applications? Haslanger and Saul cite a study by Christine Wennerås and Agnes Wold that is supposed to have shown that “women needed to be 2.5 times as productive as men to get a grant [from the Swedish Medical Research Council].”[34] The Wennerås and Wold study made a big impact although it was based on a rather small sample of only 114 applications submitted for postdoctoral fellowships that were to be offered in 1995. Oddly enough, it is rarely mentioned that just six months after their article appeared, Nature published a study by Jonathan Grant et al. that relied on much more comprehensive evidence. The authors looked at 1,741 grant applications to the Wellcome Trust and 1,126 grant applications to the Medical Research Council (in the UK). They concluded that “this study has shown no evidence of discrimination against women.”[35]

More recently, Ulf Sandström and Martin Hällsten investigated 280 grant applications submitted to the Swedish Medical Research Council in 2004.[36] Their conclusion is that “female principal investigators receive a 10% bonus on scores.”[37] More generally, Ceci and Williams report that “the weight of the evidence overwhelmingly points to a gender-fair review process” in grant funding.[38] Their conclusion is based on a number of smaller studies from different countries (including the abovementioned study by Grant et al.) as well as on six large-scale studies, including one by Herbert W. Marsh et al. that “found no significant gender differences in peer reviews of grant applications.”[39] A similar conclusion was reached more recently by two primary studies: one by Marsh et al., which focuses on Australian grant applications, and one by Ruediger Mutz et al., which focuses on Austrian grant applications.[40] Together these recent studies involved more than 30,000 reviews of grant applications, and neither found clear evidence of sex discrimination.

Given that so many large-sample studies fail to confirm bias against women, it is puzzling that some scholars (e.g., Haslanger and Saul) defend the discrimination hypothesis by placing so much emphasis on the small-sample research by Wennerås and Wold. There is another, somewhat eyebrow-raising problem with the Wennerås and Wold article that Haslanger and Saul fail to mention. The data on which the study was based were inexplicably lost, which made it impossible for other researchers to reanalyze the data and check the conclusions. In response to psychologist James Steiger’s request for the data, Wold explained: “They were in a computer of a guy at the Statistics department and I got them on a diskette many years ago and I am afraid I will not be able to find it anymore.”[41] Scholars do not usually publish an article in Nature and then later lose the data diskette without bothering to back up the all-important files on which such high-impact research is founded.

Beebee, Haslanger, Carole Lee, Christian Schunn, Jesse Prinz, and Saul cite a study by Rhea E. Steinpreis et al. that involved sending CVs to two groups of psychologists: one group received the CV with a male name, the other group received the same CV with a female name.[42] The psychologists tended to judge the CV more favorably when they had received it with a male name—although the difference disappeared when participants took the candidate to be a tenure applicant rather than a job applicant. Corinne A. Moss-Racusin et al. obtained a similar result in a study involving participants from the natural sciences (physics, chemistry, and biology).[43]

These results should strike one as surprising: Why does there appear to be gender bias in the evaluation of CVs, but not in the evaluation of grant applications? Whatever the explanation, the difference in sample size between the two kinds of research is a reason to place more weight on the latter: the grant studies were conducted on a much larger scale, involving thousands of reviewers (e.g., 6,000–18,000 in the cited primary studies) instead of the few hundred (127–238) that were recruited for the CV studies.

Another reason for placing more weight on the grant studies is that they are based on real-life data, whereas the CV studies are reporting results obtained in an experimental set-up. After all, what we want to find out is whether discrimination in real life is widespread. In this connection, it may be worth recalling some of the studies cited above, such as the National Research Council study, which found that women were more likely to be interviewed and to be offered jobs than men. Obviously, this is not the result one would expect if there were a tendency to favor male CVs. To be sure, the result is not inconsistent with the CV studies, because there is usually an explanation for why real-life data differ from experimental findings. For example, the experiments on which the CV studies are based have an artificial feature that may have affected the responses: participants were asked to evaluate a single (male or female) candidate, whereas real-life hiring normally involves comparisons between (male and female) candidates.

On the other hand, believers in discrimination may want to explain the difference between the experimental studies and the real-life data in terms of the quality of CVs: perhaps, in real life, female applicants have significantly better CVs than male applicants and are therefore more likely to be interviewed and hired. However, this kind of explanation also needs evidence. Moreover, there is some evidence to the contrary. For example, the productivity data collected by Ceci et al. suggest that on average men publish significantly more than women in STEM fields, at least at the ranks of assistant and full professor, and that they hold more patents.[44] For the time being, then, it seems better not to draw conclusions about real-life discrimination on the basis of the aforementioned experimental studies alone.

Implicit Association Tests

An explanation of the underrepresentation of women that many philosophers (Antony, Beebee, Haslanger, Margaret Crouch, Saul) find compelling is “implicit bias”: unconscious attitudes and beliefs that affect our explicit judgments about women.[45] As evidence of this, Antony, Beebee, Haslanger, Prinz, and Saul cite the study by Steinpreis et al. about the effect of having a female rather than male name appear on a CV.[46] Beebee and Saul cite, in addition, the study by Budden et al. about the effect of having a female rather than male name appear on a paper submitted to a journal.[47] Our previous section, however, has already cast doubt on these two studies. Moreover, neither study controls for attitudes or beliefs about women that are consciously held, so they are at best inconclusive evidence for unconscious bias against women.

Research aimed specifically at uncovering unconscious biases is also cited by Tamar S. Gendler, Haslanger, Saul, and the Implicit Bias & Philosophy International Research Project, which lists over seventy participants from philosophy.[48] In particular, they cite a psychological test that is known as the Implicit Association Test, or IAT.[49] The test is designed to measure the strength of one’s unconscious associations by comparing one’s reaction times in certain classification tasks. Typically, the test involves two related tasks. For example, one is first asked to place items (e.g., “biology”) into the categories “women or science” or “man or literature”; subsequently, one is asked to place items into the categories “women or literature” or “man or science” (of course, the order can be reversed). If one is faster in placing the correct items into the category “women or science” (first task) than into the category “women or literature” (second task), then the association between the first two concepts will count as stronger; and likewise for “man or science” and “man or literature.”

Unfortunately, building one’s case for unconscious bias against women on the IAT is risky, because the test has been the subject of serious and ongoing controversy. Roughly, concerns about the IAT are raised on three counts:

(1) measurement assumptions (for example, the way in which differences in test scores are supposed to correspond to differences in association strength between concepts such as “women” and “science”); (2) possible confounders (perhaps the difference in reaction time can be explained by factors that do not imply bias or prejudice); and (3) predictive value (i.e., whether the IAT actually predicts discriminatory behavior).[50]

It is impossible to go into all the arguments for these concerns here, but the relatively low test-retest reliability of the IAT should certainly make one wary of its evidential status. For example, according to one of the most prominent advocates of the test, Anthony Greenwald, the average test-retest reliability of the IAT is 0.56.[51] Moreover, William A. Cunningham et al., who also support the IAT, report a test-retest reliability over a two-week period that is as low as 0.27 (although they attribute this partly to measurement error).[52]
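One way to get a feel for these reliability figures is to read them through classical test theory, in which a reliability coefficient is the share of observed score variance attributable to stable "true" scores. This framing is our assumption for the sake of illustration; the sources cited above are not said to rely on it. On that reading, the remainder is error variance, and the standard error of measurement is the square root of that remainder, expressed in units of the test's standard deviation.

```python
import math

def error_variance_share(reliability):
    """Share of observed variance that is error, under classical test theory."""
    return 1.0 - reliability

def sem_in_sd_units(reliability):
    """Standard error of measurement as a fraction of the observed score SD."""
    return math.sqrt(1.0 - reliability)

# Reliability coefficients quoted in the text above.
reported = {
    "average reported by Greenwald": 0.56,
    "two-week value reported by Cunningham et al.": 0.27,
}

for label, r in reported.items():
    print(f"{label}: error share {error_variance_share(r):.2f}, "
          f"SEM {sem_in_sd_units(r):.2f} SD")
```

Under these assumptions, a coefficient of 0.56 leaves a little under half of the score variance as error, and a coefficient of 0.27 leaves nearly three-quarters.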

Saul and Gendler show some awareness of the controversy surrounding the IAT. In a footnote, Gendler mentions the possibility that the IAT may not reveal a person’s bias or prejudice but rather “ease of access to culturally-encoded associations that she both implicitly and explicitly rejects,” a possibility that connects with concern 2.[53] Also in a footnote, Saul writes:

The Implicit Association Tests (IATs) are not without critics. See, for example, Blanton and Jaccard 2006. But see also the replies in Greenwald et al. 2006, and Jost et al. 2009.[54]

This last quote calls for a few comments. First of all, if a test continues to be subject to controversy, generating widely-cited criticisms that are taken seriously even by the test’s advocates, can one really just side with the advocates, and rely on their assertions about the test without addressing the concerns that have arisen? Imagine someone writing: “I assume that innate gender differences explain why women are underrepresented in science. This theory is not without critics. See, for example, X, but see also the replies in Y and Z.” Taking such a shortcut seems especially unfortunate in the present context; first, because the reader—most likely, a philosopher—cannot be expected to be familiar with the psychological literature on the IAT; and second, because what is at issue is of great practical importance. With the exception of Gendler’s paper, all the papers cited make policy recommendations on the basis of conclusions drawn from IAT research.

Some of these recommendations call for intrusive actions. Saul reports that the chair of the Philosophy Panel of the Research Excellence Framework (the successor of the Research Assessment Exercise in the UK) is “very concerned about implicit bias” and will “attempt to implement” her recommendations.[55] Some of Saul’s own recommendations are quite innocent; for example, leaving enough time to the panel members to make assessments. However, she also mentions—sympathetically, it seems—a recommendation made by Crouch, who suggests that

individual panel members’ ratings could be periodically reviewed for patterns that might indicate bias. If such patterns are found, these could be—carefully and privately—raised with panel members, as reflecting on one’s past biased judgments is an extremely effective way of reducing bias.[56]

It seems to us that such a recommendation is problematic for at least two reasons: first, because it is very hard to infer bias merely on the basis of a statistical pattern, especially when samples are small and when it’s not clear in advance what the “expected” or “fair” allocation among groups should look like in a given situation; second, fear of being “periodically reviewed” could easily influence reviewers to abandon the pure evaluation by merit and adopt an unofficial quota system—just in order to avoid unpleasant conversations and accusations of sexism. All in all, one can only hope that the chair who is “very concerned about implicit bias” will not implement these or similar recommendations while so many open empirical and methodological issues exist concerning IAT research and its potential practical relevance.

In a similar vein, the British Philosophical Association has recently issued the following recommendation: “Departments should make sure that those involved in teaching [and hiring] know about the workings of unconscious bias. Information about and discussion of gender bias should be included in any training or induction sessions run by the department for staff”[57] (italics added). But unconscious bias should not be presented as an established fact: Departments should not make sure that their members “know” about the workings of something that is heavily disputed by many psychologists.

Some universities have moved beyond recommendations and have already implemented institutional measures to block the alleged discriminatory effects of implicit associations. For example, Oxford University has established “the Vice-Chancellor’s Diversity Fund,” whose major use is “to address the under-representation of women in senior research and academic posts.” One of “a larger scale approaches” supported by the fund involves “providing specific and widespread training in the avoidance of unconscious bias in recruitment.”[58] Again, given the current disagreements about the nature and implications of implicit associations, it seems definitely premature to organize massive training in the avoidance of something that, for all we know, may not exist.

Returning to the literature on the IAT, it’s notable that the 2009 article by John T. Jost et al.—cited by Saul as a rejoinder to IAT critics—was not the last word written on the subject.[59] In the same year, Hart Blanton et al.’s meta-analysis concluded that the IAT does “not permit prediction of individual-level behaviors.”[60] To be sure, their meta-analysis did not cover all studies. One study that they wanted to include was by Laurie Rudman and Peter Glick, which Jost et al. claimed “no manager should ignore.”[61] But the data for Rudman and Glick’s study had been “lost.”[62] In 2013, a larger meta-analysis appeared that concluded that IATs are “poor predictors” of discriminatory behavior.[63]

The above considerations are not meant to resolve debates about the IAT. Our aim here is to show that all such debate is, at present, unsettled. As a result, we cannot simply assume that the existence of implicit bias against women has been established by the IAT. Note, moreover, that the mere existence of implicit bias cannot be taken to explain the underrepresentation of women in philosophy as long as its effect has not been shown to be strong enough to compensate for the explicit resistance to bias that one generally finds among academics in the developed world. For example, Prinz notes that “[m]ost of us believe that all people deserve equality of opportunity regardless of sex,” and Saul observes that “explicit commitments to egalitarianism are widespread” among academics.[64] Indeed, given that most academics (especially in the humanities) are liberal or left-leaning,[65] one would expect them to go out of their way to eliminate even a semblance of bias against women. If unconscious bias were responsible for underrepresentation, its effect would have to be quite powerful. For the time being, we see little evidence to support this hypothesis.

Stereotype Threat

The phrase “stereotype threat” refers to a situation in which subjects tend to underperform on a given task because they are afraid of confirming a negative public stereotype about their group. A number of philosophers think that this phenomenon plays an important role in the underrepresentation of women in philosophy. They present stereotype threat as what “substantial research in psychology has shown,”[66] as “a well established psychological phenomenon,”[67] and as supported by “a well-established body of research in psychology.”[68] In a similar vein, they claim that “there is a considerable body of evidence” that people are subject to stereotype threat,[69] that “the effects of stereotype threat are dramatic,”[70] and that “there is good reason to suppose” that “stereotype threat play[s] a role in perpetuating the under-representation of women in the field.”[71] The Rutgers philosophy department’s website has a “Climate for Women and Underrepresented Groups at Rutgers” page that takes the existence of stereotype threat for granted and describes “some situations in which stereotype threat can be triggered.”[72]

Although no research about stereotype threat in philosophy has been done, the argument is that since the phenomenon has been empirically confirmed in other academic disciplines, it is safe to assume that the same effects must be present in philosophy as well. But the basis for this extrapolation is more dubious than many believers in stereotype threat think. In reality, there is a lot of skepticism among psychologists about the significance of stereotype threat, and even about whether it exists.

A number of studies were unable to replicate the stereotype threat effect, particularly in so-called high-stakes situations, which are most relevant for potentially explaining real-life disparities between men and women in academia.[73] One of the scholars who got negative results for stereotype threat is John A. List, professor of economics at the University of Chicago, who comments:

So we designed the experiment to test that, and we found that we could not even induce stereotype threat. We did everything we could to try to get it. We announced to them, “Women do not perform as well as men on this test and we want you now to put your gender on the top of the test.” And other social scientists would say, that’s crazy — if you do that, you will get stereotype threat every time. But we still didn’t get it.[74]

So, what is going on here? Why do different empirical studies get contradictory results? List offers this explanation:

I think that stereotype threat has a lot of important boundaries that severely limit its generalizability. I think what has happened is, a few people found this result early on and now there’s publication bias. But when you talk behind the scenes to people in the profession, they have a hard time finding it. So what do they do in that case? A lot of people just shelve that experiment; they say it must be wrong because there are 10 papers in the literature that find it. Well, if there have been 200 studies that try to find it, 10 should find it, right?[75]

And indeed, the problem of publication bias (also known as “the file drawer problem”) is acutely present in this area of research. For example, in a recent study on the possible stereotype threat among young females, Colleen M. Ganley et al. warn that there is “serious concern” that the alleged effect might be an illusion based on publication bias. Ganley et al. point out that while published articles had a tendency to confirm the existence of stereotype threat, none of the three unpublished dissertations showed that effect.[76] They also complain that they were unable to perform a meta-analysis because the number of available empirical investigations was too small and because many of the studies did not provide information necessary for calculating effect sizes.[77]
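List's back-of-the-envelope point can be made explicit with a minimal sketch. It assumes, purely for illustration, that every study tests a nonexistent effect, that the studies are independent, and that each uses the conventional .05 threshold; the figures 200 and 10 are List's hypothetical numbers, not counts from the actual literature, and the function names are ours.

```python
from math import comb


def expected_false_positives(n_studies: int, alpha: float = 0.05) -> float:
    """Expected count of 'significant' results when no true effect exists."""
    return n_studies * alpha


def prob_at_least(k: int, n_studies: int, alpha: float = 0.05) -> float:
    """Binomial probability that at least k of n null studies reach significance."""
    return sum(
        comb(n_studies, i) * alpha**i * (1 - alpha) ** (n_studies - i)
        for i in range(k, n_studies + 1)
    )


if __name__ == "__main__":
    # List's hypothetical: 200 attempts at the conventional .05 level
    # (assumed here to be independent tests of a true null).
    print(expected_false_positives(200))
    print(prob_at_least(10, 200))
```

Under these assumptions, ten significant findings out of two hundred attempts is simply what chance alone would be expected to produce, which is why unreported null results matter so much for interpreting the published record.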

Another article giving an overview of the literature on gender-based stereotype threat in mathematics raises further methodological concerns and recommends caution. Gijsbert Stoet and David C. Geary undertook to analyze and evaluate all attempts to date to replicate the results of the first and most widely cited study, by Steven J. Spencer, Claude M. Steele, and Diane M. Quinn, of the alleged stereotype threat affecting women’s math performance.[78] Stoet and Geary warn about several serious shortcomings endemic in this research: an incomplete description of results (no reports about means or standard deviations), significance values being relaxed when the data matched the hypothesis, a biased presentation whereby the significant measure was highlighted in the text and abstract while the nonsignificant one was relegated to a footnote, etc. Stoet and Geary single out for criticism those scholars who draw conclusions about stereotype threat in women from experiments that did not have a control group (men). They point out, correctly, that this kind of inference is as logically problematic as “if one would conclude that a study with people who all wear clothes says something unique about people wearing clothes.”[79]

Moreover, the introduction of the control group (men) is imperative in stereotype threat research not just because the research question is itself essentially comparative (are women especially vulnerable to stereotype threat?) but also because striking and unanticipated empirical results might otherwise be completely missed. For instance, in one non-laboratory study conducted on a nationally representative sample, it was only by looking at both females and males that it was possible to see that the gender gap in math test performance actually shrinks when both girls and boys are exposed to the female-negative prime “math is for boys.”[80]

One possible explanation for this finding is that the “math is for boys” prime triggers a fear in male participants of disconfirming a positive stereotype about their group. After all, if fear of confirming a negative stereotype is a real and pervasive phenomenon, then why not also fear of disconfirming a positive stereotype? One’s personal reputation seems to be more at risk when a positive stereotype is salient: failing to live up to high expectations may be more embarrassing than simply confirming low expectations. In fact, there is empirical evidence that praise can actually undermine children’s motivation and even lead them to “sabotage future performance in order to resolve a discrepancy between the statement of praise and more realistic beliefs about the self.”[81] Nonetheless, the fear that positive stereotypes might induce is never mentioned by authors invoking stereotype threat as an explanation of underrepresentation.

To return to Stoet and Geary’s overview of the literature, their conclusion is not good news for proponents of the stereotype threat hypothesis. Stoet and Geary found only twenty studies that (a) addressed gender-related stereotype threat in adult math performance, and (b) had a similar research design to the seminal study on stereotype threat by Spencer, Steele, and Quinn. Only eleven of those twenty studies (55 percent) replicated the effect of stereotype threat (at the conventional .05 significance level). And after also excluding those studies that raise methodological worries because they selected as subjects men and women who were known to have equal, previously measured math scores,[82] the number of studies was narrowed from twenty to ten. Of the remaining ten, only three replicated the original results.
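The two replication rates just reported can be checked with a few lines of arithmetic. This is only an illustrative sketch: the counts are the ones quoted above from Stoet and Geary, while the function and variable names are our own.

```python
def replication_rate(replicated: int, total: int) -> float:
    """Share of studies that reproduced the original effect."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= replicated <= total:
        raise ValueError("replicated must lie between 0 and total")
    return replicated / total


# Counts as reported in the text (names are illustrative).
ALL_COMPARABLE = (11, 20)       # all studies with a comparable design
AFTER_EXCLUSIONS = (3, 10)      # after dropping the methodologically worrying ones

if __name__ == "__main__":
    for label, (hits, n) in [("all comparable studies", ALL_COMPARABLE),
                             ("after exclusions", AFTER_EXCLUSIONS)]:
        print(f"{label}: {hits}/{n} = {replication_rate(hits, n):.0%}")
```

The second figure, three out of ten, is the one most relevant to the conclusion Stoet and Geary draw.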

Given this outcome, it is not surprising that Stoet and Geary conclude that the stereotype threat hypothesis has not been confirmed, even if the clear danger of publication bias is disregarded: “Even when assuming that all failures to replicate have been reported, we can only conclude that evidence for the stereotype threat explanation of the gender difference in mathematics performance is weak at best.”[83]

Stoet and Geary actually go further and caution that the frequently overenthusiastic support of psychologists for an empirically dubious claim is likely to damage the public image of the whole discipline: “After all, the aforementioned types of flaws in scientific work will ultimately damage the reputation of the whole field of psychology as a science.”[84] Speaking about threats to psychology’s reputation, an article about stereotype threat published in one of the top journals in social psychology has recently been retracted after it was established, in a highly publicized scandal, that one of its co-authors, Diederik A. Stapel, was involved in collecting fraudulent data for some of his other publications.[85]

The evidence invoked in support of stereotype threat comes mainly from psychological lab experiments. Even if this evidence didn’t have all the abovementioned problems, it would by no means follow that the effects found in the lab could be immediately generalized to real-life situations. First, it is possible that the priming effects could be explained by the “experimenter demand effect,” according to which participants sometimes try to satisfy the experimenter’s expectations and behave accordingly.[86] Second, stereotypes might not be salient enough to trigger the hypothesized stereotype threat mechanisms.[87] And third, in high-stakes situations like SAT and GRE exams or job interviews, the importance of the occasion and the increased motivation that comes with it might offset the effect of stereotype threat and make it disappear.[88]

Finally, even if it were shown that women suffered from stereotype threat in real-life situations, this by itself would not entail that underrepresentation of women in, say, philosophy would be to a substantial extent explainable by stereotype threat. For all we know, stereotype threat could be a comparatively minor effect fading in importance compared to other causes of the gender disparity. In fact, some of the leading students of the gender gap in science also regard stereotype threat as of minor importance.[89]

All in all, the scholarly research shows that since there are still too many unknowns about stereotype threat, it would be unwise to cite this phenomenon to explain gender disparity in philosophy, and especially unwise to base policy recommendations on it.

Is There Too Little Feminism in Philosophy?

In her widely-cited article discussed above, “Changing the Ideology and Culture of Philosophy: Not by Reason (Alone),” Haslanger writes:

It is appalling to me that there is so little feminist work published in the [seven leading philosophical] journals examined, even in journals focused on ethics and political philosophy….Given the numbers of women philosophers working on feminism, this is striking. Jennifer Saul has told me that she sees a pronounced difference in the responses she gets from journals to her work in philosophy of language compared to her feminist work. Her papers in philosophy of language are always sent out to referees; her feminist submissions, however, are routinely sent back without having been considered by a reviewer. What is going on here?[90]

There are all kinds of things that might be going on here. But given what we know, and especially what we do not know, it seems definitely premature to be “appalled” at the situation. Contrary to what Haslanger writes, the journal data are not a sign “that something is wrong.”[91] Without further information, the journal data Haslanger cites are not a sign of anything.

Similarly, the fact that a philosopher has different success rates in publishing her work in different philosophical areas is not in itself evidence of bias against the area in which she is less successful. With so many unknowns, any such inference would be unjustified.

Nevertheless, it is again merely on the basis of Haslanger’s journal data that it is claimed in the document “Women in Philosophy” on the website of the School of Philosophy at the Australian National University that “the incredibly low percentage [2.36 percent] of feminist philosophy articles published in leading philosophy journals” suggests that “feminist questions are side-lined in the philosophy discipline.”[92]

Is it really true that feminist questions are sidelined in philosophy? If yes, one would expect that the words “feminism” and “feminist” would rarely appear in philosophy catalogues of leading publishers, or in major philosophical encyclopedias. But a review of the titles published by Oxford University Press in recent years reveals that more volumes appeared in the Oxford Studies in Feminist Philosophy series than in three other series combined that represent traditional central areas of philosophy, namely, Oxford Studies in Metaphysics, Oxford Studies in Epistemology, and Oxford Studies in the Philosophy of Science. Similarly, the following chart compares the frequency of appearance of different terms ending in “ism” in the Routledge Encyclopedia of Philosophy:[93]

Table 1: “Isms” in Routledge Encyclopedia of Philosophy, Frequency of Appearance

Term                        Number of Times Used
Feminism, feminist          1303
Empiricism, empiricist      1193
Materialism, materialist    1220
Idealism, idealist          1542
Rationalism, rationalist    972
Naturalism, naturalist      1185

These numbers indicate that, in terms of attention received in the philosophical community, feminism is faring quite well, especially keeping in mind that it is relatively new to the philosophical scene.

How often do these various “isms” appear in titles of articles listed in the Stanford Encyclopedia of Philosophy?[94] Here are the data:

Table 2: “Isms” Found in Article Titles, Stanford Encyclopedia of Philosophy

Term                        Number of Times Used
Feminism, feminist          36
Empiricism, empiricist      1
Materialism, materialist    1
Idealism, idealist          0
Rationalism, rationalist    1
Naturalism, naturalist      5

The number of SEP articles that contain “feminism” (or “feminist”) in the title is more than four times higher than the number of all articles combined whose titles mention empiricism, materialism, idealism, rationalism, or naturalism.
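The “more than four times” comparison follows directly from the title counts reported for the Stanford Encyclopedia of Philosophy. A short sketch of the arithmetic, with the counts copied from the text and the variable names chosen by us for illustration:

```python
# Number of SEP article titles containing each term, as reported in the text.
sep_title_counts = {
    "feminism": 36,
    "empiricism": 1,
    "materialism": 1,
    "idealism": 0,
    "rationalism": 1,
    "naturalism": 5,
}

feminism_titles = sep_title_counts["feminism"]
# Combined count for the five other "isms".
other_titles = sum(
    count for term, count in sep_title_counts.items() if term != "feminism"
)
ratio = feminism_titles / other_titles

if __name__ == "__main__":
    print(f"feminism: {feminism_titles}, other isms combined: {other_titles}")
    print(f"ratio: {ratio}")
```

The five other terms together account for eight titles, so the ratio is 36 to 8, or 4.5.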

These numbers suggest that, far from sidelining feminism, philosophers make extra efforts to dedicate an inordinate amount of space to feminism, and precisely in those all-important publications that are focused on presenting the state of the art in philosophy to its practitioners as well as to the wider public.

Furthermore, even top philosophy journals have occasionally relaxed their standards to make place for feminist articles. Here is one example, a feminist approach to the logic of negation:

Such an account of ~p specifies ~p in relation to p conceived as the controlling center, and so is p-centered...In the phallic drama of this p-centered account, there is really only one actor, p, and ~p is merely its receptacle. In the representation of the Venn diagram, p penetrates a passive, undifferentiated universal other which is specified as a lack, which offers no resistance, and whose behavior it controls completely.[95]

It seems that, from the perspective of feminist logic, p is guilty of behaving very badly toward ~p, controlling it, treating it as a mere receptacle, penetrating it in the phallic drama—all but ravishing it. Needless to say, these are risible claims that would, as a rule, immediately disqualify a paper from publication in other areas.

It is hard to reconcile these signs of a sympathetic disciplinary attitude toward feminism, which is manifested in many other ways, too, with the hypothesis of a massive sexist, antifeminist stance coming from basically the same people—only now in their role as journal editors and referees. Of course, the hypothesis might still be true, but to be taken seriously much more is needed than the bare assertion that the percentage of feminist articles on philosophy accepted is “incredibly low.”

There are many reasons why different areas of philosophy are represented in different proportions in the discipline’s top journals, the most common being that philosophers in the “under-published” subdisciplines simply have not managed to come up with ideas that would generate sufficient interest and excitement among other philosophers. As long as this kind of “internal” explanation is left unexplored and is not shown to be implausible, it seems inappropriate to attribute massive prejudice to our colleagues—especially if this charge is based solely on the claim that the actual share of journal space allotted to a certain area is much lower than an unspecified approximate quota to which it is supposedly entitled.

A quite radical proposal on this issue has been recently brought forward in a newsletter of the American Philosophical Association:

Why not a more proactive stance, one that ties adequacy of reviewing procedures to the value a journal places on feminist philosophy—which is one measure of the value it places on women.[96]

According to this proposal, then, any journal not publishing enough feminist philosophy would immediately fall under suspicion of having inadequate reviewing procedures and a sexist attitude.

But in fact a low opinion of feminism is not in itself a sign of sexism. And yet we can also read (again, in an American Philosophical Association newsletter):

If one is interested in eliminating sexism from the philosophical profession, one must take feminist philosophy seriously.[97]

By contraposition, this statement actually says that anyone who doesn’t take feminism seriously is not interested in eliminating sexism from philosophy. As long as “feminism” refers to a substantive philosophical claim (however interpreted) and does not degenerate into a theoretically uninteresting moral truism (e.g., “Don’t treat women unfairly!”), the statement above is manifestly false. You can strongly disagree with feminism (however construed), even to the point of thinking that it should not be taken seriously (as a philosophical claim), but your negative attitude toward feminism does not thereby entail that you condone sexism.

Directly linking antifeminism to sexism may give feminists a powerful rhetorical and political weapon against their opponents, but it will not earn them professional respect. It is also probably among the factors that contribute to fomenting a confrontational atmosphere in which feminists are not always able to take criticisms in stride.

To give just one example, in a recent interview Elizabeth Anderson, Arthur F. Thurnau Professor and John Dewey Distinguished University Professor of Philosophy and Women’s Studies at the University of Michigan and a leading contemporary feminist, had this to say about several very prominent philosophers who, in their own ways, have tried to explain in great detail why they strongly disagree with the feminist position:

[It is striking] how detached they are from academic norms of rational discourse….There is a level of obtuseness and hysteria here that is quite shocking. It’s like trying to engage people who claim that Obama is a Muslim jihadist terrorist.[98]

Among the critics of feminism who are characterized in this manner are Noretta Koertge and Susan Haack, scholars held in very high esteem in philosophy. Needless to say, the suggestion that their opposition to feminism is similar to “the claim that Obama is a Muslim jihadist terrorist” is ludicrous.

Conclusion

In a recent court case, Loretta Preska, Chief Judge of the U.S. District Court for the Southern District of New York, dismissed a complaint about sex discrimination by saying: “‘J’accuse!’ is not enough in court. Evidence is required.”[99] We should expect nothing less in philosophy.

We examined the main arguments for the claim that there is widespread discrimination against women in philosophy. We tried to show that these arguments, individually and collectively, fall far short of establishing that conclusion.