In the 1830s, Samuel Morton considered skull size crucial in establishing human origins (Image: Daniel Lai/Aurora Photos)

Stephen Jay Gould claimed unconscious bias could affect even seemingly objective scientific measurements. Not so

TRUTH is hard to come by, as personal lives and politics readily illustrate. Since science lays claim to providing some form of truth, it is bound to draw criticism on that count. Surprisingly, one of the sharpest attacks came from within, and from one of the giants, Harvard University’s Stephen Jay Gould.

Gould was a man of many parts – invertebrate palaeontologist, evolutionary theorist, historian of science, crusader against creationism and a prolific populariser of science with a slew of bestselling books. He was an iconic scientist of the late 20th century, a stature confirmed by that arbiter of cultural relevance, The Simpsons, in which he was a featured guest star in one episode.


Even so, Gould harboured grave doubts about the ability of science to remain free from social pressures and bias. He made a series of statements in a 1978 Science paper that are startling given his role as a spokesperson for science: “…unconscious or dimly perceived finagling, doctoring, and massaging are rampant, endemic, and unavoidable in a profession [science] that awards status and power for clean and unambiguous discovery”; “unconscious manipulation of data may be a scientific norm”; “scientists are human beings rooted in cultural contexts, not automatons directed toward external truth”. This was blasphemy from the pulpit.

To support these claims, Gould presented the case of Samuel George Morton, a 19th-century American physician and scientist famous for his measurements of human skulls, particularly their cranial capacity (the skeletal equivalent of brain size). Morton’s aim in measuring skulls from a diverse range of human groups was not to get at intelligence differences, as is sometimes claimed. Instead, Morton hoped to determine whether different human populations were one species or many, and thus whether the divine creation had been singular or a play in several acts.

Gould reanalysed Morton’s data and famously argued, in Science and in his prize-winning bestseller The Mismeasure of Man, that Morton had manipulated his samples, made analytical errors, and mismeasured cranial capacities as a consequence of a racist bias. Gould concluded that Morton’s fudging was most likely unconscious, since Morton made no apparent attempt to cover his tracks, publishing all his raw data. Gould’s study quickly became famous, and his view that science was inevitably biased became the consensus view in science studies. Gould’s claims about Morton were rarely challenged, and never effectively.

There matters stood until one of us, Jason E. Lewis, decided to do what Gould had not – remeasure the skulls in Morton’s collection, kept at the University of Pennsylvania, Philadelphia. Several colleagues advised and assisted us, particularly Marc R. Meyer, now at Chaffey College, Rancho Cucamonga, California. We remeasured 308 of the 670 skulls originally in Morton’s collection – some had been reburied. Find our data at bit.ly/pmZlLL.

Next, we scrutinised every claim Gould made, redid every analysis Gould did, and then did the same for Morton. If Gould was right and Morton had mismeasured some skulls because of racial bias, we would expect the mismeasured crania to be non-randomly distributed by population: Morton’s overestimates would be concentrated on the crania of “white” people, while his underestimates would be mostly “non-white”. But we found Morton had only overmeasured “non-white” crania. Statistically, his errors were generally distributed randomly by population, with the exception of a possible tendency to overmeasure Egyptian crania. Morton considered Egyptian skulls “Negro”, so these errors ran counter to the bias Gould attributed to him.

Overall, we found no evidence that Morton’s bias had affected his results. Gould, in contrast, made a number of clear errors, all in line with his own presumed bias against finding differences between populations. For example, Gould published an erroneously low Caucasian average cranial capacity, and an erroneously high Native American average, due to mistakes in how he used Morton’s data.

With our collaborators, we published our results in the June issue of PLoS Biology, where we were careful to point out that, far from refuting Gould’s well-known attacks on racism, we share his concerns. Gould’s opposition to the pseudo-science of racism was not based on his analysis of Morton. Furthermore, the generally small cranial capacity differences within humans do not correlate with intelligence or much else other than hat size. Even so, publishing that Gould has erred will inevitably bring joy to racists, much as the unmasking of the Piltdown man fossil as a hoax is a continuing source of glee to creationists, despite the abundant evidence falsifying both creationism and the pseudo-science of racism.

For us to abdicate our responsibility to the self-correcting nature of science for fear of what some may make of the errors would be a tragic mistake, because it would transform science into the kind of advocacy-driven operation some mistakenly accuse it of being.

Some commentators have observed that Gould may have proven his own point about bias being inevitable in science, but with the bias merely being his own rather than Morton’s. We note this possibility in our paper, but there is an important difference between what Gould was claiming of Morton and what Gould seems to have done himself.

With Morton, Gould claimed unconscious bias could impact even the most seemingly objective and straightforward part of science: basic measurements and averages. In contrast, Gould did not generate any measurements. Instead he manipulated Morton’s data and summarised his work. That bias can influence the sort of secondary work Gould did is a different and much less surprising claim than the claim that unconscious bias influences actual measurements. Further, Gould’s claim was about unconscious bias, rather than deliberate misconduct, the latter being a known hazard in science, though of uncertain frequency. So into which category do Gould’s errors fall? We find it hard to see how to test hypotheses about his motivations, especially since those possible motivations are far from binary: there is a continuum between unconscious error and deliberate fraud. Ultimately, we consider the question of whether Gould’s errors were deliberate, unconscious, or somewhere in between to be open to debate.

Morton, though, had clear racial biases, which are stated explicitly in his books. And Gould was certainly right that all scientists, as humans, have some sort of bias. But while biased scientists are inevitable, biased results are not, as illustrated by Morton (biased) and his data (unbiased, as far as we can tell). Science does not depend on unbiased investigators but on methods which limit the ability of the investigator’s bias to influence the results. Morton’s methods appear to have been sufficient in this regard.

Along with our reanalysis of the Morton case, we would counter Gould’s claim of inevitably biased results with this thought experiment. A geneticist in his 20s with a family history of Huntington’s disease – a fatal condition with a simple genetic basis and symptoms which typically manifest in middle age – decides to test himself for the disease gene. A greater bias regarding the outcome is unimaginable: it is literally life and death. Yet the method is so robust that even such an extreme bias cannot, short of fraud, influence the results.

Truth is hard, but it is sometimes obtainable despite even our strongest biases. What a marvellous thing, this science.