In offering his colleague a cup of tea, Ronald Fisher was just being polite. He had no intention of kicking up a dispute—much less remaking modern science.

At the time, the early 1920s, Fisher worked at an agricultural research station north of London. A short, slight mathematician with rounded spectacles, he’d been hired to help scientists there design better experiments, but he wasn’t making much headway. The station’s four o’clock tea breaks were a nice distraction.

One afternoon Fisher fixed a cup for an algae biologist named Muriel Bristol. He knew she took milk with tea, so he poured some milk into a cup and added the tea to it.

That’s when the trouble started. Bristol refused the cup. “I won’t drink that,” she declared.

Fisher was taken aback. “Why?”

“Because you poured the milk into the cup first,” she said. She explained that she never drank tea unless the milk went in second.

The milk-first/tea-first debate has been a bone of contention in England ever since tea arrived there in the mid-1600s. It might sound like the ultimate petty butter battle, but each side has its partisans, who get boiling mad if someone makes a cup the “wrong” way. One newspaper in London declared not long ago, “If anything is going to kick off another civil war in the U.K., it is probably going to be this.”

As a man of science Fisher thought the debate was nonsense. Thermodynamically, mixing A with B was the same as mixing B with A, since the final temperature and relative proportions would be identical. “Surely,” Fisher reasoned with Bristol, “the order doesn’t matter.”

“It does,” she insisted. She even claimed she could taste the difference between tea brewed each way.

Fisher scoffed. “That’s impossible.”

Ronald Fisher in his youth, undated. Barr Smith Library, University of Adelaide

This might have gone on for some time if a third person, chemist William Roach, hadn’t piped up. Roach was actually in love with Bristol (he eventually married her) and no doubt wanted to defend her from Fisher. But as a scientist himself, Roach couldn’t just declare she was right. He’d need evidence. So he came up with a plan.

“Let’s run a test,” he said. “We’ll make some tea each way and see if she can taste which cup is which.”

Bristol declared she was game. Fisher was also enthusiastic. But given his background designing experiments he wanted the test to be precise. He proposed making eight cups of tea, four milk-first and four tea-first. They’d present them to Bristol in random order and let her guess.

Bristol agreed to this, so Roach and Fisher disappeared to make the tea. A few minutes later they returned, by which point a small audience had gathered to watch.

The order in which the cups were presented is lost to history. But no one would ever forget the outcome of the experiment. Bristol sipped the first cup and smacked her lips. Then she made her judgment. Perhaps she said, “Tea first.”

They handed her a second cup. She sipped again. “Milk first.”

This happened six more times. Tea first, milk first, milk first again. By the eighth cup Fisher was goggle-eyed behind his spectacles. Bristol had gotten every single one correct.

It turns out adding tea to milk is not the same as adding milk to tea, for chemical reasons. No one knew it at the time, but the fats and proteins in milk—which are hydrophobic, or water hating—can curl up and form little globules when milk mixes with water. In particular, when you pour milk into boiling hot tea, the first drops of milk that splash down get divided and isolated.

Surrounded by hot liquid, these isolated globules get scalded, and the whey proteins inside them, which unravel at around 160°F, change shape and acquire a burnt-caramel flavor. (Ultra-high-temperature pasteurized milk, which is common in Europe, tastes funny to many Americans for a similar reason.) In contrast, pouring tea into milk prevents the isolation of globules, which minimizes scalding and the production of off-flavors.

As for whether milk-first or tea-first tastes better, that depends on your palate. But Bristol’s perception was correct. The chemistry of whey dictates that each one tastes distinct.

Bristol’s triumph was a bit humiliating for Fisher, who had been proven wrong in the most public way possible. But the important part of the experiment is what happened next. Perhaps a little petulant, Fisher wondered whether Bristol had simply gotten lucky and guessed correctly all eight times. He worked out the math for this possibility: there are 70 ways to choose which four of eight cups are milk-first, and only one of them is entirely right. The odds of a perfect score by luck were therefore 1 in 70, so she probably could taste the difference.
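The 1-in-70 figure can be checked by brute force. The sketch below is only an illustration, not Fisher's own working: it assumes the taster knows that exactly four cups are milk-first, and it arbitrarily places the true milk-first cups in positions 0 through 3 (the real serving order is lost to history).

```python
from itertools import combinations
from math import comb

CUPS = 8
MILK_FIRST = 4

# Assumption for illustration: the true milk-first cups sit in positions 0-3.
truth = frozenset(range(MILK_FIRST))

# Every way a taster could label exactly four of the eight cups "milk first".
guesses = [frozenset(g) for g in combinations(range(CUPS), MILK_FIRST)]

total = len(guesses)
perfect = sum(1 for g in guesses if g == truth)

# The enumeration should agree with the binomial coefficient C(8, 4).
assert total == comb(CUPS, MILK_FIRST)

print(f"{perfect} perfect labeling out of {total}")
print(f"chance of a perfect score by luck: {perfect / total:.4f}")
```

Because every labeling is equally likely under pure guessing, the chance of a perfect score is simply one divided by the number of possible labelings.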

Muriel Bristol Roach, undated. Lawes Agricultural Trust

But even then, he couldn’t stop thinking about the experiment. What if she’d made a mistake at some point? What if she’d switched two cups around, incorrectly identifying a tea-first cup as a milk-first cup and vice versa? He reran the numbers and found that the odds of her doing at least that well by luck alone jumped from 1 in 70 to around 1 in 4. In other words, accurately identifying six of eight cups meant she could probably taste the difference, but he’d be much less confident in her ability, and he could quantify exactly how much less confident.
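The jump to roughly 1 in 4 follows from counting the labelings that are at least as good as one swap. A minimal sketch, assuming the taster always names exactly four cups as milk-first; the function name `chance_of_at_least` is made up for this illustration:

```python
from fractions import Fraction
from math import comb

def chance_of_at_least(hits, milk_first=4, tea_first=4):
    """Chance of naming at least `hits` milk-first cups correctly by pure
    guessing, when the taster picks exactly `milk_first` cups."""
    total = comb(milk_first + tea_first, milk_first)
    # To get exactly k right, choose k of the true milk-first cups and fill
    # the remaining picks with (wrongly chosen) tea-first cups.
    ways = sum(
        comb(milk_first, k) * comb(tea_first, milk_first - k)
        for k in range(hits, milk_first + 1)
    )
    return Fraction(ways, total)

perfect = chance_of_at_least(4)   # all eight cups right
one_swap = chance_of_at_least(3)  # six of eight right, or better

print(perfect)                    # 1/70
print(one_swap)                   # 17/70
print(f"{float(one_swap):.3f}")   # 0.243
```

One swapped pair means three of the four milk-first cups were named correctly. There are 16 labelings like that, plus the single perfect one, giving 17 out of 70.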

Furthermore, that lack of confidence told Fisher something: the sample size was too small. So he began running more numbers and found that 12 cups of tea, with 6 poured each way, would have been a better trial. An individual cup would carry less weight, so one data point wouldn’t skew things so much. Other variations of the experiment occurred to him as well (for example, using random numbers of tea-first and milk-first cups), and he explored these possibilities over the next few months.
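The same counting shows why more cups help. The sketch below compares 8 and 12 cups, and also a variant in which each cup's preparation is decided independently. That last reading of "random numbers" of each kind is an assumption made for illustration, as are the helper names `guess_chance` and `coin_flip_chance`.

```python
from fractions import Fraction
from math import comb

def guess_chance(cups_per_type, swaps_allowed=0):
    """Chance of making at most `swaps_allowed` swapped pairs by pure
    guessing, with equal numbers of milk-first and tea-first cups."""
    total = comb(2 * cups_per_type, cups_per_type)
    # s swapped pairs: pick s milk-first cups to miss and s tea-first
    # cups to wrongly include, hence C(n, s) squared.
    ways = sum(comb(cups_per_type, s) ** 2 for s in range(swaps_allowed + 1))
    return Fraction(ways, total)

def coin_flip_chance(cups):
    """Chance of a perfect score when each cup is independently milk-first
    or tea-first with equal probability (an assumed design)."""
    return Fraction(1, 2 ** cups)

for n in (4, 6):
    print(f"{2 * n} cups: perfect {guess_chance(n)}, "
          f"one swap or better {guess_chance(n, 1)}")

print(f"8 independent cups: perfect {coin_flip_chance(8)}")
```

With 12 cups, a single swap still leaves the chance of doing that well by luck at 37 in 924, about 4 percent, whereas with 8 cups the same slip pushes it to about 24 percent.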

Now this might all sound like a waste of time. After all, Fisher’s boss wasn’t paying him to dink around in the tearoom. But the more Fisher thought about it, the more the tea test seemed pertinent. In the early 1920s there was no standard way to conduct scientific experiments: controls were rare, and most scientists analyzed data crudely. Fisher had been hired to design better experiments, and he realized the tea test pointed the way. However frivolous it seemed, its simplicity clarified his thinking and allowed him to isolate the key points of good experimental design and good statistical analysis. He could then apply what he’d learned in this simple case to messy real-world examples—say, isolating the effects of fertilizer on crop production.

Fisher published the fruit of his research in two seminal books, Statistical Methods for Research Workers and The Design of Experiments. The latter introduced or popularized several fundamental ideas, including the null hypothesis and statistical significance, that scientists worldwide still use today. And the first example Fisher used in that book, to set the tone for everything that followed, was Muriel Bristol’s tea test.

His intellectual acumen, however, did not insulate Fisher from the prejudices of his time when it came to class, race, and colonialism. Fisher was a well-known eugenicist and was steadfast in those beliefs throughout his life. When, in the aftermath of World War II, UNESCO formed a coalition of scientists to wrestle with Nazi science and provide the scientific backbone for the universal condemnation of racism, Fisher was among those who officially objected to what he saw as the project’s “well-intentioned” but misguided mission, affirming his belief that groups differed “in their innate capacity for intellectual and emotional development.”

But such convictions have done little to tarnish Fisher’s legacy. He became a legend in biology for helping to unite the gene theory of Gregor Mendel with the evolutionary theory of Charles Darwin. But his biggest contribution to science remains his work on experimental design. The reforms he introduced are so ubiquitous that they’re all but invisible nowadays—the sign of a true revolution.

CORRECTION: This article was updated to correct the probability calculations.