In 1988, Fritz Strack and colleagues published one of the most wonderful studies in psychology. They asked volunteers how funny they thought some cartoons were. While looking at the cartoons, some of the participants held a pen between their teeth without it touching their lips, while some others held a pen in their lips without allowing it to touch their teeth. (The participants believed they were testing out methods disabled people could use to write.)

If you try this in front of a mirror, you’ll see that when you hold a pen in your lips you look vaguely as though you’re frowning; when you hold it with your teeth you’re grinning.

The volunteers with a pen between their lips thought the cartoons were less funny than did a comparison group of volunteers holding a pen in their hands. Those with a pen between their teeth found them funniest of all. Simply contracting the muscles we use when we’re amused, or when we’re not, changes how we perceive things.

This is an odd result, but one consistent with other experiments that show we infer how we’re feeling and what we’re thinking in the way that others would: from what we say and do and from our expressions.

Because the result was such a surprise, because the experiment was so easy to explain, and because it was so much fun, it was widely publicised. It cropped up in books, TV series and newspaper articles.

Then came the replication project that laid waste to much of what we thought we had learned in social psychology. Famous result after famous result went up in smoke when other researchers tried to repeat it.

It was Strack’s turn in 2016. The smiling experiment was put to the test in 17 labs around the world. Around 2,000 volunteers held pens in their lips or between their teeth. Researchers took notes, videoed the subjects and sent their results back to HQ for analysis. It was a complete flop. There was no 0.8 difference on a 0-9 rating scale, as Strack had originally found. There wasn’t even a 0.1 difference.

The replication attempt stoked a blistering row among researchers. Some argued that social psychology was fundamentally rotten. Some argued that replication was a “career niche for bad experimenters” who lacked “whatever skills and talents are needed” to get the right result.

Others wondered whether the replication was quite perfect. Sure, the control and the manipulation were the same. But perhaps the cartoons had dated a bit; maybe our reaction to amusing pictures has altered in an age of internet memes. Was the original experiment so well known that it was impossible to find people who weren’t on some level aware of it? Strack himself wondered whether the video camera – placed in the room to increase the rigour of the replication – was affecting the result.

To test this, Israeli researchers re-ran the experiment, both with and without a video camera in place. And what do you know – Strack was right. When participants knew that they were being filmed, it seems they had some sensitivity to appearance. Perhaps their overwhelming feeling on holding a pen between their teeth was of foolishness rather than amusement, knowing that their inept attempts to write with a pen in their mouth were part of the academic record for all time. At any rate, they didn’t rate the cartoons as funnier.

But scrap the camera, and Strack’s original result was restored – even down to the size of the effect.

Social psychology is such a mesmerising field because it shows that relatively small manipulations have profound effects on our behaviour. It’s a fascinating subject to follow at the moment because many of these quirks and squiggles haven’t been found yet: social psychology is not a mature science.

Being a social psychologist is like being a sub-atomic physicist in the 20th century or an anthropologist in the 18th. It’s also a difficult science because its raison d’être is finding effects in our own behaviour that we don’t know about, and that we don’t believe affect what we do.

A consequence of the newness and difficulty of psychology is that scientists can find interesting things without knowing quite what they’re finding. Imagine an early chemist. He knows that heat generally speeds up reactions, so he bungs his chemicals in two test tubes. One he leaves at room temperature, the other he whacks on his Bunsen burner.

Astonishingly, he finds that the reaction occurs faster in the first tube, the one left at room temperature. He publishes to great acclaim and starts practising his Nobel acceptance speech. He’s then devastated that his colleagues don’t get the same result: they heat their test tube in a water bath and find that the reaction proceeds at the same speed as that at room temperature.

After much heartache and many academic recriminations, a third group look at the reaction again. They find that the reaction isn’t affected by heat, but is sped up by greater light intensity. The smoke from the Bunsen burner must have been dirtying the test tube and thereby slowing the reaction. What they’ve found is a light-sensitive reaction.

Something equivalent has just happened with the smiling experiment. Smiling does cause us to think we’re more amused, provided that we’re not being videoed. And there are probably a hundred other things that can increase the size of the effect, diminish it or possibly even reverse it, just waiting to be uncovered.

There’s no doubt that there’s fraud in social psychology, to a greater or lesser extent, as in any other competitive endeavour. There’s surely statistical malfeasance and misunderstanding. Certainly, there are many sloppily controlled experiments and write-ups that don’t contain important information. The replication project is essential: it’s hard enough to make sense of ourselves without trying to fit our theories to dodgy, non-replicable effects.

But it’s heartwarming that in this instance, a failed replication has ultimately increased our understanding of ourselves rather than forcing us to acknowledge that we know less about ourselves than we thought. And one thing’s for sure: Strack must be smiling at the moment.