A decade ago a paper titled ‘God is Watching You’ by Shariff & Norenzayan (2007) received a lot of attention after it reported an experimental study showing that people subconsciously primed with religious concepts, regardless of their stated level of religiosity, behaved more generously. (The study reported a similar increase when secular ‘moral’ institutions were primed, but this result received a lot less attention.) The methodology of the study involved participants rearranging scrambled sentences and then playing a one-shot ‘dictator’ game in which they unilaterally decide how to split a payment between themselves and another anonymous participant. In the religious condition the unscrambled sentences formed statements like “she felt the spirit” and made references to concepts like God, divine and prophet, while in the secular condition references were made to concepts like police, justice and court. The neutral condition had the same task but no particular theme in the unscrambled sentences. The outcome was that in both the religious and secular conditions people chose to allocate more money to the other participant than they did in the neutral condition.

This relatively simple study received widespread media coverage and became something of a paradigmatic study for modern psychology, with the original paper now having over 800 citations. A decade on, and despite its undeniable influence, the paper and its findings are the subject of considerable contention and debate. The reevaluations of the paper are related to a wider trend in psychology wherein priming studies are receiving renewed scrutiny in the wake of the ‘replication crisis’, an ongoing critical re-examination of psychology prompted by failures to replicate ‘established’ findings.

That Shariff & Norenzayan’s paper would be selected by independent researchers for a pre-registered replication attempt is thus not that surprising. The results of the replication, published by Gomes and McCullough in 2015, were negative, finding no evidence for increased generosity following religious priming. What makes these results potentially more compelling than the original paper is that they rest on some important methodological improvements. First, the replication used a much larger sample than the original study, N=650 vs N=100, which made it better able to determine whether any differences between groups were genuine or due to random variance. Second, it was pre-registered, meaning that the hypotheses, methods and analysis were recorded publicly in advance. Pre-registration is designed both to make research more transparent and to stop researchers from fiddling around with their data post hoc (unconsciously or consciously) to achieve the results they desire. Third, it used more appropriate statistical analysis methods, the details of which are not important here but relate to the type of distribution found in the responses. Yet, despite such improvements and the contrasting finding, Gomes & McCullough’s paper has so far only been cited 22 times. That’s actually good for a two-year-old article, but when compared with the 824 citations of the original paper it illustrates how the current academic publication model disincentivises careful replication efforts.
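
To make the sample size point a little more concrete, here is a rough sketch of how statistical power changes between a total sample of 100 and one of 650. It is not the analysis used in either paper: it assumes a simple two-group comparison with equal group sizes, a normal approximation, and a purely hypothetical standardized effect size of 0.3 that I picked for illustration. The function name `approx_power` is my own.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group: int, effect_size_d: float, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-group comparison.

    Uses a normal approximation with equal group sizes. This is a
    simplification for illustration, not the analysis from either study.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected z statistic if the true standardized difference is effect_size_d
    shift = effect_size_d * sqrt(n_per_group / 2)
    # Probability of landing beyond either critical value
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    hypothetical_d = 0.3  # assumed effect size, chosen only for illustration
    for total_n in (100, 650):
        power = approx_power(total_n // 2, hypothetical_d)
        print(f"total N = {total_n}: approximate power = {power:.2f}")
```

Under these assumptions the smaller sample would miss a real effect of that size more often than it would detect it, while the larger sample would detect it almost every time. That is the sense in which a null result from the bigger study is more informative.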

Shariff & Norenzayan responded by suggesting that the failed replication might have been due to a methodological issue, in this case an unusually high level of generosity in Gomes & McCullough’s control condition (see the table below). The original authors of any failed replication will often point to these kinds of methodological issues as the explanation for the result, which is understandable, but such accounts often seem a convenient way to dismiss inconvenient results. The table below, provided by Shariff & Norenzayan, does at first glance seem to indicate that Gomes & McCullough’s result is an outlier from the existing literature (the dark bar indicating generosity in the control group is much higher), but note that two of the graphs are derived from the original paper and two are unpublished. That means that only one other independent published study is referenced to provide the baseline. Additionally, the graph doesn’t take into consideration the discrepancy in sample sizes or the fact that Gomes & McCullough’s study is the only one to be pre-registered. This is not to argue that all the other data should be ignored, just that the results from a pre-registered study with a much larger sample should be accorded stronger credence than those from non-pre-registered studies that used smaller samples.

To help resolve the issue, Shariff, Norenzayan and some other collaborators conducted a thorough meta-analysis of 93 studies that employed a religious priming methodology, covering results obtained from a total of 11,653 participants. Using an array of meta-analytical techniques, including those that attempt to detect and adjust for publication bias, they found overall support for the claim that “religious priming has robust effects on a variety of outcome measures” but added the important caveat that religious priming does not “reliably affect non-religious participants”. This paper is well argued and thorough, and for many it might seem to have quite conclusively resolved the dispute. However, the same year another paper, led by Michiel van Elk, re-analysed the religious priming data, including some studies that were excluded by Shariff et al. (2015), and used two alternative meta-analytical techniques that the authors argued were more appropriate. The results were mixed: one of the methods indicated that the effects were likely “driven by publication bias” and the other found evidence for a real effect. Van Elk and his colleagues (2015) thus contended that there was no “conclusive resolution” that could be derived from meta-analyses of existing studies and that “a large scale, preregistered replication project” was needed to resolve the debate.
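
For anyone wondering how publication bias could produce an apparent effect in a meta-analysis at all, here is a small toy simulation. It is emphatically not one of the techniques used by Shariff et al. or van Elk et al.: it simulates made-up studies of a true effect of exactly zero, pools them with a basic fixed-effect inverse-variance average, and then pools again after keeping only the studies that came out significantly positive. The study counts, group sizes and the extreme "only significant positive results get published" filter are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist


def simulate_studies(n_studies, n_per_group, true_d, seed=0):
    """Return (observed effect, standard error) pairs for simulated studies."""
    rng = random.Random(seed)
    se = (2 / n_per_group) ** 0.5  # rough standard error of a standardized difference
    return [(rng.gauss(true_d, se), se) for _ in range(n_studies)]


def pooled_estimate(studies):
    """Fixed-effect pooled estimate: inverse-variance weighted mean."""
    weights = [1 / se ** 2 for _, se in studies]
    weighted_sum = sum(w * d for w, (d, _) in zip(weights, studies))
    return weighted_sum / sum(weights)


def published_only(studies, alpha=0.05):
    """Caricature of publication bias: keep only significantly positive results."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return [(d, se) for d, se in studies if d / se > z_crit]


if __name__ == "__main__":
    # Hypothetical literature: 1000 small studies of an effect that does not exist
    all_studies = simulate_studies(n_studies=1000, n_per_group=30, true_d=0.0)
    published = published_only(all_studies)
    print(f"all studies:    pooled d = {pooled_estimate(all_studies):.3f}")
    print(f"published only: pooled d = {pooled_estimate(published):.3f} "
          f"({len(published)} studies)")
```

The pooled estimate from the full set sits close to zero, while the "published" subset shows a sizeable positive effect that is entirely an artefact of the filter. Real literatures are nowhere near this extreme, which is exactly why the correction methods disagree about how much of an observed effect to attribute to bias.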

If all of the above sounds messy, that’s because it is. Science is messy, and progress on topics is made only incrementally and often involves heated debates. This is not the image that the media give. Instead they treat each individual study as if it were definitive, but this is rarely, if ever, the case, even when it comes to large meta-analyses. In the case of the religious priming literature it is also nice to see a commendable amount of civility despite the disagreements. Van Elk & his collaborators, for instance, were critical of the meta-analysis of Shariff et al. (2015) but commended the authors for conducting it and for taking issues of replicability seriously. Similarly, while Shariff & Norenzayan were sceptical of Gomes & McCullough’s findings, they praised them for undertaking the replication and collecting a larger sample. Further, all sides broadly agree that more evidence is needed and that pre-registered multi-site replication efforts are part of that. I initially started to write this post to cover one such preregistered paper that was conducted recently with a Japanese sample, but I’ll get to that in the next post.