The replication crisis has sharpened the focus on best-practice techniques for improving the reliability of scientific findings. Yet many researchers still overlook or misunderstand how dramatically predictions involving interactions affect the calculation of statistical power. Using recent papers published in Personality and Social Psychology Bulletin (PSPB), we illustrate the pitfalls of improper power estimation in studies where attenuated interactions are predicted. Our investigation shows why even a programmatic series of six studies employing 2 × 2 designs, with samples exceeding N = 500, can be woefully underpowered to detect genuine effects. We also highlight the importance of accounting for error-prone measures when estimating effect sizes and calculating power, and we explain why even positive results can mislead when power is low. We then provide five guidelines to help researchers avoid these pitfalls, including a caution against the heuristic that a series of underpowered studies approximates the credibility of one well-powered study.
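To make the power argument concrete, the following is a minimal sketch (not the authors' own procedure) of how power for an attenuated interaction in a balanced 2 × 2 between-subjects design compares with power for the simple effect it qualifies. It assumes a unit within-cell standard deviation, equal cell sizes, and a normal approximation to the test statistic. The function names and the example values (d = 0.5, 50% attenuation, 64 participants per cell) are illustrative assumptions, not figures taken from the article.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the z approximation


def two_sided_power(effect, std_error, alpha=0.05):
    """Approximate power of a two-sided z test for a given effect and SE."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    ncp = effect / std_error  # expected value of the test statistic
    return _STD_NORMAL.cdf(ncp - z_crit) + _STD_NORMAL.cdf(-ncp - z_crit)


def simple_effect_power(d, n_per_cell, alpha=0.05):
    """Power to detect a standardized difference d between two cells.

    With unit SD, a difference of two cell means has SE = sqrt(2 / n).
    """
    return two_sided_power(d, sqrt(2 / n_per_cell), alpha)


def interaction_power(d, attenuation, n_per_cell, alpha=0.05):
    """Power to detect a 2 x 2 interaction in which the simple effect d
    shrinks by the proportion `attenuation` in the second condition
    (assumed parameterization: 1.0 = effect eliminated, 0.5 = halved).

    The interaction contrast is d - (1 - attenuation) * d = attenuation * d.
    It combines four cell means, so with unit SD its SE is sqrt(4 / n).
    """
    contrast = attenuation * d
    return two_sided_power(contrast, sqrt(4 / n_per_cell), alpha)


if __name__ == "__main__":
    d, n = 0.5, 64  # hypothetical effect size and cell size
    print(f"simple effect, n={n} per cell: {simple_effect_power(d, n):.3f}")
    print(f"50% attenuation, n={n} per cell: {interaction_power(d, 0.5, n):.3f}")
    print(f"50% attenuation, n=512 per cell: {interaction_power(d, 0.5, 512):.3f}")
```

Under these assumptions, a cell size that gives roughly 80% power for the simple effect gives well under 20% power for the half-attenuated interaction. The interaction reaches the same power only at 512 participants per cell, which is eight times the cell size across twice as many cells.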
