A valid comparison of the magnitude of two correlations requires researchers to directly contrast the correlations using an appropriate statistical test. In many popular statistics packages, however, tests for the significance of the difference between correlations are missing. To close this gap, we introduce cocor, a free software package for the R programming language. The cocor package covers a broad range of tests, including comparisons of independent and dependent correlations with either overlapping or nonoverlapping variables. The package also includes an implementation of Zou’s confidence interval for all of these comparisons. The platform-independent cocor package enhances the R statistical computing environment and is available for scripting. Two graphical user interfaces, a plugin for RKWard and a web interface, make cocor a convenient and user-friendly tool.

Introduction

Determining the relationship between two variables is at the heart of many research endeavors. In the social sciences, the most popular statistical method for quantifying the magnitude of an association between two numeric variables is the Pearson product-moment correlation. It indicates the strength of a linear relationship between two variables, which may be positive, negative, or zero. In many research contexts, it is necessary to compare the magnitude of two such correlations, for example, if a researcher wants to know whether an association changed after a treatment, or whether it differs between two groups of interest. When comparing correlations, a test of significance is necessary to control for the possibility of an observed difference occurring simply by chance. However, many introductory statistics textbooks [1–5] do not even mention significance tests for correlations. In research practice, too, the necessity of conducting a proper statistical test when comparing the magnitude of correlations is often ignored. For example, in neuroscientific investigations, correlations between behavioral measures and brain areas are often determined to identify the brain area that is most strongly involved in a given task. Rousselet and Pernet [6] criticized such studies for rarely providing quantitative tests of the difference between correlations. Instead, many authors fall prey to a statistical fallacy and wrongly treat the coexistence of a significant and a nonsignificant correlation as sufficient evidence for a significant difference between the two. Nieuwenhuis, Forstmann, and Wagenmakers [7] also found that, when comparing correlations, researchers frequently interpreted a significant correlation in one condition and a nonsignificant correlation in another condition as evidence that the correlations differ between the two conditions. Such an interpretation, however, is fallacious. As pointed out by Rosnow and Rosenthal [8], “God loves the .06 nearly as much as the .05”. To make a valid, meaningful, and interpretable comparison between two correlations, it is necessary to directly contrast the two correlations under investigation using an appropriate statistical test [7].
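The fallacy can be made concrete with a small numerical sketch. The Python code below is not part of cocor and does not reproduce its interface; the function names and the example values (r = .40 and r = .25, each with n = 30) are hypothetical and chosen only for illustration. It assumes Fisher's z test, a standard test for two independent correlations, and, for simplicity, uses the same Fisher z approximation (rather than the usual t test) to test each single correlation against zero.

```python
import math


def normal_cdf(x: float) -> float:
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def p_single_correlation(r: float, n: int) -> float:
    """Two-sided p value for H0: rho = 0 (Fisher z approximation)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - normal_cdf(abs(z)))


def fisher_z_difference(r1: float, n1: int, r2: float, n2: int):
    """Fisher's z test of H0: rho1 = rho2 for two independent groups.

    Returns the z statistic and its two-sided p value.
    """
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (math.atanh(r1) - math.atanh(r2)) / standard_error
    return z, 2.0 * (1.0 - normal_cdf(abs(z)))


# Hypothetical example: group A shows r = .40, group B shows r = .25.
p_a = p_single_correlation(0.40, 30)   # tested against zero
p_b = p_single_correlation(0.25, 30)   # tested against zero
z_diff, p_diff = fisher_z_difference(0.40, 30, 0.25, 30)

print(f"group A: p = {p_a:.3f}")
print(f"group B: p = {p_b:.3f}")
print(f"difference: z = {z_diff:.2f}, p = {p_diff:.3f}")
```

With these values, the correlation in group A is significant at the .05 level and the correlation in group B is not, yet the direct test of the difference is far from significant. Only the last test addresses the question of whether the two correlations differ.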

Even when recognizing the importance of a formal statistical test of the difference between correlations, the researcher has many different significance tests to choose from, and the choice of the correct method is vital. Before picking a test, researchers have to distinguish between the following three cases: (1) The correlations were measured in two independent groups A and B ($\rho_{A} = \rho_{B}$). This case applies, for example, if a researcher wants to compare the correlation between anxiety and extraversion across the two groups. If the two correlations are dependent, that is, measured in the same group, the relationship between them needs further differentiation: (2) The two correlations can be overlapping ($\rho_{A12} = \rho_{A23}$), i.e., the correlations have one variable in common. Here, $\rho_{A12}$ and $\rho_{A23}$ refer to the population correlations in group A between variables 1 and 2 and between variables 2 and 3, respectively. For instance, a researcher may be interested in determining whether the correlation between anxiety and extraversion is smaller than that between anxiety and diligence within the same group A. (3) Two dependent correlations can also be nonoverlapping ($\rho_{A12} = \rho_{A34}$), i.e., they have no variable in common. This case applies, for example, if a researcher wants to determine whether the correlation between anxiety and extraversion is higher than the correlation between intelligence and creativity within the same group. A researcher also faces nonoverlapping dependent correlations when investigating whether the correlation between two variables is higher before than after a treatment provided to the same group.
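The distinction between the three cases can be sketched as a simple decision rule. The following Python function is a hypothetical illustration, not a function of the cocor package; the name classify_comparison, its arguments, and the variable labels are assumptions made for this example. It assumes that each correlation is described by a pair of variable labels and that the caller knows whether both correlations stem from the same group.

```python
def classify_comparison(pair_a, pair_b, same_group):
    """Return which of the three cases applies to two correlations.

    pair_a, pair_b: tuples naming the two variables of each correlation.
    same_group: True if both correlations were measured in the same group.
    """
    for pair in (pair_a, pair_b):
        if len(set(pair)) != 2:
            raise ValueError("each correlation needs two distinct variables")

    if not same_group:
        return "independent"                  # case (1)

    shared = set(pair_a) & set(pair_b)
    if len(shared) == 1:
        return "dependent, overlapping"       # case (2)
    if len(shared) == 0:
        return "dependent, nonoverlapping"    # case (3)
    raise ValueError("both correlations involve the same two variables")


# Case (2): anxiety is the variable both correlations have in common.
print(classify_comparison(("anxiety", "extraversion"),
                          ("anxiety", "diligence"), same_group=True))

# Case (3): pre- and post-treatment measurements count as different
# variables, so the two correlations share no variable.
print(classify_comparison(("anxiety_pre", "extraversion_pre"),
                          ("anxiety_post", "extraversion_post"),
                          same_group=True))
```

Treating the pre- and post-treatment measurements as four distinct variables is what makes the before-and-after comparison a nonoverlapping one.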

For each of these three cases, various tests have been proposed. An overview of the tests for comparing independent correlations is provided in Table 1, and of the tests for comparing dependent overlapping and dependent nonoverlapping correlations in Tables 2 and 3, respectively. May and Hittner [9] compared the statistical power and Type I error rate of several tests for dependent overlapping correlations, and found no test to be uniformly preferable. Instead, they concluded that the best choice is influenced by sample size, predictor intercorrelation, effect size, and predictor-criterion correlation. Because no clear recommendation for any of these tests can be formulated that applies under all circumstances, and because different methods may be optimal for a research question at hand, it is important that researchers are provided with a tool that allows them to choose freely between all available options. Detailed discussions of the competing tests for comparing dependent overlapping correlations are given in Dunn and Clark [10], Hittner, May, and Silver [11], May and Hittner [9], Neill and Dunn [12], and Steiger [13]. For the case of dependent nonoverlapping correlations, the pros and cons of various tests are discussed in Raghunathan, Rosenthal, and Rubin [14], Silver, Hittner, and May [15], and Steiger [13]. In contrast to most other approaches, Zou [16] has advocated a test that is based on the computation of confidence intervals, which are often regarded as superior to significance testing because they separately indicate the magnitude and the precision of an estimated effect [17, 18]. Confidence intervals can be used to test whether a correlation significantly differs from zero or from some constant, and whether the difference between two correlations exceeds a predefined threshold. Zou’s confidence interval [16] is available for comparisons of independent and dependent correlations with either overlapping or nonoverlapping variables. The tests proposed by Zou [16] have been compared to other confidence interval procedures by Wilcox [19].
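How a confidence interval serves as a test can be sketched for the simplest case, two independent correlations. The Python code below is an illustrative reimplementation and not the code of the cocor package; the function names and example values are hypothetical. It assumes the form of Zou's interval for independent correlations, in which the Fisher z-based confidence limits of each single correlation (l1, u1 and l2, u2) are combined into limits for the difference r1 - r2.

```python
import math
from statistics import NormalDist


def fisher_ci(r: float, n: int, conf_level: float = 0.95):
    """Confidence interval for one correlation via Fisher's z transform."""
    z_crit = NormalDist().inv_cdf(1.0 - (1.0 - conf_level) / 2.0)
    half_width = z_crit / math.sqrt(n - 3)
    z = math.atanh(r)
    return math.tanh(z - half_width), math.tanh(z + half_width)


def zou_ci_independent(r1, n1, r2, n2, conf_level=0.95):
    """Confidence interval for r1 - r2 in two independent groups.

    Assumed form (sketch of Zou's approach):
      lower = r1 - r2 - sqrt((r1 - l1)^2 + (u2 - r2)^2)
      upper = r1 - r2 + sqrt((u1 - r1)^2 + (r2 - l2)^2)
    """
    l1, u1 = fisher_ci(r1, n1, conf_level)
    l2, u2 = fisher_ci(r2, n2, conf_level)
    diff = r1 - r2
    lower = diff - math.sqrt((r1 - l1) ** 2 + (u2 - r2) ** 2)
    upper = diff + math.sqrt((u1 - r1) ** 2 + (r2 - l2) ** 2)
    return lower, upper


# Hypothetical example: r = .40 versus r = .25, each based on n = 30.
lower, upper = zou_ci_independent(0.40, 30, 0.25, 30)
print(f"95% CI for the difference: [{lower:.2f}, {upper:.2f}]")
print("difference is significant:", not (lower <= 0.0 <= upper))
```

If the interval contains zero, the null hypothesis of equal correlations is retained at the corresponding level. Replacing zero with another constant tests whether the difference exceeds a predefined threshold, and the width of the interval conveys the precision of the estimate.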