Educators view critical thinking as an essential skill, yet it remains unclear how effectively it is being taught in college. This meta-analysis synthesizes research on gains in critical thinking skills and attitudinal dispositions over various time frames in college. The results suggest that both critical thinking skills and dispositions improve substantially over a normal college experience. Furthermore, analysis of curriculum-wide efforts to improve critical thinking indicates that they do not necessarily produce incremental long-term gains. We discuss implications for the future of critical thinking in education.

Educators, policymakers, and employers have demonstrated a sustained interest in teaching critical thinking, as both an important life skill and an asset to the future workforce (Koenig et al., 2011). This interest is particularly evident in college, where critical thinking has gained traction as a crucial component of general education (Arum & Roksa, 2011; Halpern, 2001). A recent study reported that faculty endorsed teaching critical thinking as the most important goal of undergraduate education, with over 99% describing it as “very important” or “essential” (DeAngelo et al., 2009, p. 3). Critical thinking is also viewed as an important component of many medium- and high-complexity jobs (Peterson et al., 1997). However, despite the value placed on teaching critical thinking, the actual effectiveness of college at doing so—through either explicit instruction or general exposure—remains a disputed point.

Numerous interventions have been tested to increase both critical thinking skills and the general disposition toward critical thinking. McMillan (1987) reviewed the literature and concluded that specific interventions to foster critical thinking were not well supported, but college attendance itself did appear to have a bolstering effect. Subsequent research has suggested that some college experiences can affect gains in critical thinking. Terenzini, Springer, Pascarella, and Nora (1995) reported that both out-of-class experiences, such as number of unassigned books read, and in-class variables, such as type of courses taken, explained significant variance in critical thinking ability after a year of college. In a meta-analysis, Abrami et al. (2008) also found a modest but positive effect for critical thinking interventions. In addition, the researchers found a large amount of variability in effect sizes, much of which was attributable to the type of intervention and the degree of implementation. This finding suggests that larger gains on critical thinking might be achieved by focusing on the most effective types of intervention, should current gains be deemed inadequate. Finally, in an unpublished meta-analysis, Ortiz (2007) found larger mean effect sizes in samples where a critical thinking intervention was employed.

Even without explicit attempts to foster critical thinking, there is certainly a widespread perception that college breeds critical thinkers. Tsui (1998) reported that 92% of students in a large multi-institution study believed they had made some gains in critical thinking, and 39.3% thought their critical thinking had grown much stronger. Only 8.9% believed it had not changed or had grown weaker. However, it is not clear whether students share a common definition of critical thinking and whether they are capable of an accurate self-assessment. The actual effects of college on global critical thinking remain unclear. First, there is disagreement about the magnitude of critical thinking gains over the course of college. Second, there is disagreement about whether the concept of global critical thinking makes sense in the first place. This study synthesizes effect sizes to estimate the magnitude of gains on general critical thinking measures and provides a theoretical basis for interpreting these results.

Changes in Critical Thinking Across Majors

Some have suggested that certain majors may produce larger gains in critical thinking than others. If this were the case, an analysis of majors that produced the strongest gains would be useful. First, it would suggest the magnitude of gains that could be achieved with the correct curriculum (although it may be possible to improve upon even the best current curriculum). Second, understanding the features that distinguish gain-producing majors would inform future attempts to improve critical thinking in other majors. The current evidence on differences between majors is inconclusive. Pascarella and Terenzini’s (2005) review failed to find strong evidence for differential gains across majors. By contrast, Ortiz’s (2007) meta-analysis suggests that philosophy students may learn more critical thinking than other students. Ortiz estimated gains of 0.26 SDs per semester for philosophy students compared to only 0.12 SDs per semester for other majors. However, the number of pure philosophy samples in her analysis is small (k = 6), and the samples appear to be from unpublished studies. In addition, the confidence intervals for the philosophy and nonphilosophy effect sizes show substantial overlap. Ortiz herself suggested that the observed difference may simply be statistical noise.

Another likely exemplar of critical thinking instruction is nursing programs. The National League for Nursing (NLN) requires that nursing programs include formal critical thinking training (Adams, Whitlow, Stover, & Johnson, 1996). This requirement makes nursing students a useful comparison group for estimating the long-term effects of sustained instruction in critical thinking. Measuring this long-term change has different implications than measuring the effects of short-term interventions (e.g., adding a critical thinking component to a single course).
For example, short-term critical thinking instruction may give students an initial advantage that does not ultimately persist after the posttest, which could occur either because the benefits are temporary or because other students eventually catch up. Such a result would be analogous to recent findings from the Head Start impact study. Puma et al. (2012) found that the Head Start program yielded substantial advantages in preschool children, but most of the benefits dissipated by the third grade. Similarly, it is possible that students in a critical thinking–rich curriculum may enjoy an initial head start that is negated in the long run. Although this possibility would ideally be tested by a true experimental design, such studies are rare (if not nonexistent) in this literature due to the difficulty of enacting an institution-wide curricular experiment. However, a comparison between nursing students and students in other majors is a feasible proxy for such a study. With their strong emphasis on critical thinking, nursing programs can be seen as representative of a college education in which explicit critical thinking instruction is the norm rather than the subject of an occasional intervention. We would therefore expect greater gains on critical thinking in nursing programs to the extent that formal training is incrementally effective. Currently, the literature is lacking any comprehensive comparison of such programs to a more traditional college experience.

Definition and Measurement Issues

Another difficulty in the critical thinking literature is defining the construct. The traditional generalist view conceptualizes critical thinking as a broad ability to interpret information and approach problems correctly that can be applied across a wide variety of domains (e.g., McMillan, 1987; Pascarella, 1989). For example, Abrami et al. (2008) defined critical thinking as “the ability to engage in purposeful, self-regulatory judgment” (p. 1102). Researchers have distinguished between critical thinking skills and dispositions, suggesting a meaningful distinction between the ability to think critically and willingness to actually do so. Most measures focus on the skill aspect, with the California Critical Thinking Disposition Inventory (CCTDI) being the main exception (N. C. Facione, Facione, & Sanchez, 1994). Some scholarship has questioned traditional conceptualizations of critical thinking as a broad domain-general skill. McPeck (1984, 1990) argued that critical thinking has been operationally reduced to the ability to analyze arguments. According to this perspective, the ability to reason and think critically is required for a broad range of tasks beyond analysis of logical arguments, such as “finding one’s way home, investing money, fishing, driving a car, doing sums, shopping, playing hopscotch, intelligent voting, building math models, writing poems, and countless other classes of activities” (McPeck, 1984, p. 30). McPeck argued that the ability to think critically about such a broad array of domains is not well represented by any general skill (e.g., analyzing arguments), and therefore critical thinking ability is best conceptualized as domain-specific. Kuncel (2011) argued that when people describe critical thinking skills, they refer to one of two very different things. The first is field- or job-specific expertise, which will form only with practice and experience in a field and is not widely generalizable.
For example, thinking critically about medical diagnosis in veterinary medicine is hard-earned expertise that does not readily transfer to restaurant management. Kuncel suggested that critical thinking in college should emphasize field-specific expertise at least as much as the second type of critical thinking. The second type, as currently measured by many critical thinking scales, is “a finite set of very specific reasoning skills (e.g., gambler’s fallacy, law of large numbers, correlation vs. causation)” (Kuncel, 2011, p. 2). Although these skills are viewed as useful, it is argued that knowing about the law of large numbers is useful for reasoning about the law of large numbers and nothing else. This narrow definition also calls into question the degree to which critical thinking can be taught in a broad way that will transfer to improved performance across all work, school, or life tasks. In his meta-analytic review, Kuncel found little discriminant validity evidence for commonly used critical thinking tests and other cognitive ability assessments. In addition, the evidence suggested that these tests (and gains on them) are unlikely to predict grades or job performance more effectively than common measures of IQ or general cognitive ability (although they may be useful for specific tasks). Despite questions about the scope of critical thinking skills, most researchers have argued that critical thinking tests do measure useful traits (e.g., P. A. Facione, 2011). Even if critical thinking skills are domain-specific, the specific reasoning skills measured by commonly used tests are likely to produce more informed consumers of information (Kuncel, 2011). In addition, the attitudinal disposition toward critical thinking is more likely to apply across domains than specific critical thinking skills. 
If college can promote general skepticism toward questionable claims and ideas, especially ones that mesh with one’s own worldview, it has surely performed a valuable function. Although individuals lack the specific knowledge needed to critically analyze every domain, a disposition toward critical thinking should at least encourage acquisition of additional knowledge and reservation of judgment. Although there is little disagreement that critical thinking is important, teaching it takes time away from teaching other important skills, such as reading and mathematics. Given these trade-offs, it is important to understand the present state of critical thinking in college and what can reasonably be done if it is inadequate. To do this, we must determine whether a normal college education is even effective at teaching critical thinking. In addition, we must estimate the incremental gains over longer periods of time when more resources are devoted to critical thinking instruction. This meta-analytic study will establish average gains on tests of general critical thinking during college, both with and without formal training, and reconcile conflicting views in this domain.

Results

We used mixed-effects meta-analysis to model the effects of four moderator variables: time frame (0.5–4 years), study design (cross-sectional = 0, longitudinal = 1), sample (nonnursing = 0, nursing = 1), and year of publication (1963–2011). For ease of interpretation and to allow the model to converge, we recentered the publication year variable by subtracting the earliest year in our sample from all values. Conceptually, this means that a value of zero for publication year corresponds to the year 1963. To test for nonlinear gains in critical thinking over time (e.g., larger or smaller gains during the entirety of college than would be expected from rescaling effect sizes from a single semester), we also included a quadratic term for time frame. Initial analyses indicated that the nursing/nonnursing moderator did not reach statistical significance. Additionally, a reduced model without this moderator produced lower values of the Akaike information criterion and Bayesian information criterion, indicating better fit. Therefore, we present results for the reduced model in the interest of simplicity.

Results for changes in critical thinking skills across moderators are presented in Table 5. We found evidence for significant moderator effects, Q(4) = 9325.1340, p < .0001, as well as significant residual heterogeneity, Q(105) = 716.1532, p < .0001. As expected, gains were larger across longer time frames. However, only the quadratic effect of time frame reached statistical significance. This finding suggests that gains in critical thinking skills during college are nonlinear, with the rate of change increasing over larger time intervals. An important caveat when interpreting this finding is that the time frames in many studies could not be attributed to specific years of college. For example, studies of gains during semester-long courses combine data from freshmen, sophomores, juniors, and seniors enrolled in those particular courses.
As a result, the significant quadratic term is not a clear indicator that critical thinking gains accelerate in the later years of college.

Table 5. Meta-analysis of critical thinking skill studies.

To follow up on this possibility, we analyzed a subset of 76 effect sizes for which we could code a specific starting year (e.g., freshman = 1). In this subset, we compared three models: the original model, the original model with starting year added as an additional moderator, and the latter model with the quadratic time frame term removed. The purpose of removing the quadratic term was to determine whether it was accounting for variance in effect sizes that would otherwise be attributed to starting year. Coefficients from all three models are presented in Table 6. The effect of starting year did not reach significance in either Model 2 or Model 3, suggesting that there may not be differential gains across different years of college.

Table 6. Meta-analysis of critical thinking skill studies including starting year in school.

We also analyzed differences between longitudinal and cross-sectional studies in our full sample. Our results suggest that study design has a substantial influence on effect size estimates. Controlling for other moderators, using a longitudinal design as opposed to a cross-sectional design was associated with a reduction of 0.27 SDs in estimated gains on critical thinking skills. To better represent the joint effects of time frame and study design, we calculated model-predicted values for different levels of each moderator with publication year set to the sample mean (approximately 1994). The results in Table 7 show substantially larger effects for cross-sectional studies and longer time frames. We also predicted effect sizes for a hypothetical mixed sample with an equal number of cross-sectional and longitudinal studies.
This analysis produced a 4-year gains estimate of 0.59 SDs, as opposed to 0.73 for cross-sectional studies and 0.46 for longitudinal designs.

Table 7. Model-predicted values for time frame and study design moderators with changes in critical thinking skills as the outcome.

The compilation of studies spanning 48 years also allowed us to examine changes in critical thinking gains over time. To test the suggestion that college has become less effective at teaching critical thinking, we included publication year as a moderator. Our moderator analysis provides some support for this notion. Holding other moderators constant, more recent studies provided significantly smaller effect sizes than older studies. Given an equal mix of cross-sectional and longitudinal studies, the predicted 4-year gain is 1.22 SDs for a study published in 1963 (80% credibility interval [0.75, 1.68]), whereas the predicted gain is only 0.33 for a study published in 2011 (80% credibility interval [−0.11, 0.78]).

Our search revealed a relatively small sample of studies measuring the disposition toward thinking critically, all of which used the CCTDI. Given the small meta-analytic sample, we tested simpler moderator analyses excluding sample type and publication year. Using the Akaike information criterion and Bayesian information criterion as criteria, a simple model with only the linear effect of time frame produced the best fit. This model is presented in Table 8. As expected, longer time frames were associated with larger effect sizes, Q(1) = 22.8770, p < .0001. However, significant residual heterogeneity remained after accounting for this effect, Q(12) = 52.1465, p < .0001. Predicted values for different time frames are shown in Table 9. We estimate an average gain of 0.55 SDs on the disposition toward critical thinking over 4 years of college.
Table 8. Meta-analysis of critical thinking disposition studies.

Table 9. Model-predicted values for different time frames with changes in critical thinking dispositions as the outcome.

As previously mentioned, we imputed SDs from other samples for several studies in which no SDs were presented. It is possible that our specific decisions about the sources of imputed values could have affected the overall results of this study. To address this concern, we performed a sensitivity analysis using alternate imputed values. Specifically, we computed new imputed values using N-weighted averages of pretest SDs from the meta-analytic sample. This resulted in three new imputed values for the Watson-Glaser, California Critical Thinking Skills Test (CCTST), and CCTDI. For the studies in Table 2, we replaced our original imputed values with these new values. We then reran the mixed-effects meta-analyses for critical thinking skills and dispositions using the new effect sizes and sampling variances. The new values did not affect our model selection decisions and had trivial effects on the magnitudes of moderator coefficients (see Tables A1 and A2 in Appendix A for these results). Thus, our overall conclusions appear to be robust to changes in imputed SDs.
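The sensitivity check just described can be sketched in a few lines. This is an illustrative reconstruction, not the study's actual code; the instrument, sample sizes, and score values below are hypothetical stand-ins.

```python
# Sketch of the imputation sensitivity check: replace a missing pretest SD
# with the N-weighted average of pretest SDs from other samples that used
# the same instrument, then recompute the standardized gain (d).
# All numeric inputs below are hypothetical illustrations.

def n_weighted_sd(sds, ns):
    """N-weighted average of sample SDs (samples with more students count more)."""
    return sum(sd * n for sd, n in zip(sds, ns)) / sum(ns)

def standardized_gain(pre_mean, post_mean, sd):
    """Pre-post gain expressed in pretest SD units."""
    return (post_mean - pre_mean) / sd

# Hypothetical pretest SDs and sample sizes from three other samples
sds = [8.2, 7.5, 9.0]
ns = [120, 80, 200]

imputed_sd = n_weighted_sd(sds, ns)               # 8.46
d = standardized_gain(pre_mean=50.0, post_mean=54.0, sd=imputed_sd)
```

Weighting by N gives larger samples, whose SD estimates are more stable, proportionally more influence on the imputed value.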

Discussion

Our study suggests that students make substantial gains in critical thinking during college. We estimate the overall effect of college on critical thinking skill at 0.59 SDs, which paints a slightly more optimistic picture than recent estimates by Arum and Roksa (2011) and Pascarella and Terenzini (2005). However, our overall findings are fairly consistent with these studies. Arum and Roksa estimate a gain of 0.18 SDs over three semesters, which falls well within our 80% credibility intervals for both 1-year and 2-year effect sizes. Their estimate of 0.47 SDs over 4 years is nearly identical to our point estimate of 0.46 for 4-year longitudinal studies. Similarly, Pascarella and Terenzini’s overall estimate of 0.50 SDs from a combination of longitudinal and cross-sectional studies is reasonably similar to our mixed designs estimate of 0.59 (especially considering that our effect size estimates are corrected for unreliability whereas theirs are not). It is worth noting that a 0.50 SD gain for a person who starts at the 50th percentile would lift him or her to the 69th percentile, no small improvement in our minds.

A major contribution of the present study is that we analyze time frame as a moderator rather than aggregating all time frames into a single estimate, providing more information about nonlinear patterns in the growth rate. Our quadratic analysis of the time frame moderator suggests that the rate of gains in critical thinking skills increases across larger time frames. This finding suggests that it may be inappropriate to rescale and collapse effect sizes from different time frames (e.g., Ortiz, 2007). However, follow-up analyses failed to link this effect to specific years in college. Thus, we did not find unequivocal support for an acceleration effect where critical thinking gains become more rapid in the later years of college.
On the other hand, our results also do not support Arum and Roksa’s (2011) suggestion that critical thinking may increase more in the early stages of college. Another benefit of our study is that it includes cross-sectional designs, which were excluded or grouped with longitudinal studies in other reviews. Considering these designs is important because they make up a large part of the literature. However, treating them as equivalent to longitudinal designs may lead to erroneous conclusions; we found that cross-sectional studies produced substantially larger effect sizes than longitudinal studies. This finding suggests that critical thinking researchers should carefully consider the effects of their study design on the final results. Cross-sectional studies may be confounded if students who score higher on critical thinking tests are more likely to remain in college. The result would be inflated effect size estimates, as low-performing students contribute only to the pretest mean before dropping out. Since critical thinking is related to college performance (Kuncel, 2011), this confound is a possibility. On the other hand, longitudinal studies suffer from a similar self-selection problem. Longitudinal samples are typically restricted to students who remain in college for both data collections. As a result, effect size estimates may be downwardly biased by range restriction, a statistical artifact that results from artificially reduced variance in the outcome of interest (e.g., Bobko et al., 2001). In addition, students who initially score low on critical thinking may simply have more room to improve than their high-ability counterparts. The degree to which this is problematic largely depends on the research question of interest. If a researcher wants to know the effect of college on students who stay in college, then the range restriction issue is not a concern. 
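The selective-dropout confound described above can be illustrated with a small simulation. This is a hypothetical sketch, not an analysis from the study: the true gain, dropout rule, and noise level are all assumed values chosen for illustration.

```python
# Illustrative simulation of how selective dropout inflates a cross-sectional
# estimate: if low-scoring students leave before the posttest, comparing all
# entering students to persisting students overstates the true average gain.
# All parameters are hypothetical.
import random
import statistics

random.seed(42)

TRUE_GAIN = 0.30        # true average gain, in pretest-SD units (assumed)
DROPOUT_CUTOFF = -0.5   # students below this pretest score drop out (assumed)

pre = [random.gauss(0, 1) for _ in range(100_000)]
post = [x + TRUE_GAIN + random.gauss(0, 0.5) for x in pre]

# Cross-sectional comparison: full entering cohort vs. persisters at posttest
stayers_post = [p for x, p in zip(pre, post) if x > DROPOUT_CUTOFF]
cross_sectional_d = (statistics.mean(stayers_post)
                     - statistics.mean(pre)) / statistics.stdev(pre)
```

Under these assumptions the cross-sectional estimate lands well above the true 0.30 SD gain, because low scorers contribute only to the pretest mean before leaving, exactly the inflation mechanism described above.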
However, it should be recognized that longitudinal effect sizes are likely to be larger if the dropout rate is reduced. Another somewhat worrisome finding is that observed gains in critical thinking appear to have deteriorated over time despite increased interest in fostering critical thinking skills. This finding is not conclusive evidence that college educations have declined in quality because there are a number of possible explanations for the observed effect. First, it is worth noting that a large amount of residual variance in effect sizes remained after our moderators were accounted for. Thus, publication year may be acting as a proxy for some missing variable that also changed over time. Such a variable could either directly affect gains on critical thinking or affect the measurement of gains. For example, changes in curricula or student behaviors could bring about reductions in the effectiveness of college (e.g., Terenzini et al., 1995). On the other hand, changes in study design or the quality of studies might affect observed effect sizes in ways not accounted for by our single design moderator. Such differences would be artifactual rather than representing true changes in how well critical thinking is learned. It is worth noting that our estimated 4-year gain for a study published in 1963 is 1.22 SDs. Such an effect size is perhaps suspiciously large and may partially reflect lower standards for research design and implementation. Another potential explanation is that students are now coming to college with a reduced readiness to learn critical thinking skills. One reason for this may be that students have increasingly learned more critical thinking skills before entering college. If the skills taught in college are already present in a greater proportion of students, then overall gain scores should be reduced. 
Alternatively, college attendance has increased over time, and many new students may not be sufficiently prepared to learn more complex reasoning skills. A final possibility is that students have become less willing or able to learn critical thinking skills over time. As of now, these possibilities are mere speculation. Further research is needed to determine the true causes of this phenomenon.

Implications

Although college education may lag in other ways, it is not clear that more time and resources should be invested in teaching domain-general critical thinking. For a specific group of individuals who already possess above-average cognitive abilities, a gain of 0.59 SDs on a purportedly general ability is quite impressive (comparable to going from the 50th percentile to the 72nd percentile). The effect of college on critical thinking is larger than the average effect of educational variables on academic achievement (0.40 SD) and even rivals the effect of disposition toward learning (0.61 SD; Hattie, 1992). Put differently, college appears to produce critical thinkers about as well as motivation produces good students. College also appears to foster more favorable dispositions toward critical thinking. As an attitudinal construct, critical thinking disposition is arguably even less trainable than critical thinking skill. However, it does appear that the college experience can have a substantial impact. Our 4-year gain estimate of 0.55 SDs is not markedly different from Pascarella and Terenzini’s (2005) estimate of 0.50 (although there is some overlap between our disposition samples). This finding is particularly important because critical thinking disposition may be the only domain-general form of critical thinking, in that a willingness or desire to question and critique is clearly applicable across settings. The average increase of over half an SD on this general disposition is encouraging.
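The percentile comparisons used above follow from the standard normal cumulative distribution: a gain of d SDs moves a student who starts at the 50th percentile to the percentile Φ(d). A minimal sketch, assuming normally distributed test scores:

```python
# Convert an effect size (gain in SDs) to the expected percentile rank of a
# student who starts at the 50th percentile, assuming normal scores.
from math import erf, sqrt

def percentile_after_gain(d):
    """Percentile reached from the 50th percentile after a gain of d SDs."""
    return 100 * 0.5 * (1 + erf(d / sqrt(2)))  # standard normal CDF, scaled

print(round(percentile_after_gain(0.50)))  # -> 69 (Pascarella & Terenzini's estimate)
print(round(percentile_after_gain(0.59)))  # -> 72 (overall skill gain here)
print(round(percentile_after_gain(0.55)))  # -> 71 (disposition gain here)
```

This reproduces the 69th- and 72nd-percentile figures quoted above and shows that the 0.55 SD disposition gain corresponds to roughly the 71st percentile.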
When considering educational interventions, the amount of value added relative to other potential investments must be a central consideration. Time spent teaching critical thinking is time not spent teaching other things, such as reading, writing, mathematics, and profession-specific knowledge. If our efforts to foster critical thinking are inadequate, then the same surely holds for these domains in which observed gains are similar to gains in critical thinking. Pascarella and Terenzini (2005) estimate 4-year gains of 0.77 SDs, 0.62 SDs, and 0.55 SDs for English, science, and mathematics skills, respectively, which are similar to our estimates for critical thinking gains. It is also clear that there is significant room for improvement in these fundamental competencies. Students in the United States score only around the average among Organization for Economic Co-operation and Development (2010) member countries on reading and science skills, and they are actually below average at mathematics. Given the skill deficiencies that exist in multiple areas within the labor force (Galagan, 2010), we must carefully consider where we invest educational resources. It is unlikely that additional investment in domain-general critical thinking will provide a solution to our problems. Our analysis of nursing samples failed to find any long-term advantage of the NLN’s critical thinking requirement; nursing students simply did not improve more than their nonnursing counterparts. Although Abrami et al. (2008) found an average effect size of 0.34 for critical thinking interventions, the nursing data suggest that such interventions may ultimately have little incremental impact above and beyond the gains that naturally occur over the span of college.

Limitations and Future Directions

The central limitation of the literature we synthesize is the inability to make clear causal conclusions, a limitation that is problematic in two ways.
First, the studies reviewed do not distinguish the effects of college from ordinary maturation effects, a persistent problem in this body of research (Pascarella & Terenzini, 1991). There is some evidence that the observed effects are largely due to college. Pascarella and Terenzini (1991, 2005) reviewed studies that control for maturation and other confounds, and they concluded that college still produces significant effects. However, it may still be the case that critical thinking increases naturally with age and that some of the observed changes occur independently of college education. Of course, a true experimental design is still lacking in the literature. Since it is not realistic to randomly assign participants to attend or not attend college, a degree of uncertainty is likely to persist. A second issue is that nursing differs from other majors in ways other than attention to critical thinking. One must consider the possibility that other aspects of nursing education or nursing students could cause differential gains in critical thinking skills. If some unknown feature of nursing programs were to suppress critical thinking, then the effect of the curricular difference might be masked. In other words, nursing programs’ additional focus on critical thinking could actually be alleviating a relative deficit that would otherwise exist. Therefore, we cannot rule out the possibility that a critical thinking curriculum could produce lasting incremental gains. At the very least, our study simply shows that such gains—if they exist—are not readily apparent despite a large literature devoted to searching for them. As previously mentioned, it is not highly feasible to randomly assign participants to a long-term critical thinking curriculum. Therefore, the analysis of nonequivalent comparison groups is arguably the best evidence that is presently available. 
The evidence from nursing samples casts doubt on the amount of value added by explicitly training domain-general critical thinking in college. The critical thinking literature could benefit from a change in focus to incorporate domain-specific critical thinking. Our literature search revealed relatively few studies that measured critical thinking in a specific content domain and even fewer that compared specific and general measures. Most of those we did find were tests of critical thinking in nursing or psychology. Without sufficient studies of critical thinking in other domains, we are unable to draw generalizable conclusions about changes in critical thinking skills. If the logic items on the Watson-Glaser and other common tests represent only one domain out of many in which one can think critically, then the current literature has largely ignored a crucial aspect of critical thinking. It is plausible that domain-specific measures would show stronger gains in college and track better with important outcomes. For example, Renaud and Murray (2008) compared gains on domain-general and domain-specific critical thinking tests following a brief intervention. Students in both experimental and control conditions read a passage about personality theory. The experimental group then completed critical thinking questions about the passage, whereas the control group completed simple recall questions. Participants in the experimental group showed larger gains in domain-specific, but not domain-general, critical thinking. Another experiment by Williams, Oliver, and Stockdale (2004) indicated that psychology students showed significant gains on a measure of psychological critical thinking when critical thinking practice was incorporated into the course. However, there were not significant gains on the Watson-Glaser for either experimental or control groups. 
Williams (2003) found that students in an educational psychology course showed larger gains on psychological critical thinking than on domain-general critical thinking. In particular, students who received high grades in the course improved more on the domain-specific test. This finding suggests that changes in domain-specific critical thinking may be related to mastery of that domain. The domain-specificity hypothesis could also explain the failure of nursing programs to produce larger gains in critical thinking through explicit instruction. The type of critical thinking infused into current nursing curricula may not be captured well by traditional measures. General critical thinking inventories measure the ability to employ a specific set of logical rules, but these rules are not necessarily the ones used to think critically about a patient’s condition or the appropriateness of a treatment. Nursing curricula may focus on teaching critical thinking rules that are useful to nurses but not as useful for increasing scores on the Watson-Glaser. Alternatively, nursing education may not be conducive to retention of the skills taught in a general critical thinking lecture. Skills are typically more likely to be retained if they are practiced (Campbell & Kuncel, 2001), and it is unlikely that the day-to-day experience of nursing education affords much explicit practice at recognizing post hoc fallacies or using modus ponens. The domain-specificity hypothesis suggests that critical thinking skills taught in one domain (e.g., formal logic) are unlikely to transfer well to another (e.g., nursing; McPeck, 1984). Under this paradigm, it is unsurprising that nursing students would fail to apply (and thus retain) the ability to analyze formal arguments. The retention and application of critical thinking gains beyond college also require further study.
The present study demonstrates that college students learn critical thinking skills, but this does not guarantee that they retain these skills long after college or apply them in other contexts (Campbell & Kuncel, 2001; Schmidt & Bjork, 1992). Our search did not reveal any studies that followed up with college graduates to determine their levels of critical thinking skill or disposition later in life. If critical thinking skills are not practiced as frequently after graduation, they may diminish over time. Similarly, it is likely that the disposition toward critical thinking is influenced by the norms and environment of college, which generally promote open-mindedness and other intellectual virtues. It may be the case that a departure from the college environment would be associated with regression toward the mean of critical thinking dispositions. To the best of our knowledge, the above possibilities have never been formally tested.

Conclusion

Our initial results argue against investing additional time and resources in teaching domain-general critical thinking. Although the set of specific skills measured by critical thinking tests is important, spending more time on them involves trade-offs with other important skills. The evidence suggests that basic competencies such as reading and mathematics are more amenable to improvement beyond the gains currently observed, and the need for improvement there is arguably more pressing. In addition, critical thinking in major-related domains may be a more practical target for instruction than the kind of critical thinking measured by domain-general tests, although further research is needed to explore this possibility. Regardless, our findings should not be a cause for pessimism about the future of critical thinking in higher education. On the contrary, the present study has demonstrated that college is already effective at fostering critical thought, leaving more resources free to pursue other educational goals.

Appendix A: Sensitivity Analyses

Table A1. Meta-analysis of critical thinking skill studies using alternate imputed standard deviations

Table A2. Meta-analysis of critical thinking disposition studies using alternate imputed standard deviations

Notes

1. It is worth noting that although the authors provide “best estimates” for these effect sizes, their exact methodology for arriving at these estimates is unclear. They appear to use meta-analytic techniques (Pascarella & Terenzini, 2005, pp. 12, 150). However, they provide a narrative review of several cross-sectional and longitudinal studies and then simply report an overall effect size estimate. Some of these studies do not contain the information necessary to compute an effect size, but the authors attempt to estimate one regardless (p. 157).
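To illustrate why unreported statistics block effect size computation, the sketch below shows the kind of standardized mean difference (Cohen's d with a pooled standard deviation) that meta-analytic aggregation typically requires; the function and the sample values are hypothetical illustrations, not the authors' method or data from any reviewed study.

```python
import math

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference between two groups or time points."""
    # Pooled standard deviation, weighting each group's variance by its
    # degrees of freedom. Without the means and SDs, none of this can
    # be computed, which is the gap the note describes.
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )
    return (m2 - m1) / pooled_sd

# Hypothetical pre/post critical thinking scores: a 5-point gain
# against a pooled SD of 10 yields d = 0.50.
d = cohens_d(m1=50.0, sd1=10.0, n1=100, m2=55.0, sd2=10.0, n2=100)
```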

2. Two longitudinal studies had unequal sample sizes at different times due to incomplete data for some participants (Ewen, 2001; Johnson, 2002). For these studies, we used the average of the Time 1 and Time 2 sample sizes as the overall sample size estimate.
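The averaging just described amounts to a simple arithmetic mean of the two waves' sample sizes; the function name and numbers below are ours, for illustration only.

```python
def average_n(n_time1, n_time2):
    # Arithmetic mean of the Time 1 and Time 2 sample sizes, used when
    # attrition leaves the two waves of a longitudinal study with
    # unequal n's.
    return (n_time1 + n_time2) / 2

# e.g., 120 participants at Time 1 with 100 retained at Time 2
n_overall = average_n(120, 100)  # 110.0
```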

3. The nonnursing category includes the following majors and categories of major: unspecified/mixed, humanities, liberal arts, science, mathematical and social sciences, social sciences, health science, business, psychology, engineering, architecture/architecture engineering, physical therapy, and prehealth science.