“Quantifying the consensus on anthropogenic global warming in the scientific literature,” by John Cook and a large number of contributors to the website Skeptical Science (Cook et al 2013), looked at 11,944 papers over a 21-year period and assigned each to one of three categories on the basis of the papers’ abstracts: endorse, reject, or take no position on the consensus. Of the papers that either endorsed or rejected the consensus, 97.1% of the papers and 98.4% of the papers’ authors endorsed the consensus. In addition, 1,200 authors of the analyzed papers were contacted and asked to self-rate their own papers for level of endorsement. Of the self-rated papers that either endorsed or rejected the consensus, 97.2% of the papers and 96.4% of the authors endorsed the consensus.

Cook et al 2013 represents the largest study to date of the consensus among the scientific community regarding the industrial nature of climate disruption (where human activity, primarily the burning of fossil fuels, is the dominant cause of the observed global warming). Prior studies such as Doran and Zimmerman 2009 and Anderegg et al 2010 had found that approximately 97% of climate experts and “super-experts” agreed that climate disruption was caused by human activity. However, some critics attacked those studies for small sample sizes (Doran and Zimmerman 2009) or for using Google Scholar (Anderegg et al 2010) instead of the “official” scientific database, the ISI Web of Science. Cook et al 2013 addresses both criticisms by using a large sample of 11,944 papers from 1,980 different journals and by using only peer-reviewed papers identified in the ISI Web of Science.

Cook et al 2013 explains why so many abstracts take no position on the consensus. Specifically, when a controversial subject has been accepted and is no longer controversial, scientists move on to other subjects and no longer feel the need to explicitly endorse the consensus position. For example, scientists no longer argue about the general accuracy of the law of gravity, so there’s no point in restating why they think that gravitation applies except in unusual cases. Add the fact that abstracts are usually strictly limited in length, and spending a few extra words to explicitly endorse the scientific consensus on climate disruption is a luxury most abstracts can’t afford.

The authors who responded to the request to self-rate their papers provide additional clarity to the abstract-only ratings performed by Cook et al 2013. First, the authors made their ratings based on the entire paper, not just the abstract, and so they are better positioned to judge whether or not their paper endorses the consensus. Second, the self-ratings also provide a way to measure how much effect rating only the abstract has on the results, and the impact is significant. Cook et al 2013 compared the self-rated papers directly with the abstract-rated papers and found that the share of endorsing papers increased from 36.9% in the abstract-only ratings to 62.7% in the author self-ratings (see Cook et al 2013 Table 5 for more information).

And third, the self-rated papers provide some evidence that the large number of papers categorized as “no position” are categorized that way because the consensus position is no longer controversial. If the position that human activity is the dominant driver of climate disruption were still controversial among scientists, then that would be more likely to be stated in the abstract.

There are a few main areas of uncertainty in Cook et al 2013. The first is the aforementioned issue with short abstracts, but as mentioned above, the self-rating process minimizes this concern. The second is that a “crowdsourcing” methodology with predefined categories is still ultimately subjective and could be influenced by the biases of the reviewer. However, this effect was minimized by using multiple reviewers and by the self-rating scheme. Possible biases toward the consensus position are ruled out by the fact that self-rated papers were more likely, not less, to endorse the consensus. In addition, a possible bias by the abstract reviewers toward the “no position” category was analyzed and found to have minimal effect on the final results.

The third and final uncertainty is whether or not the papers selected are representative of the overall climate literature. The large sample size (11,944 papers) is suggestive of representativeness (the larger the sample, the more likely it is to be representative), but doesn’t guarantee it. As Cook et al 2013 points out, there are nearly 130,000 papers with the keyword “climate” in the ISI Web of Science.

However, the highly skewed results of Cook et al 2013 strongly suggest that the results are broadly applicable. The more skewed the results are, the smaller the sample size needs to be in order to accurately deduce the opinions of a population. As I demonstrated in this response to Joe Bast, President of The Heartland Institute, the results of Doran & Zimmerman 2009 had a margin of error of only 3.5% (for a hypothetical sample size of 100,000 scientists). Alternatively, Doran & Zimmerman 2009 could have statistically deduced a 97% consensus using only 39 respondents, not the 79 they actually had.
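The Doran & Zimmerman 2009 figures above can be roughly checked with the standard normal-approximation formulas for a proportion. This is only a sketch under assumptions that the post does not state: a 95% confidence level (z = 1.96), an observed endorsement share of 97.4%, and a target margin of error of 5% for the minimum sample size. These values were chosen because they happen to be consistent with the quoted 3.5% and 39; the function names are illustrative. The finite-population correction for a hypothetical 100,000 scientists is left out because it changes the result by well under a tenth of a percentage point at this sample size.

```python
import math

Z_95 = 1.96  # assumed 95% confidence level (not stated in the post)


def margin_of_error(p, n, z=Z_95):
    """Normal-approximation margin of error for a proportion p from n responses."""
    return z * math.sqrt(p * (1.0 - p) / n)


def required_sample_size(p, margin, z=Z_95):
    """Smallest n whose margin of error is at most `margin` for proportion p."""
    return math.ceil(z ** 2 * p * (1.0 - p) / margin ** 2)


p_endorse = 0.974      # assumed observed share; the post rounds this to 97%
n_respondents = 79     # number of respondents quoted in the post

moe_pct = round(100 * margin_of_error(p_endorse, n_respondents), 1)
min_n = required_sample_size(p_endorse, margin=0.05)  # assumed 5% target margin

print(f"margin of error: +/- {moe_pct}%")   # +/- 3.5%
print(f"minimum respondents: {min_n}")      # 39
```

The required sample size scales with p(1 - p), which is why a lopsided result needs far fewer respondents than an evenly split one: at p = 0.5 the same 5% target would need several hundred responses.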

The results of Cook et al 2013 are even stronger because the sample size is so much larger. Cook et al 2013 found that 98.4% of the authors of the 4,014 papers that endorsed or rejected the consensus endorsed it. That’s 10,188 authors vs. 168. If we assume that there are 100,000 authors publishing on climate disruption topics globally, then the results of Cook et al 2013 have a confidence level of 99.9% and a margin of error of +/- 0.48%. Increasing the number of climate authors to 1 million results in a margin of error at 99.9% confidence level of +/- 0.51%.
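For readers who want to experiment with the author counts quoted above, here is a minimal sketch of a margin-of-error calculation with a finite-population correction. The post does not say which formula or calculator produced its figures, so the normal approximation, the z value of 3.2905 for 99.9% confidence, and the function name are all assumptions, and the margins this prints need not match the +/- 0.48% and +/- 0.51% quoted above. What it does illustrate is the qualitative point: growing the assumed population tenfold barely moves the margin.

```python
import math

Z_999 = 3.2905  # two-sided z value for a 99.9% confidence level (assumed method)


def margin_of_error_fpc(p, n, population, z):
    """Normal-approximation margin of error with finite-population correction."""
    standard_error = math.sqrt(p * (1.0 - p) / n)
    fpc = math.sqrt((population - n) / (population - 1))
    return z * standard_error * fpc


endorsing_authors = 10188   # from the post
rejecting_authors = 168     # from the post
n_authors = endorsing_authors + rejecting_authors
p_endorse = endorsing_authors / n_authors

print(f"share endorsing: {100 * p_endorse:.1f}%")  # 98.4%

# Hypothetical population sizes, as in the post.
moe_100k = margin_of_error_fpc(p_endorse, n_authors, 100_000, Z_999)
moe_1m = margin_of_error_fpc(p_endorse, n_authors, 1_000_000, Z_999)

print(f"margin at 100,000 authors:   +/- {100 * moe_100k:.2f}%")
print(f"margin at 1,000,000 authors: +/- {100 * moe_1m:.2f}%")
```

Under these assumptions both margins stay below one percentage point, and the larger population gives the slightly larger margin because the correction factor approaches 1.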

Every serious survey of the expert opinion of climate scientists regarding the causes of climate disruption has found the same thing: an overwhelming majority of climate scientists agree that climate disruption is dominated by human causes. Cook et al 2013 won’t be the final word on the subject by any means, but if “it’s not over until the fat lady sings,” we can fairly say that Cook et al 2013 indicates that she’s started to inhale.

UPDATE

I’ve been thinking about this paper a bit more and I have a few more thoughts about it that I didn’t include above.

First, in the discussion about sources of uncertainty in the analysis, Cook et al 2013 discusses the representativeness of the sample. But something that I can’t find in either the paper or the Supplementary Information is a discussion of the representativeness of the paper authors who responded to requests to self-rate their own papers. Generally speaking, people who respond to polls are the ones most energized by the questions being asked, so we could reasonably expect that the scientists who responded would be the most likely to either endorse or reject the consensus. But it’s a relatively minor point.

Second, I feel that there was insufficient explanation of the 66.2% of abstracts that were rated “no position.” I would have preferred a few more sentences explaining why scientists don’t explicitly endorse or reject a consensus position, or maybe some attempt on the part of the authors to estimate the degree of consensus among the “no position” abstracts. For example, an analysis could have been done to cross-reference authors of the “endorsing” abstracts with co-authors in the “no position” abstracts and in the process develop a subcategory of “endorsement via co-authorship.” Or a bit more time could have been spent on the Shwed and Bearman 2010 study, which Cook et al 2013 references but doesn’t explain in much detail.

Shwed and Bearman 2010 looked at five historical (20th century) cases where a scientific consensus developed, including industrial climate disruption, and analyzed citation networks among peer-reviewed studies over time. What they found was that, as a consensus developed, more and more papers cited a common core of studies that formed the nucleus of the consensus. In addition, Shwed and Bearman 2010 found that consensus leads to a dramatic increase in the number of publications, even as the number of references to the seminal studies remains constant. They describe the rationale as follows:

If consensus was obtained with fragile evidence, it will likely dissolve with growing interest…. If consensus holds, it opens secondary questions for scrutiny.

Essentially, once a consensus on the “big questions” is reached, scientists are free to dive into the details and argue over those instead.

The Shwed and Bearman 2010 analysis found that industrial climate disruption hit this consensus point sometime around 1991, by the way.

There is a lot of work that could be done still with the Cook et al 2013 dataset. I look forward to reading more about it.

Here’s a short list of links to several other sites and news articles about this study: