The publication ‘Learning from mistakes in climate research’ is the result of a long and winding story with a number of surprises. At least to me.

I have decided to share this story with our readers, since in some respects it is closely linked with RealClimate.

At the core of this story is the reproduction and assessment of controversial results, and it has unfolded alongside this publication.

Almost at the same time, discussions from the session on reproducibility at the Royal Society meeting on the Future of Scholarly Scientific Communication were released. The similarities may suggest that my story is not unique.

The story I want to share started in 2012, after a response to my blog post here on RealClimate and a plea to publish a formal rebuttal of Humlum et al. (2011).

Rather than assessing one specific paper, however, I wanted to know: why are there conflicting answers concerning climate change in the scientific literature?

So what is the best strategy for answering this question? I started off by replicating the past analyses, and both the results and the code (for Mac/Linux and Windows) for doing the analysis have been made openly available on Figshare.com. It is important that the replication itself is also replicable.

I also managed to assemble a team for writing this paper, including people from SkepticalScience.

I was naive at first, thinking that we could persuade readers by providing open source code and a detailed description of why a study is invalid. But it is not uncommon for the publication process to be long-winded, as Bill Ruddiman explains in A Scientific Debate.

We first submitted our work to a journal called ‘Climate Research’.

The opinion of one of the reviewers on our manuscript was “profoundly negative”, with a recommendation to reject it (29 June 2012):

“The manuscript is not a scientific study. It is just a summary of purported errors in collection of papers, arbitrarily selected by the authors.”

But what does it mean to be a “scientific study”? The term is perhaps not very well defined, and a Google search only returned one hit, with a vague description on Wikipedia.

A clue about the alleged lack of science could be gleaned from the reviewer’s comment:

“It is also quite remarkable that all the papers selected by these authors can be qualified in some way or another as papers that express skepticism to anthropogenic climate change. I wonder why is this so?”

The same reviewer also observed that “I guess that any one of us could collect their own favorite list of bad papers. My list would start with the present manuscript”, and remarked that “It may be published in a blog if the authors wish, but not in a scientific journal”.

That’s an opinion, and perhaps a reaction caused by expecting something different from what our paper had to offer. Apparently, our paper did not fit the traditional format:

“This manuscript itself is not a research paper but rather a review, more in the style that can be found nowadays in internet blogs, as the authors acknowledge.”

Because we disagreed with the view of the anonymous reviewers, we tried the journal ‘Climatic Change’ after some revisions (26 July 2012).

When the verdict came, we were informed that it had been very difficult to decide whether to accept or reject: “We have engaged our editorial team repeatedly (as well as a collection of referees), and the decision was not unanimous.”

The manuscript was rejected, and we were in for a surprise when we learned the reason the editor gave us:

“Nonetheless, we have agreed with reviewers who have offered some serious sources of concern and have not been persuaded (one way or the other) by blog conversations. Some of the issues revolve around the use of case studies; others focus on the appropriateness of criticizing others’ work in a different journal wherein response would not be expected.”

We were not entirely discouraged as the editor said the board was “intrigued” by our arguments, and suggested trying a different journal.

After rejection by ‘Climatic Change’, we decided to try an open discussion journal called ‘Earth System Dynamics Discussion’ (ESDD), where the manuscript and supporting material were openly accessible and anybody could post comments and criticism.

It was from the discussion on ESDD that I learned that the critical reviewer swaying the decision at ‘Climate Research’ or ‘Climatic Change’ had been Ross McKitrick (link). He was an author of one of the contrarian papers in the selection we had replicated.

McKitrick had apparently not declared any conflict of interest, but had taken on the role of gatekeeper, despite having accused climate scientists of doing the same in connection with the hack and the so-called “ClimateGate” incident:

“But academics reading the emails could see quite clearly the tribalism at work, and in comparison to other fields, climatology comes off looking juvenile, corrupt and in the grip of a handful of self-appointed gatekeepers and bullies”.

Nevertheless, ESDD (3 May 2013) turned down our manuscript for publication in the final and formal version, ‘Earth System Dynamics’ (ESD). The editor of ESD saw several problems with our manuscript: its “structure”, since there was “no paper in this paper” and “no actual science (hypothesis, testing of a hypothesis) in the main body”; that the case studies were presented in an “inflammatory and insufficiently supported fashion” and reflected the “authors’ stated opinion”; and that R-scripts do “not reveal mistakes”.

I guess we had not explained the objective of our paper carefully enough, and again, the editor expected a different type of paper. We also thought the verdict from ESDD was unfair, but it is not uncommon for authors and editors to hold different opinions during the reviewing process.

The paper was revised further according to many of the reviewers’ comments, explaining some of the critical points more carefully and taking the counter-arguments into account. Since it didn’t fit the format of ESD, we thought it could be more suitable for the journal ‘Nature Climate Change‘ (6 February 2014).

It was rejected, surprisingly, despite quite positive reviews.

One reviewer thought it had “potential of being a very important publication”, and found the “information in the Supplementary Material to be important, compelling and well-presented”.

We were pleased by the view that it was “an important contribution to the climate science debate”, but were intrigued by the response that it was “unlike any other paper I have been asked to review in the past”.

Another comment also suggested that our work was fairly unique: “Reviewing the manuscript has been an unusual assignment”. Nevertheless, the tone was quite positive: “The manuscript is clearly written”, noting the effort going into reproducing the controversial analyses: “extensive discussion of the specific replication attempts in the supporting material, including computer code (written in R) that is available to the reader.”

But another reviewer took a different stance, thinking our manuscript was “poorly written” and that it failed “to adequately capture the importance of the project or convey its findings in an interesting and attractive manner”.

Since the reviews were, in general, quite encouraging, we decided not to give up. We thought our paper could be suitable for another journal, ‘Environmental Research Letters’ (ERL).

The rejection from ERL came fairly promptly (21 February 2014) with the statement that our paper was “intriguing” but “not sufficiently methodologically based for consideration as an ERL letter”.

The reviewer thought the manuscript was not suitable for the journal, as it was “more of an essay than a scientific study”, and recommended some kind of perspective-type outlet. The reasons for rejection were: (a) “not a research Article in the ERL style” and (b) a “number of methodological concerns”.

It appeared that the greatest problem with our paper was again its structure and style rather than its substance, however. I therefore contacted the editor of the journal ‘Theoretical and Applied Climatology’ (TAAC) in May 2014 and asked if they would be interested in our manuscript.

TAAC was interested, and the paper has now been published (Benestad et al., 2015).

So what was so special about our manuscript that it was “intriguing” and “unlike any other paper”, yet did not fit the profile of most of the journals we tried?

Not only was Humlum et al. (2011) rebutted, but we had examined 38 contrarian papers to find out why different efforts give conflicting results. We drew the line at 38 papers, but could have included more.

Many of these have been discussed here on RealClimate and on Skeptical Science, and they form the basis for output from think tanks such as the Heartland Institute, e.g. the “NIPCC” reports. The relevance for our readers is that many of these papers have now been formally rebutted.

We had been up-front about our work not being a statistical study, because it did not involve a random sample of papers. If we were to present it as a statistical study, it would itself be severely flawed, as it would violate the requirement of random sampling.

Instead, we made a targeted selection in order to find out why these papers got different answers, and the easiest way to do so was to select the most visible contrarian papers.

Of course, we could have replicated papers following the mainstream, as pointed out in some comments, but that would not address the question of why there are different answers.

The important point was also to learn from mistakes. Indeed, we should always try to learn from mistakes, as trial and error often is an effective way of learning. There must also be room for disagreement and scholarly dialogue.

Our selection suited this purpose, as it would be harder to spot flaws in papers following the mainstream ideas. The chance of finding errors is higher among the outliers than among more mainstream papers.

Our hypothesis was that each chosen contrarian paper was valid, and our approach was to try to falsify this hypothesis by repeating the work with a critical eye.

Colleagues can know exactly what was done in our analyses, and how the results were reached, thanks to the open-access data and code (R-scripts) that we provided. Together they provide the recipe behind the conclusions.

If we could find flaws or weaknesses, then we would be able to explain why the results were different from the mainstream. Otherwise, the differences would be a result of genuine uncertainty.
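This kind of replication check can be illustrated with a minimal, hypothetical sketch. The study's actual analysis scripts were R-scripts available on Figshare; the snippet below is a Python analogue, and every number in it (the synthetic data series, the `reported_trend` value, the tolerance) is an assumption for illustration only, not taken from any of the papers:

```python
# Hypothetical sketch of a replication check (illustration only; the study's
# real analyses used R-scripts). All numbers here are made up.
import numpy as np

def replicate_trend(years, values):
    """Fit an ordinary least-squares linear trend and return it per decade."""
    slope, _intercept = np.polyfit(years, values, 1)
    return slope * 10.0

# Synthetic stand-in for a published data series: 0.2 units/decade plus noise.
rng = np.random.default_rng(42)
years = np.arange(1980, 2011)
series = 0.02 * (years - 1980) + rng.normal(0.0, 0.1, years.size)

reported_trend = 0.20   # hypothetical value claimed by the paper under review
replicated = replicate_trend(years, series)

# If the replication disagrees beyond a chosen tolerance, we look for a flaw
# in the method; if it agrees, remaining differences reflect genuine uncertainty.
consistent = abs(replicated - reported_trend) < 0.05
print(f"replicated trend: {replicated:.3f} per decade, consistent: {consistent}")
```

Publishing such a script alongside the paper means anyone can rerun the comparison and inspect exactly where a replication diverges from the original claim.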

Everybody makes mistakes and errors sometimes, but progress is made when we learn from trial and error. A scientist’s job is to be as to-the-point and clear as possible, not to cosy up to colleagues. So, is it really “inflammatory” to point out weaknesses in other analyses?

Hence, the similarities and dissimilarities between the contrarian papers were a main subject of our study: are there any similarities between these high-profile “contrarian” papers other than being contrarian?

So what were our main conclusions?

After all this, the conclusions were surprisingly unsurprising in my mind. The replication revealed a wide range of types of errors, shortcomings, and flaws involving both statistics and physics.

It turned out that most of the authors of the contrarian papers did not have a long career within climate research. Newcomers to a scientific discipline may easily err, because they often lack tacit knowledge and do not have a comprehensive overview. Several of them had authored more than one of the papers, and many failed to cite relevant literature or to include relevant and important information.

The motivation for the original plea for a formal rebuttal paper was that educators should be able to point to the peer-reviewed literature “to fight skepticism about the fundamentals on climate change”.

Now, educators can also teach their students to learn from mistakes through replication of case studies.

The important question to ask is: where does the answer or information come from? If a result is universally true, then anybody should get similar answers. It is important to avoid being dogmatic in science.

References