Despite this research on citation analysis (and an entire journal, Scientometrics, devoted to the topic), no studies have specifically examined citation patterns of rebuttals and their influence on citations of the original paper. To address this gap, we analyzed citation patterns of seven papers in our field (fisheries ecology and management) that have both attracted widespread attention and been the target of at least one rebuttal (summarized in Table 1). Our aim was to determine how effective these rebuttals have been in influencing the views of the scientific community.

To address this question, we analyzed citations of high‐profile papers and their rebuttals. The citation rate of a paper is often regarded as a measure of how important a paper is, since important papers will be cited by many other papers, while less important papers will seldom be cited. Citation counts can be used to compare the prominence of individual papers, scientists, and journals, although multiple factors influence citation rates. For example, citation rates of papers in ecology are influenced by the journal in which they are published, article length, study outcome (whether the hypothesis was accepted or rejected), the number of authors, their country, and their university affiliation (Leimu and Koricheva 2005). Papers also vary in how much influence they have when cited. In physics, fully 41% of citations are “perfunctory”: acknowledging that other studies have been conducted, but not contributing to the paper in which they are cited (Moravcsik and Murugesan 1975); while in marine biology one quarter of all citations were made inappropriately (Todd et al. 2010). These cautionary tales teach us not to rely too heavily on citation numbers alone when evaluating a particular paper, researcher, or journal.

How does science progress? A naïve view is that scientists propose new ideas and hypotheses and these are either accepted or rejected according to the evidence at hand. In practice it takes considerable evidence to cause the scientific community to abandon an established idea. Instead, as espoused by Thomas Kuhn, established ideas are continually modified to incorporate findings that appear to falsify their results, until these additions become untenable and a new hypothesis sweeps away the old in a scientific revolution (Kuhn 1962). Imre Lakatos viewed this debate in the light of entire research programs, arguing that hypotheses form the hard core of entire research programs, and are rarely eliminated by contrary evidence (Lakatos 1978). Lakatos proposed that competing research programs around rival hypotheses gain strength, while research programs surrounding the old idea degenerate and fade from popularity. These ideas about the progress of science revolve around active debate about the validity of scientific hypotheses, and raise questions about the role of rebuttals in refuting currently popular ideas. According to the Webster dictionary, a rebuttal “contradicts or opposes by formal legal argument, plea, or countervailing proof.” In science, a rebuttal may offer only an alternative interpretation of the original results, or refute only one part of a study. But in most cases, including the papers we examine here, rebuttals aim to highlight substantial flaws in published papers and act as the first line of defense after scientific research passes the review system. The question we examine is this: how successful are rebuttals at correcting scientific perceptions of the original articles?

There were three groups of papers that cited the original article: (1) the rebuttals, (2) papers that cited both the original article and the rebuttals, and (3) papers that cited the original article but not the rebuttal. To estimate the overall percentage of articles critical of the original paper (scored as a one or a two), we needed to combine our estimates of the proportion of critical articles within these three groups. We did this by dividing the number of critical citations by the total number of citations. Critical citations were estimated by adding up the number of rebuttals (critical by definition), critical papers that cited the original and the rebuttal (we scored all of these), and critical papers that cited the original but not the rebuttal (scaled up from our sample of 20 late period citations to the total number of citations in this group). The resulting equation calculates the estimated proportion of all citations that were critical of the original article:

$$P_{\text{crit}} = \frac{R + C_{OR} + (C_{O}/20)\,N_{O}}{N}$$

where $R$ = number of rebuttals, $C_{OR}$ = citations critical of the original article that cited the original and the rebuttal, $C_{O}$ = citations out of the sample of 20 late articles that were critical of the original article and that cited the original but not the rebuttal, $N_{O}$ = total citations of the original that did not cite the rebuttal, and $N$ = all citations of the original paper.
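The estimate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and argument names are ours, and the counts in the example are hypothetical, not values from the study.

```python
def proportion_critical(n_rebuttals, n_critical_both, n_critical_sampled,
                        n_original_only, n_total, sample_size=20):
    """Estimated share of all citations that are critical of the original.

    n_rebuttals        -- rebuttals (critical by definition)
    n_critical_both    -- critical papers citing both original and rebuttal
    n_critical_sampled -- critical papers found in the late sample
    n_original_only    -- all papers citing the original but not the rebuttal
    n_total            -- all citations of the original paper
    """
    if n_total <= 0 or sample_size <= 0:
        raise ValueError("n_total and sample_size must be positive")
    # Scale the critical fraction seen in the sample up to the whole group.
    scaled_critical = n_critical_sampled / sample_size * n_original_only
    return (n_rebuttals + n_critical_both + scaled_critical) / n_total


# Hypothetical counts, for illustration only.
example = proportion_critical(
    n_rebuttals=3, n_critical_both=10, n_critical_sampled=1,
    n_original_only=300, n_total=400,
)
print(f"estimated proportion critical: {example:.3f}")
```

With these made-up inputs the sampled group contributes 1/20 of 300 citations, so the scaled term dominates the numerator whenever most citing papers ignore the rebuttal.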

One final manner in which rebuttals could affect citation patterns would be a detectable decrease in citations of the originals after the publication of the rebuttals. This pattern could occur if some scientists avoided citing studies that have been seriously questioned. To investigate this possibility, we calculated the annual citations of the original article (the “observed” citation frequency), and compared this with annual citations of other articles published in the same year and journal with titles or abstracts referring to words with prefixes fish*, marine*, or ocean* (the “predicted citation frequencies”). The predicted citation frequencies were scaled to the total number of observed citations for comparability, and plotted to detect any deviations.
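The scaling step in this comparison might look like the following sketch. It assumes the observed and predicted series are lists of annual counts covering the same years; the function name and the numbers are hypothetical.

```python
def scale_predicted(observed, predicted):
    """Rescale predicted annual counts so they sum to the observed total."""
    if len(observed) != len(predicted):
        raise ValueError("series must cover the same years")
    total_predicted = sum(predicted)
    if total_predicted == 0:
        raise ValueError("predicted series has no citations")
    factor = sum(observed) / total_predicted
    return [p * factor for p in predicted]


# Hypothetical annual counts for one original article and its comparison set.
observed = [10, 20, 30]
predicted = [100, 100, 100]
scaled = scale_predicted(observed, predicted)
# Positive values mean the original was cited more than expected that year.
deviations = [o - s for o, s in zip(observed, scaled)]
print(scaled, deviations)
```

Because only the totals are matched, any year-by-year deviation reflects a difference in the shape of the citation curve rather than in overall popularity.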

Statistical significance was assessed, as before, using resampling methods. A counter was incremented if a resample mean from the early scores was lower than a resample mean from the late scores. The counter divided by a large number of trials (100,000) gives the resample P value. A P value less than 0.05 indicates that scores declined significantly over time.
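The early-versus-late comparison described above might be sketched as follows. This is a minimal illustration, assuming that each trial resamples both groups independently, with replacement, at their original sizes; the function name, the seed argument, and the example scores are hypothetical and not taken from the study.

```python
import random
from statistics import fmean


def resample_p_early_lower(early, late, trials=100_000, seed=0):
    """Fraction of trials in which the early resample mean is below the late one."""
    rng = random.Random(seed)
    counter = 0
    for _ in range(trials):
        early_mean = fmean(rng.choices(early, k=len(early)))
        late_mean = fmean(rng.choices(late, k=len(late)))
        if early_mean < late_mean:
            counter += 1
    return counter / trials


# Hypothetical scores on the one-to-five agreement scale.
early_scores = [5] * 19 + [4]
late_scores = [5] * 18 + [4, 3]
# Fewer trials than the 100,000 used in the text, to keep the demo fast.
p_value = resample_p_early_lower(early_scores, late_scores, trials=2_000)
print(f"resample P = {p_value:.3f}")
```

A small value means early resamples were rarely lower than late ones, which is what a decline in scores over time would produce.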

Citation patterns before and after the publication of the rebuttals were compared to test whether citing authors became more skeptical of the original articles over time, even if they did not cite the rebuttals. For this comparison, we scored citations of the original articles that did not cite a rebuttal, and whose authors did not include any predisposed authors. There were too many citations in this category for us to score every citation, so for each of the seven original papers, we randomly sampled 20 citing articles from the first year after the original was published (early group) and 20 from 2009 (late group). If there were fewer than 20 citations in the first year of publication, then citations were successively added from later years until the early group totaled 20. A similar procedure was used, progressively adding citations from years earlier than 2009 if there were fewer than 20 citations in the late group. As before, citations were scored according to the level of agreement with the original, with one indicating complete disagreement and five indicating complete agreement (Table 2). Citations that did not concern the controversial issue were given an N/A.
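One way to express the sampling procedure above is sketched below. It assumes that when a year must be topped up from an adjacent year, the extra citations are drawn at random from that year; the text does not say how these were chosen. The function name and the citation identifiers are hypothetical.

```python
import random


def sample_group(citations_by_year, start_year, step, size=20, seed=0):
    """Draw up to `size` citations, starting at start_year and moving by `step`.

    step=+1 builds the early group (adding later years as needed);
    step=-1 builds the late group (adding earlier years as needed).
    """
    if not citations_by_year:
        return []
    rng = random.Random(seed)
    first, last = min(citations_by_year), max(citations_by_year)
    chosen = []
    year = start_year
    while len(chosen) < size and first <= year <= last:
        pool = citations_by_year.get(year, [])
        needed = size - len(chosen)
        # Take everything from a short year, or a random subset of a long one.
        chosen.extend(rng.sample(pool, min(needed, len(pool))))
        year += step
    return chosen


# Hypothetical citing-paper identifiers grouped by publication year.
by_year = {
    1999: [f"1999-{i}" for i in range(5)],
    2000: [f"2000-{i}" for i in range(30)],
    2008: [f"2008-{i}" for i in range(25)],
    2009: [f"2009-{i}" for i in range(3)],
}
early_group = sample_group(by_year, start_year=1999, step=+1)
late_group = sample_group(by_year, start_year=2009, step=-1)
print(len(early_group), len(late_group))
```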

Since the scores do not follow a standard statistical distribution, resampling methods were used to assess the statistical significance of excluding rebuttal authors. The scores of all rebuttal citations were resampled with replacement, and a counter was incremented if the mean score of the resulting resample was higher than the observed mean score after excluding rebuttal authors. This process was repeated a large number of times (100,000). The resulting P value is the counter divided by 100,000. A similar method was used to compare the mean citation scores after excluding rebuttal authors with the mean scores after excluding both rebuttal and original authors.
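The one-sample resampling test described above could be sketched like this. It assumes each resample has the same size as the full set of rebuttal-citation scores, which the text does not state; the function name and the example scores are hypothetical.

```python
import random
from statistics import fmean


def resample_p_exceeds(all_scores, observed_mean, trials=100_000, seed=0):
    """Fraction of resample means that exceed the observed post-exclusion mean."""
    rng = random.Random(seed)
    n = len(all_scores)
    counter = 0
    for _ in range(trials):
        if fmean(rng.choices(all_scores, k=n)) > observed_mean:
            counter += 1
    return counter / trials


# Hypothetical scores: the first five come from rebuttal authors.
all_scores = [1, 1, 1, 2, 2, 3, 3, 4, 5, 5]
independent_scores = all_scores[5:]
observed = fmean(independent_scores)  # mean after excluding rebuttal authors
p_value = resample_p_exceeds(all_scores, observed, trials=2_000)
print(f"observed mean {observed:.2f}, resample P = {p_value:.3f}")
```

A small P value indicates that resamples of the full set rarely reach the post-exclusion mean, so removing the rebuttal authors raised the average by more than chance alone would suggest.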

We classified each citation of a rebuttal according to level of agreement with the original (Table 2), with a score of one for citations that agreed that the rebuttal refuted the original article, ranging up to a score of five if the citation implied that the original was correct and the rebuttal was in error. During the scoring process we were surprised to discover citations that implied that the rebuttal agreed with the original article, and created a new category six for these citations. (To maintain a score of three as neutral, scores of six were treated as five when calculating averages.) Citations that did not mention the controversial issue or did not cite the original article were scored as N/A. We excluded direct responses to a rebuttal by the original authors and replies to such responses. Since we suspected that our numerical scores would be influenced by whether the citing authors also had authored the original or one of the rebuttals, we classified citations according to whether the citing authors were independent from the original and rebuttal authors, were among the original authors, or were among the rebuttal authors. We will use the term “predisposed authors” to refer to citing authors who included any of the original authors or the authors of any of the rebuttals of a given original paper.
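The averaging rule and the definition of predisposed authors given above can be sketched as below. The sketch assumes N/A is represented as `None` and that authors are matched by exact name strings; both are our conventions, and all names in the example are hypothetical.

```python
def mean_agreement_score(scores):
    """Average score with N/A (None) dropped and category six counted as five."""
    usable = [min(score, 5) for score in scores if score is not None]
    if not usable:
        return None
    return sum(usable) / len(usable)


def is_predisposed(citing_authors, original_authors, rebuttal_authors):
    """True if any citing author also wrote the original or one of its rebuttals."""
    involved = set(original_authors) | set(rebuttal_authors)
    return bool(set(citing_authors) & involved)


# Hypothetical scores for five citations, one of them N/A.
average = mean_agreement_score([1, 3, 6, None, 5])
flagged = is_predisposed(["A. Smith", "B. Jones"], ["C. Lee"], ["B. Jones"])
print(average, flagged)
```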

We used the ISI Web of Science database to determine the average number of times that original and rebuttal articles were cited from the date of publication to November 2009. Since this measure is somewhat biased because the original articles have had a longer time to accumulate citations, we also calculated the number of citations per article per year for the originals and the rebuttals. We also examined the impact on citation rates of article length and impact factor of the journal in which rebuttals appeared, as these are correlated with citation rates (e.g., Leimu and Koricheva 2005).
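A per-year citation rate of the kind described above might be computed as in this sketch. The exact way elapsed time was measured is not given in the text, so the census date of 1 November 2009 and the 365.25-day year are assumptions, as are the function name and the example values.

```python
from datetime import date

# Assumed census date; the text says only "November 2009".
CENSUS_DATE = date(2009, 11, 1)


def citations_per_year(total_citations, published, census=CENSUS_DATE):
    """Citations divided by years elapsed between publication and the census."""
    elapsed_days = (census - published).days
    if elapsed_days <= 0:
        raise ValueError("article must be published before the census date")
    return total_citations / (elapsed_days / 365.25)


# Hypothetical article: 100 citations, published ten years before the census.
rate = citations_per_year(100, date(1999, 11, 1))
print(f"{rate:.2f} citations per year")
```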

Our study comprised five parts: (1) measure the overall impact of original articles compared to rebuttals using citation counts; (2) develop a metric for scoring citations along a continuum from rejection to uncritical acceptance of the original paper; (3) compare citation scores before and after rebuttals were published to see whether citations of the original articles became more critical after the publication of rebuttals; (4) examine the impact of rebuttals on total citation rates of the original articles over time; and (5) calculate the overall proportion of citations critical of the original paper, after the publication of the rebuttals.

Fig. 2. Impact of rebuttals on citations of the original article. Annual citations of the original article are shown in blue, and contrasted with the expected pattern of citations for all articles published in the same journal and year. Each rebuttal is depicted by a red cross in the year of publication of the rebuttal.

Citations per year of the original articles are similar to those expected from other articles published in the same journals (Fig. 2). There are no visible declines in citation numbers after rebuttals were published, and in fact the only major deviation from the predicted patterns occurred for Pauly et al. (1998), where citations actually increased after the rebuttals were published.

We scored 246 citations that cited the original article and not the rebuttal, and were not authored by any predisposed authors. Of these citations, fully 95% were assigned a score of five, implying whole‐hearted acceptance of the original article (Fig. 1). Additionally, although we had expected that among this group of citations there would be less support for the original article over time, in fact support was unchanged from the early to the late citations, with average scores actually increasing slightly from 4.86 to 4.93 (resample test for late score < early score, P = 0.84).

Fig. 1. Score breakdown for (a) papers that cited the original paper and the rebuttal, and (b) papers that cited the original paper but not the rebuttal (random sample of 20 early papers and 20 late papers for each original paper). “N/A” refers to citations that cited the original paper but did not refer to the controversial issue. Scoring criteria are given in Table 2.

For citations of the rebuttals, scores ranged from one (negative view of the original article) to five (positive view of the original article). The mean score of articles citing the rebuttals was 2.83, which increased significantly to 3.11 (resample test, P = 0.006) after excluding citations by authors of any of the rebuttals of corresponding original articles (Fig. 1). This average declined from 3.11 to 3.02 after excluding citations by authors of the original articles, although this decrease was not statistically significant (resample test, P = 0.19). Thus average rebuttal scores were almost exactly neutral. Amazingly, 8% of citations (scored as six) stated that the rebuttal supported the arguments made in the original article. A final point of interest is that fewer than half of the rebuttal citations referred to specific reasons for the rebuttal, and citations with higher scores listed reasons less frequently: only 26% of citations with scores of three, four, five, or six did so.

There were 2982 citations of the original seven articles and only 323 citations of all 24 rebuttals combined (Table 3). On average, there were 426 citations per original article, but only 13.5 citations per rebuttal. Accounting for years since publication did not change this pattern: the original articles averaged 17 times more citations per year than the rebuttals (48.0 vs. 2.9). If we exclude citations of rebuttals which did not cite the original article (indicating that the citing paper was not addressing the point of contention in the rebuttal), the difference is even more drastic: 48.0 vs. 1.6, or a factor of 30 (Table 4). Although all of the original articles appeared in Science or Nature, journals with high impact factors, rebuttals were actually cited more per year if they appeared in other journals than if they appeared in Science or Nature (3.5 vs. 2.0 cites per year). Some of this difference may be due to article length: rebuttals in lower tier journals averaged 7.2 pages compared to 1.7 pages for rebuttals in Science or Nature and 3.6 pages for the original articles. Although there was some influence of article length, it is clear that the huge difference between citation rates of originals and rebuttals was due primarily to their very nature as rebuttals.

Discussion

Our results provide strong evidence that rebuttals scarcely alter scientific perceptions about the original papers. For the seven fisheries papers we examined, the original articles were cited 17 times more frequently than the rebuttals, an order of magnitude difference that overwhelms other factors influencing citation patterns, such as time since publication, journal impact factor, and the length of the articles examined. The fact that all of the original articles present a conservation crisis may also be a factor in citation frequency, but could hardly explain such a huge discrepancy. Our test score results emphasize that rebuttals have little influence: even the rare few authors who happened upon the rebuttals were influenced only enough to move from whole‐hearted support of the original article (a score of five) to neutrality (a score of three), despite the fact that all of the rebuttals argue that the interpretations of data in the originals were incorrect. Astonishingly, 8% of the papers that cited a rebuttal actually suggested that the rebuttal supported the claims of the original article, an observation which may give pause to those contemplating writing a rebuttal in the future. For every article that cited the rebuttal, there were 17 that ignored the rebuttal and cited only the original, and among this silent majority, 95% uncritically accepted the findings of the original article. Thus for almost all scientists, except perhaps those that wrote the rebuttals, the existence of rebuttals had no influence on their perceptions of the original article.

Our overall estimate that only 5% of all citations are critical of the original articles is low compared to the 14% of citations in physics that disputed the correctness of the papers they cite (Moravcsik and Murugesan 1975). Our number is especially low given that we deliberately examined articles known to be in dispute; we suspect that biological articles lacking rebuttals are accepted with even less critical thought. This confirms our intuitive sense that most authors, except the relatively few who are writing and citing rebuttals, tend to accept a paper's conclusions uncritically.

For those convinced that science is self‐correcting, and progresses in a forward direction over time, we offer only discouragement. We had anticipated that as time passed, citations of the original articles would become more negative, and these articles would be less cited than other articles published in the same journal and year. In fact, support for the original articles remained undiminished over time and perhaps even increased, and we found no evidence of a decline in citations for any of the original articles following publication of the rebuttals. In one case, the opposite pattern was observed: citations of Pauly et al. (1998) at the end of the time period were increasing and were substantially higher than expected. Thus the pattern we observed follows most closely the hypothesis of competing research programs espoused by Lakatos (1978): in practice, research programs producing and supporting the views in the original papers remained unswayed by the publication of rebuttals; thus significant changes in these ideas will tend to occur only if these research programs decay and dwindle over time while rival research programs (sponsored by the rebuttal authors) gain strength. To some extent, then, the production of papers and rebuttals is aimed at incoming young scientists, to influence the future strength of competing scientific programs.

Perhaps we should not have been surprised that rebuttals are so seldom cited, and that the perceptions of original articles are little affected by rebuttals. Although no previous studies have been conducted on rebuttals, which are a moderate way of correcting the scientific record, multiple studies on medical papers have been conducted on a more extreme form of correction—retractions—with similar results. For example, Budd et al. (1999) found that 235 retracted articles were cited on average nine times each, and 92% of these citations treated the original article as though it were valid research. More recent studies have found similar patterns. The most troubling was an analysis of 48 article pairs in medicine each comprising an original flawed article, and a corrected and republished article by the same authors, which found that it took at least eight years before citations of the flawed original were significantly lower than the corrected publication (Peterson 2010). These results are obtained from the medical community where the primary aggregator of information, MEDLINE, explicitly links corrections and retractions to the original article, unlike in ecology where no such mechanism exists to automatically link rebuttals to original papers. The overall story emerging from these studies is that flawed and retracted articles are cited at similar rates to unflawed and non‐contentious articles.

Our results indicate that rebuttal authors may to a large extent be wasting their breath. The ideas in the original articles we examined have achieved broad acceptance, and their proponents are undeterred by rebuttals. The implications of this finding for the journal publication system are obvious: If the goal is to effectively disseminate scientific truth, and reflect dissension, then some effort must be made to give rebuttals a greater voice.

One suggestion is online linking of all rebuttals and responses to the original article, so that scientists downloading the original article are alerted to the ongoing discussion about its validity. For example, the Canadian Journal of Fisheries and Aquatic Sciences provides links to corrigenda, rebuttals, and replies to rebuttals in the same line as the original article. The articles we examined all came from Science and Nature. Science article downloads do include a cover page with links to related articles, but related articles include a large number of links and do not always include rebuttals and responses, especially those published in other journals. Similarly, Nature typically includes four links with each article; clicking on “first paragraph” or “full text” leads to a separate page which includes a link to Brief Communications Arising, but clicking on “pdf” or “supplementary information” (which most people would automatically do), leads directly to the paper and supplementary materials, and does not link to the rebuttal. Journals need to present default options (or “nudges”) that increase the probability that the user will view contradictory papers. For example, at the same link level as the article download, there should be links pointing to all rebuttals and responses. Even better, rebuttals and responses published in the same journal could be appended to the original pdf and the user offered a choice to download “original article only” or “original, rebuttals and responses”.

Another suggestion would be to create a website listing original papers and any rebuttals to those papers, so that editors and reviewers could check citations to determine whether one or more rebuttals should be cited. Reviewers' comments, if contradictory to portions of a paper, could be published as notes along with the original paper. Ideally, of course, contentious papers would be weeded out during the review process itself, by seeking additional reviews when the claims in the paper are particularly startling or when one or more of the initial reviews highlight potential flaws.

The results of this study have implications not only for the correctness of science in general, but also important practical implications for fisheries policy. Research findings may be used directly by policy makers to justify particular decisions. High‐profile articles such as those discussed here receive wide public attention outside the biological research community; they form the basis for headlines and sound bites, and help to shape public opinion on issues such as marine conservation, and voters in turn influence the decisions of policy‐makers. Thus high‐profile research findings have a compounded impact, making it even more crucial that public policy is based on balanced science reflecting all viewpoints, and not just on the science as it is first reported. As a poignant example of this distortion, we point to an 11 July 2010 headline in the prestigious London newspaper, The Sunday Times, trumpeting “Fish stocks eaten to extinction by 2050” (Leake 2010), based on a highly contentious projection in Worm et al. (2006). Not only does the article get the year wrong (2048 not 2050) and fail to mention any of the 11 rebuttals that question this projection, but it misses the later consensus paper by the same author and many of his critics that reverses the earlier projection of collapse, and instead expects rebuilding to occur in 5 of 10 well-studied ecosystems (Worm et al. 2009).