Overview

In collaboration with nine UK press offices, we ran a randomised controlled trial in which the ‘participants’ were press releases (N = 312) distributed to international media outlets over a 20-month period from September 2016 to May 2018. To operationalise evidence strength, we concentrated on the basic distinction between correlational and experimental types of evidence, a keystone for assessing the ability to support causal conclusions [26].

The collaborating press offices sent their biomedical and health-related press releases to us just prior to release. We randomly allocated each press release to receive one, both or neither of two interventions. The first intervention was causal claim alignment. We made suggestions to align the headline and prominent claims with the evidence, such that direct causal claims were only made for experimental evidence, while correlational data carried cautious claims, using words such as might and may. The second intervention was a causality statement/caveat. We inserted an explicit statement about whether the evidence could support a causal conclusion (e.g. this was an observational study, which does not allow us to conclude that drinking wine caused the increased cancer risk).

The press office was then free to accept, edit or reject the proposals (sometimes in consultation with academics according to their normal procedures) and issued the release as normal. We searched for arising news (print, online and broadcast; total N = 2257), and its content was double-coded by two researchers blind to condition and press release content. The protocol was pre-registered (https://doi.org/10.1186/ISRCTN10492618, 20/08/2015) and approved by the Research Ethics Committee at the School of Psychology, Cardiff University. We do not name press offices to avoid identifying individuals. All data are available online at https://osf.io/apc6d/

Participants: press releases

The ‘participants’ in the trial were press releases. For inclusion criteria, see Fig. 1.

Fig. 1 CONSORT diagram for the press releases (participants) in the trial. Inclusion criteria: participating press offices were asked to send each press release based on peer-reviewed research that was relevant to human health, broadly defined (all biomedical, psychological or lifestyle topics), where the press office was leading the press release (rather than collaborating on a release by another office outside the trial) and the academic authors consented (we used opt-out consent). Our focus was on observational and experimental studies. Observational studies included cross-sectional and longitudinal designs as well as meta-analyses and systematic reviews based solely on observational research. Experimental research included randomised controlled trials, other experiments and meta-analyses or systematic reviews based solely on experimental designs. Press releases on studies that could not be classified as experimental or observational (e.g. simulations and mixed methods reviews) were excluded.

Sample size

We estimated we would achieve 300–500 press releases based on 100% coverage of eligible press releases from participating offices. In practice, some offices released fewer relevant press releases than expected and some eligible press releases were not sent to us for a variety of reasons (Fig. 1; 261 of 499 eligible press releases were sent; see reasons beyond the exclusion criteria of joint release and author consent). We therefore extended the trial duration and introduced a stopping rule of 75 press releases per bin (prior to exclusion of study designs not classifiable as experimental or correlational). Since we used pure randomisation, some bins were larger than others (Additional file 1: Table S2) and the total was 312 following study-design exclusion. Note that the power calculations in the protocol are only indications, since actual power depended on the clustering structure in the GEE analyses.

Randomisation and blinding

Randomisation was by independent random number generation for each press release received (and therefore allowed unequal cell sizes by chance) and occurred prior to any assessment of content (and therefore before exclusion of simulations and mixed-methods reviews which reduced some cells below 75; Table 1). We did not communicate the condition to the press office. There were three researchers coordinating the trial (RCA, AC and LB). For each batch of press releases, RCA or AC coordinated randomisation and interventions, while the other two would remain blind for double-coding the outcomes.
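The allocation step described above can be illustrated with a minimal sketch. It assumes equal allocation probability for the four cells of the 2 × 2 design; the condition labels, function names and seed are hypothetical and are not taken from the trial's own tooling.

```python
import random
from collections import Counter

# The four cells of the 2 x 2 design (labels are illustrative only).
CONDITIONS = ("alignment", "statement", "alignment+statement", "control")


def allocate(rng: random.Random) -> str:
    """Draw one condition independently of all earlier draws.

    Assumption: each cell is equally likely. Because every draw is
    independent, cell sizes are free to differ by chance.
    """
    return rng.choice(CONDITIONS)


def allocate_batch(n_releases: int, seed: int) -> list[str]:
    """Allocate each incoming press release before its content is assessed."""
    rng = random.Random(seed)
    return [allocate(rng) for _ in range(n_releases)]


allocations = allocate_batch(312, seed=2016)  # seed chosen arbitrarily
cell_sizes = Counter(allocations)
print(dict(cell_sizes))  # cell sizes generally differ from one another
```

Simple randomisation of this kind keeps each allocation unpredictable, at the cost of unequal cell sizes, which is the trade-off noted in the text.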

Table 1 Numbers of press releases in each intervention condition following all exclusions, and numbers of intervention suggestions made and adopted

Interventions

A. Causal claim alignment

The main causal claims in the headline and body of the press release were altered to align with the evidence underlying those claims. If claims were already aligned with the evidence, these were not modified. Based on previous results [27] showing which causal phrases readers distinguish or treat equivalently, all claims for observational evidence were modified to use hedged/cautious or associative language (may, could, might; e.g. ‘drinking wine may increase cancer risk’; associated, linked; e.g. ‘drinking wine is associated with increased cancer risk’) unless such language was already used. Claims for experimental evidence were modified to (or left as) direct causal statements (e.g. ‘drinking wine increases cancer risk’) or can cause statements (‘drinking wine can increase cancer risk’). In the registered protocol, we referred to alignment as accuracy (see Additional file 1: Figure S2).

B. Causality statement/caveat

Unless it already existed, a statement was inserted into the press release body to convey the design of the study and the strength of causal conclusions that could be justified from this design. For example, ‘this was an observational study, which does not allow us to conclude that drinking wine caused the increased cancer risk’ or ‘this study was a randomised controlled trial, which is one of the best ways for determining whether an intervention has a causal effect’ (in the registered protocol, we labelled this intervention study design statement; see Additional file 1: Figure S2). These statements were inserted at the earliest point where they fitted with the press release content. The majority were inserted into text, not into quotes, because feedback from press officers indicated that it was normally not pragmatic to get author approval for new quotes before release.

C. Causal claim alignment + causality statement

In this condition we suggested changes according to both A and B above, unless they were already present.

D. Control

The control condition was a suggested synonym change for a word that was not relevant to the main causal claims or study design (e.g. ‘beverage’ changed to ‘drink’).

Primary outcomes

1. News content

From each pre-intervention press release, a list of search terms was generated to search for print, online and broadcast news coverage from a pre-defined list of top-tier national and international news outlets (see Additional file 1: Figure S3). Searches were conducted using Nexis, Google and TV Eyes. News coverage was sourced for 1 week prior to the press release date (to cover date differences due to time zones and any breaches of embargo) and for 28 days following the release. Two researchers blind to condition and final press release content coded the news using a standard protocol abbreviated from Sumner et al. [25] to extract the content outcomes listed below. All discrepancies in coding were resolved so that the final concordance was 100%. See open data for the full coding sheet.
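The coverage window described above (1 week before the press release date to 28 days after) can be expressed as a simple date check. This is a sketch only: treating both bounds as inclusive is an assumption, and the function name and example dates are hypothetical.

```python
from datetime import date, timedelta

DAYS_BEFORE = 7   # 1 week before the press release date
DAYS_AFTER = 28   # 28 days following the release


def in_search_window(story_date: date, release_date: date) -> bool:
    """Return True if a news story falls inside the coverage window.

    Assumption: both ends of the window are inclusive.
    """
    start = release_date - timedelta(days=DAYS_BEFORE)
    end = release_date + timedelta(days=DAYS_AFTER)
    return start <= story_date <= end


release = date(2017, 3, 15)  # illustrative date only
print(in_search_window(date(2017, 3, 9), release))   # 6 days before: inside
print(in_search_window(date(2017, 4, 20), release))  # 36 days after: outside
```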

(a) Causal headline and claim alignment: We coded whether the news headline and news main claims were direct causal, can cause or hedged causal/associative. Alignment was defined relative to the study design of the peer-reviewed journal article. Following Adams et al. [27], we grouped direct cause and can cause together as strong claims appropriate for experimental evidence, and we refer to hedged cause/associative statements as cautious claims appropriate for correlational evidence. We coded and analysed headlines and main claims separately as they are normally written by different people (sub-editor and journalist); headlines are most prominent but the writers are one step further removed from the press release. We operationalised main claims as those made in the first two sentences beyond the headline (excluding context sentences not about the new study). We excluded news headlines or claims that were not causal/associative or made a claim of no cause (‘wine does not raise cancer risk’). We also excluded news claims that were about entirely different variables than the press release.

(b) Causality statement/caveat: We coded whether a statement relating study design to cause-and-effect was present in news stories. We did not require that the news used scientific terms such as correlation or randomised controlled trial, but rather that the news contained a relevant statement about the possibility or difficulty of causal inference. For correlational evidence this had to be a caveat (e.g. ‘we don’t know if wine is directly responsible for cancer risk’ or ‘we cannot draw conclusions about cause and effect’).
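The alignment rule in (a) can be summarised as a small lookup. The sketch below assumes that each claim has already been hand-coded into one of the categories named in the text; the string labels and function names are hypothetical and do not reproduce the coding sheet.

```python
from typing import Optional

STRONG = {"direct", "can_cause"}      # grouped as strong claims
CAUTIOUS = {"hedged", "associative"}  # grouped as cautious claims


def claim_strength(claim_type: str) -> Optional[str]:
    """Map a coded claim type to 'strong' or 'cautious'.

    Returns None for claims excluded from analysis (e.g. no causal or
    associative claim, or a claim of no cause).
    """
    if claim_type in STRONG:
        return "strong"
    if claim_type in CAUTIOUS:
        return "cautious"
    return None


def is_aligned(claim_type: str, design: str) -> Optional[bool]:
    """Strong claims align with experimental evidence, cautious claims
    with observational (correlational) evidence."""
    strength = claim_strength(claim_type)
    if strength is None:
        return None  # excluded claim: alignment is undefined
    if design == "experimental":
        return strength == "strong"
    if design == "observational":
        return strength == "cautious"
    raise ValueError(f"unknown study design: {design!r}")


print(is_aligned("can_cause", "experimental"))  # True
print(is_aligned("direct", "observational"))    # False
print(is_aligned("no_cause", "observational"))  # None (excluded)
```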

2. News uptake

News uptake is the proportion of press releases that attract news. Following Sumner et al. [20, 25], we simply scored news as present or absent, rather than discriminating between types of news and the differing media targets that some press releases may have. We also counted the number of news stories (though this is an imperfect measure due to non-independence where some stories are copied across outlets; we present the results in Additional file 1: Figure S4). Although for news content we separated headlines and main claims, the outcome measure of uptake does not separate them. Therefore, we operationalised aligned press releases as follows: press releases for observational studies were aligned only if both headline and main claim used cautious language (and conversely, press releases for experimental studies were aligned if either the headline or the main claim used direct or can cause phrases).
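The press-release-level rule and the uptake measure can be sketched as follows. The inputs are assumed to be claim strengths already coded as 'strong' or 'cautious'; the function names and the toy counts are hypothetical.

```python
def release_is_aligned(headline: str, main_claim: str, design: str) -> bool:
    """Press-release-level alignment used for the uptake outcome.

    Observational: aligned only if BOTH headline and main claim are cautious.
    Experimental: aligned if EITHER headline or main claim is strong.
    """
    if design == "observational":
        return headline == "cautious" and main_claim == "cautious"
    if design == "experimental":
        return headline == "strong" or main_claim == "strong"
    raise ValueError(f"unknown study design: {design!r}")


def uptake_proportion(news_counts: list[int]) -> float:
    """Proportion of press releases with at least one news story."""
    return sum(1 for n in news_counts if n > 0) / len(news_counts)


print(release_is_aligned("cautious", "strong", "observational"))  # False
print(release_is_aligned("cautious", "strong", "experimental"))   # True
print(uptake_proportion([0, 3, 1, 0]))  # 0.5 (toy counts, not trial data)
```

The asymmetry is deliberate in the rule as stated: a single strong claim is enough to misalign an observational release, whereas a single strong claim is enough to align an experimental one.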

Secondary outcomes

We also coded whether news contained exaggerated advice or exaggerated inference from non-human research. These outcomes do not correspond to our main interest here, but were included for comparison with previous research [20, 25]. Analysis and results are in Additional file 1: Figure S5.

Feasibility and acceptability

As a pre-requisite for interpreting the main news outcomes, and to assess whether alignment, caution and caveats are generally feasible and acceptable to integrate in press releases, we assessed: the number of pre-intervention press releases that already contained them spontaneously; the number of suggestions made, accepted (including those edited while maintaining the distinction between cautious and strong) or rejected; and hence the number of our intended interventions present in the released versions of the press releases in each condition. Note that for our purposes, spontaneous presence of appropriately cautious claims or caveats is more valuable than acceptance of our interventions, since intervention is not a feature of the normal press release process. For this reason, we also assessed change between the trial and a baseline period of 2 years prior to the trial. To do this, we randomly sampled up to 20 press releases for each collaborating centre from 2014 and 2015 (10 from each year, or all eligible press releases from a centre if fewer than 10 were available), using the same eligibility criteria (except consent, as these press releases are in the public domain). We double-coded them in the same way as the press releases in the trial.
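The baseline sampling rule can be sketched for a single centre as below. It assumes that the cap of 10 applies per year, so that a year with fewer than 10 eligible press releases contributes all of them; the identifiers, function name and seed are hypothetical.

```python
import random

PER_YEAR = 10  # up to 10 press releases per baseline year


def sample_baseline(eligible_by_year: dict[int, list[str]], seed: int) -> list[str]:
    """Randomly sample up to PER_YEAR eligible press releases from each year.

    Assumption: if a year has fewer than PER_YEAR eligible releases,
    all of them are taken.
    """
    rng = random.Random(seed)
    sampled: list[str] = []
    for year in sorted(eligible_by_year):
        pool = eligible_by_year[year]
        k = min(PER_YEAR, len(pool))
        sampled.extend(rng.sample(pool, k))  # sampling without replacement
    return sampled


# Toy centre: 25 eligible releases in 2014, only 6 in 2015.
centre = {
    2014: [f"2014-{i:02d}" for i in range(25)],
    2015: [f"2015-{i:02d}" for i in range(6)],
}
baseline = sample_baseline(centre, seed=0)
print(len(baseline))  # 16: ten from 2014 plus all six from 2015
```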

Analysis and statistical methods

We focus the analysis on the main effects of causal claim alignment and causality statements/caveats separately, as recommended by [28], because the 2 × 2 design was not powered for the interaction (we report interactions as secondary analyses [28]). Causal phrasing could be coded and analysed where the headline or main statement made a causal or associative claim (excluding those that made no claim about a health outcome, or made a claim of no cause, e.g. wine does not cause…). Presence or absence of causality statements/caveats could be assessed for all. For causal claim alignment, we also separated news headlines and main claims, as explained above.

For the primary outcome measures of news content and uptake, we used both intention-to-treat (ITT) and as-treated (AT) analytic approaches. ITT analysis maintained the randomisation, comparing news content and uptake in conditions that attempted to make interventions against those that did not regardless of whether a suggestion was possible or accepted, and what the final press releases actually contained. AT analysis, on the other hand, depended on the content of the finally released press releases. This corresponds directly to what the journalists actually saw, but it disregards the randomisation and is therefore an associative analysis subject to selection bias, for which causal inference is not directly possible. However, it becomes useful when there are high levels of treatment mixing within groups due to spontaneous presence in the control group or non-acceptance in the intervention group—both of which we anticipated here and which can render ITT difficult to interpret (and would also severely reduce N for a per-protocol analysis, which we did not perform).
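The difference between the two approaches is only in how press releases are grouped, which the following sketch makes concrete. The records are invented toy data with one news story per press release, the field and function names are hypothetical, and the raw proportions ignore the clustering that the actual analysis handled with GEE.

```python
from dataclasses import dataclass


@dataclass
class Release:
    assigned_intervention: bool  # randomised allocation (used by ITT)
    final_has_feature: bool      # content of the released version (used by AT)
    news_has_feature: bool       # outcome in the resulting news story


def outcome_rate(releases: list[Release], approach: str) -> dict[bool, float]:
    """Proportion of news with the feature, grouped by allocation ('itt')
    or by what the final press release contained ('at')."""
    if approach not in ("itt", "at"):
        raise ValueError(f"unknown approach: {approach!r}")
    key = "assigned_intervention" if approach == "itt" else "final_has_feature"
    groups: dict[bool, list[bool]] = {True: [], False: []}
    for r in releases:
        groups[getattr(r, key)].append(r.news_has_feature)
    return {g: sum(v) / len(v) for g, v in groups.items() if v}


toy = [
    Release(True, True, True),     # suggestion accepted
    Release(True, False, False),   # suggestion rejected
    Release(False, True, True),    # feature spontaneously present in control
    Release(False, False, False),  # control without the feature
]
print(outcome_rate(toy, "itt"))  # {True: 0.5, False: 0.5}
print(outcome_rate(toy, "at"))   # {True: 1.0, False: 0.0}
```

In this toy case, treatment mixing (one rejection, one spontaneous occurrence) erases the ITT contrast entirely while the AT contrast remains, which is the situation in which AT becomes informative despite losing the protection of randomisation.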

To account for the clustering of news to press releases or press releases to press office, we used generalised estimating equations (GEE, using a binary logistic model with exchangeable correlation matrix) as in our previous work [20, 25]. Since our intervention suggestions depended on study design (observational vs experimental), we also tested interactions with study design (data plotted in Additional file 1: Figure S6).
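For orientation, a binary logistic GEE with an exchangeable working correlation can be written in the following general form. The notation is an assumption introduced here, not the paper's own: $Y_{ij}$ is the binary outcome for news story $j$ arising from press release $i$ (the cluster), $T_i$ indicates the intervention (ITT) or the content of the final release (AT), and $\alpha$ is the common within-cluster correlation.

```latex
\begin{align}
\operatorname{logit}\,\Pr(Y_{ij} = 1) &= \beta_0 + \beta_1 T_i, \\
\operatorname{Corr}(Y_{ij}, Y_{ik}) &= \alpha \quad (j \neq k).
\end{align}
```

Under this form, $\exp(\beta_1)$ is the population-averaged odds ratio for the intervention; a study design term and its interaction with $T_i$ would be added to test the interaction mentioned above.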

To assess feasibility, we estimated usage rates of caution and caveats in both pre-intervention and final press releases and compared them to the baseline period, using GEE as above to compensate for the clustering of press releases to press office.