Measuring the impact of the media
The active participation of the people is one of the central components of a functioning democracy. King et al. performed a real-world randomized experiment in the United States to understand the causal effect of news stories on increasing public discussion of a specific topic (see the Policy Forum by Gentzkow). Social media posts increased by almost 20% on the first day after the publication of news stories on a wide range of topics. Furthermore, the posts were relatively evenly distributed across political affiliation, gender, and region of the United States. Science, this issue p. 776; see also p. 726

Abstract
We demonstrate that exposure to the news media causes Americans to take public stands on specific issues, join national policy conversations, and express themselves publicly—all key components of democratic politics—more often than they would otherwise. After recruiting 48 mostly small media outlets, we chose groups of these outlets to write and publish articles on subjects we approved, on dates we randomly assigned. We estimated the causal effect on proximal measures, such as website pageviews and Twitter discussion of the articles’ specific subjects, and distal ones, such as national Twitter conversation in broad policy areas. Our intervention increased discussion in each broad policy area by ~62.7% (relative to a day’s volume), accounting for 13,166 additional posts over the treatment week, with similar effects across population subgroups.

The fields of political communication in general and media effects in particular are broad, deep, methodologically sophisticated, and central to social science. They have covered persuasion (1), agenda setting (2, 3), attitude formation (4), diffusion, gatekeeping (5), priming (6), issue framing (7), and numerous other topics, and are built on a wide range of intellectual traditions [(8), p. 174].

We focus here on an aspect of political communication with special relevance to the study of representative democracy: how the news media activate public expression, causing citizens to discuss major issues of policy and politics as part of the ongoing, collective “national conversation.” A well-functioning democracy, larger than the sum of individual attitudes and behaviors, requires public discussion and engagement among citizens on major issues of the day (9–11). Indeed, “political participation is not merely about trying to influence policy but also about trying to induce others to participate and give voice” (12). Although governments may easily dismiss any individual’s opinion, collective public expression has a powerful impact on the behavior of government officials and the public policies they promulgate. The power of collective expression is a central feature of both representative democracy—where “the more the people are aware of each other’s opinions, the stronger the incentive for those who govern to take those opinions into account” (13)—and autocracy (14, 15). Citizens may join this national conversation to deliberate (16), or simply “to give testimony” in the presence of others (17).

We thus study the effects of the media on the classical notion of expressed public opinion, a concept predating modern survey research, and with a focus not on changes in individual behavior or attitudes but instead on the content of the national conversation (18, 19). In the past, this discussion could only be measured by collecting “water-cooler events” (20), listening to hallway and dinner conversations, reading newspaper editorials and political leaflets, or listening to soapbox speeches from public squares. Today, we can take advantage of the fact that much of the conversation has moved to, and is recorded in, the 750 million social media posts that appear publicly on the web every day.

Unfortunately, estimating the effect of the news media is extremely challenging [(21), p. 267]. Scholarship dating back more than a century has had to contend with severe endogeneity because media outlets are businesses competing for readers, catering to their interests. Large-scale randomization of news content is normally impossible because of high costs, logistical infeasibility, and even some ongoing miscommunication between the journalistic and scientific communities regarding the norms of the former and the goals of the latter. Even if randomization is possible, avoiding spillover effects is difficult because any media intervention can affect all potential research subjects in the nation simultaneously. The result is often “profound” biases in estimated effects, with a greater than 600% difference from the truth (22, 23) given common levels of endogeneity, measurement error, and self-selection [see also (24)]. These biases have been addressed in some of social science’s most creative observational studies, although these approaches are well suited to answering certain questions (such as those for which instruments are available) but not others [e.g., (25–32)]. The biases are also addressed via elegant experiments and quasi-experiments, often made possible by studying different quantities of interest, such as individual-level effects or occasionally the effects on aspects of the national conversation (26, 33–42).

We attempted to tackle these methodological issues directly by enlisting a large number of news media outlets that allowed us to run an unusual set of experiments. We developed and implemented an “incentive-compatible” research design that enables both full randomized experimental control in the hands of the researchers, so we could accomplish our scientific goals, and full editorial control in the hands of the journalists, fitting into their familiar customs and practices, so they could participate. Forty-eight mostly small news media outlets participated in our research [The Progressive was near the median size (43)]. Seventeen of these outlets were part of our preliminary trial run experiments, provided information, and were helpful in other ways, and 33 were part of the experimental protocol we now describe (of which two participated in both stages on unrelated stories). In addition, more than a dozen others provided information, advice, or proprietary data but were not part of our randomized interventions.

Our work was aided by journalists’ natural interest in understanding the impact of their work. However, they are also competitors, trying to scoop each other. The difficulty is compounded by the fact that we asked these professionals to take actions few journalists have ever before agreed to, to allow researchers to participate in ways that rarely happen, and to share proprietary information with us that they do not even share with each other. We also needed to secure numerous individual agreements and arrange large-scale coordination among competing entities over nearly 5 years. As such, much of our effort involved building relationships, trust, and common understanding. We designed our experimental protocol to advance both our scientific goals and the journalists’ professional goals as fully as possible.

An industry association (The Media Consortium, representing some of our outlets) helped us coordinate with the outlets and received funding to offer small financial incentives to some outlets, following their usual funding procedures. Our research team also received some direct funding from the same source. To protect the journalistic integrity of the numerous professionals who participated in our experiments, and the reputation of their publications, we do not reveal the specific articles in our experiment, which outlet published each article, or any potentially identifiable individual-level aspects of the data we collected. We retained full rights to scholarly publication, without any required review or preapproval. To maintain a high level of realism, we tried to ensure that media outlets followed their standard operating procedures, embedding our treatment within their ordinary routines. The resulting protocol made our design more expensive, logistically complicated, and time-consuming, but it should be more generalizable and compatible with the goals and norms of both the journalistic and scientific communities.

We ran a series of experiments, each ultimately constituting a single observation. Our treatment protocol for each had five parts. First, we chose a broad policy area from a set of 11 areas of both major national importance and sufficient interest to our news media outlets: race, immigration, jobs, abortion, climate, food policy, water, education policy, refugees, domestic energy production, and reproductive rights (43). Combining all 11 policy areas together, rather than using only one, greatly expands the representativeness of our study at the potential cost of needing more observations.

Second, we chose a set of news media outlets and induced content correlations across them in ways that mirror common practices. Sometimes referred to as “pack journalism,” these practices include writing stories on the same subjects, “piling on” immediately after a story is broken by one outlet, occasionally collaborating, and sometimes even co-authoring stories. Although this behavior is sometimes criticized, professional journalists follow these venerable practices to help get stories out and ensure that they reach a wide variety of differentiated audiences. We simulated the effects of pack journalism by following a procedure occasionally used by outlets to collaborate before publication, under negotiated ground rules. Under a “project manager” design, a group of outlets agrees to collaborate on a specific story for a limited time. Participating outlets share information and publish simultaneously, often with the assistance of the outlet hosting the project manager. They may offer staff, information, visualizations, or promotional materials. These fiercely independent sites even agree to effectively delegate aspects of editorial control to the project manager because, in addition to increasing their collective impact, each site retains the ability to opt out if necessary. This mechanism allows the project manager full editorial control over what is included in the collaboration, but gives individual outlets full control over what is published. A prominent recent example is the Pulitzer Prize–winning “Panama Papers” investigation (see bit.ly/kppapers and j.mp/ppapers). Playing the role of a project manager, without being based at one of the outlets, had the added advantage of making it easier for the outlets to share information with us that they would not normally share with each other.

We thus intervened for each experiment with what we refer to as a “pack” of two to five outlets (with a mean of 3.1 across all our experiments) rather than one. To ensure that outlets had experience in a chosen policy area, as well as sufficient enthusiasm for the subject matter and their collaborators, we allowed outlets to volunteer to join a pack for each experiment. We then asked them to collaborate as they would normally under this familiar structure. We retained approval rights to the collaboration to satisfy our scientific goals, and journalists and editors retained the right to opt out (before randomization) to satisfy journalistic standards; good communication kept either from exercising these rights in practice.

Third, while we controlled the collaboration as the project manager usually does, we gave the journalists the discretion they normally have. To do this, we asked the pack to select a specific subject for articles its members wrote within our chosen policy area (planning for each outlet in a pack to write one article). For example, if the broad area was technology policy, the specific subject of the articles might be what Uber drivers think about allowing driverless cars, or how a new trade agreement affects hiring at local technology firms in Philadelphia. The articles could be large-scale investigations, interview-based journalism, opinion pieces, or any other variety normally published by pack members. The journalists and their outlets naturally sought to publish newsworthy articles as well as subjects that would remain of public interest whenever our random assignment mechanism (43) determined they would run. This ruled out stories based on breaking news. We retained the right to reject a subject if the pack’s choice was outside our policy area, or to reject any individual article by an outlet in a pack; the outlets retained the right to publish whatever they wished outside of our experiment. As above, good communication kept each to a minimum.

Fourth, we implemented a matched-pair randomized experimental design, which has considerably more statistical power, robustness, and efficiency than classical randomized designs (43, 44). Our unit of treatment was the entire nation during an experiment-week, with the treatment being a set of articles published by a pack of outlets on the publication day (usually Tuesday) of a week of our choice; this enabled us to avoid spillover effects or model-dependent inferences. We selected a pair of consecutive weeks matched for similarity of predicted news content (43). Then we randomly assigned one week to be the “treatment” week, during which pack members ran their stories, and the other to be the “control” week, when they were asked to behave as usual (43).
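The assignment step above can be sketched in a few lines. This is a minimal illustration, not the study's actual code; the week identifiers and the `assign_matched_pair` helper are hypothetical, and the real pairing of weeks on predicted news content is detailed in the supplementary materials:

```python
import random

def assign_matched_pair(week_a, week_b, rng):
    """Randomly assign one week of a matched, consecutive pair to
    treatment and the other to control.
    Returns (treatment_week, control_week)."""
    if rng.random() < 0.5:
        return week_a, week_b
    return week_b, week_a

# Hypothetical week identifiers; each experiment contributes one
# matched pair, i.e., two observations (one treatment, one control week).
pairs = [("2015-W10", "2015-W11"), ("2015-W14", "2015-W15")]
rng = random.Random(42)  # seeded for reproducibility of the sketch
assignments = [assign_matched_pair(a, b, rng) for a, b in pairs]
for treat, ctrl in assignments:
    print(f"treatment: {treat}  control: {ctrl}")
```

Because each pair is randomized independently, noncompliance or news shocks in one experiment-week cannot bias the assignment in another.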

Each outlet then distributed its content as it usually would, via its website, print media, video reports, audio podcasts, etc. As is typical of all modern news media, each outlet also promoted its content with advertising via social media, Google AdWords, email lists, and search engine optimization techniques, among others; outlets also often co-promoted with others in the same pack. As far as we could tell, they followed the same practices for articles in our experiment as those they ran ordinarily. We also went to great lengths to ensure that the policy areas, subjects, and articles we chose appeared indistinguishable from the normal type and flow of articles appearing in these outlets in the course of their ordinary business practices [this turned out to be the case (43)]. To the best of our knowledge, no outlet received any reader communications about an article or practice that seemed unusual or out of place.

Finally, we avoided intervening in any one outlet so often as to get in the way of its normal practices, change the character of the publication, or be discovered by readers. This is why we needed to organize a large pool of outlets from which we could choose different packs for each observation, rather than using only one small pack of two to five outlets repeatedly. This procedure adds causal heterogeneity and thus requires a larger n overall, but should generate a more representative causal effect.

Because the cost of collecting each observation in our design corresponds to an entire experiment in most designs, we followed two additional procedures to reduce costs: (i) We ensured that we collected only as much data as necessary by inverting the usual approach to statistical inference via sequential hypothesis testing, including a nonparametric sequential technique specially developed for this research (43); and (ii) we evaluated multiple observable implications of our intervention, rather than only one. Thus, Fig. 1 portrays points we could measure on the causal pathway from the treatment intervention (far left) to our ultimate outcome variable of interest (far right). The first link is the causal effect of the treatment intervention on the number of articles published. If we found that instructing sites to publish articles in a given week had no effect, we would know to be skeptical of an intent-to-treat effect on social media posts. This is not a deterministic step, because unexpected events can cause media outlets to publish on a chosen subject more than expected in either of the two weeks. Media outlets, as ongoing competitive businesses, may sometimes be forced to respond to unexpected events in ways that violate an experimental protocol. Fortunately, the randomized assignment in our design prevents such “noncompliance” from inducing bias in the intent-to-treat effect, although it could introduce heterogeneity and smaller effects overall, both of which would lead us to need a larger n given a chosen level of uncertainty.
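The general logic of sequential testing mentioned in (i) can be illustrated loosely as follows. This is not the purpose-built nonparametric technique from the supplementary materials; it is a toy sequential sign test with a crude, Bonferroni-style stopping boundary, included only to show how data collection can stop as soon as the accumulated matched-pair evidence crosses a prespecified threshold:

```python
from math import comb

def binom_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): one-sided sign-test p-value."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def sequential_sign_test(pair_diffs, alpha=0.05, max_pairs=50):
    """Stop collecting matched pairs once a sign test rejects.

    pair_diffs: (treatment - control) outcome differences, one per
    matched pair, in the order collected.
    """
    wins, n = 0, 0
    for diff in pair_diffs:
        n += 1
        wins += diff > 0
        # Crude boundary splitting alpha across the possible looks
        # (a placeholder for the study's nonparametric sequential rule).
        if binom_tail(wins, n) < alpha / max_pairs:
            return "stop", n
    return "continue", n

# Ten consecutive positive differences are enough to cross the boundary.
print(sequential_sign_test([3.0, 1.2, 0.8, 2.5, 4.1, 0.3, 1.9, 2.2, 5.0, 1.1]))
```

The point of any such rule is the one made in the text: each observation is an entire experiment, so stopping as early as the evidence permits directly reduces cost.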

Fig. 1 The causal path from randomized treatment (first point) to public expression on broad policy areas (last point).

The next arrow in this causal pathway connects articles published to numbers of website pageviews for the articles we commissioned and any others in the same policy area. The second arrow in Fig. 1 then refers to a causal effect, which is positive only if more people visit pages with articles in the policy area during treatment weeks than during control weeks. In our design, the only plausible way for either our treatment or the publication of news articles by media outlets to have an effect on either measure of public expression of opinion is through at least some people reading the articles, usually on the outlets’ web pages. We portray this in the figure by the absence of paths, other than through outlet website pageviews, from the randomized treatment or published articles to expression in broad policy areas in social media. However, pageviews can cause social media participants to express themselves publicly on broad national policy issues either as a direct result (curved arrow in Fig. 1) or as a result of reading social media posts written narrowly about the subject of the published articles (arrows to and from “posts on subject”).

One benefit of our years of negotiations turned out to be high experimental compliance, with 3.1 media outlets in each pack and 2.94 additional articles published as a result of our interventions, which took place between October 2014 and March 2016. Our sequential analysis stopping rules resulted in 35 experiments and thus n = 70 observations. We discuss detailed uncertainty analyses in (43), all on the scale of false positive rates. Here, we present causal estimates on the scale of our outcome variables and quantities of interest for two sets of results, each using model-based and model-free estimation.

Figure 2 reports estimates of the main quantity of interest in our experiment: the average causal effect of a pack of journalists publishing articles, at a time we randomly determine, on the extent to which Americans express themselves publicly in the national conversation on social media within a broad policy area of our choice. We estimated the causal effect for each day following the intervention, in terms of the percentage change in Twitter posts (Fig. 2A) and the corresponding absolute numbers of posts (Fig. 2B), along with the total effect (shown at the right of the horizontal axis). We do this with our model-based estimator (red dots; solid square for total) and with our model-free estimator (open circles; open square for total).

Fig. 2 Causal effect of the news media on public expression. (A and B) Effects are shown in terms of percent change (A) and absolute change (B) in numbers of social media posts in a broad policy area. Effects appear as the percent change in social media posts for each day of the week—estimated by our model-based estimator (solid red dots) and our model-free estimator (open circles)—and the total overall (solid and open squares, respectively).

The figure shows that our experimental treatment causes the number of social media posts appearing in a broad national policy area discussion to increase by 19.4% on the first day after publication, according to our model-based estimator (Fig. 2A, leftmost red dot). From the red dot in the same position in Fig. 2B, we can see that social media users wrote and published on average 4442 additional posts solely as a result of our intervention. Moreover, the same articles continued to have effects over the rest of the week. On average, these effects declined with distance from publication day, with approximately zero effect on average by day 6 [consistent with (45, 46); see also (47)]. The total effect (Fig. 2, A and B, solid square at top right) indicates that our experimental intervention overall caused a 62.7% increase in social media posts over the week relative to the average day’s volume (or 10.4% relative to the entire week), which on average in a policy area accounts for Americans writing a total of 13,166 additional social media posts solely because of our intervention. The estimates from our model-free approach (Fig. 2, A and B, open circles and open square) offer the advantage of avoiding modeling assumptions but have the resulting disadvantage of higher variance. Yet they clearly convey the same overall pattern in causal effects. [We present detailed uncertainty estimates in (43).] In addition, given the reasonable hypothesis that the causal effect varies smoothly over days of the week, the degree to which the model-free estimates (the circles) vary around the model-based results (the line) provides another estimate of the uncertainty of our primary causal effects. As can be seen from this perspective, these estimates have relatively low levels of spread (or uncertainty) around them and are clearly above zero.
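The per-day quantities behind Fig. 2 reduce to simple matched-pair arithmetic: for each day after publication, compare the treatment week's post count with the same day of the control week. The sketch below uses invented counts purely for illustration (the real model-free estimator averages over all 35 matched pairs, as detailed in the supplementary materials):

```python
def daily_effects(treatment_counts, control_counts):
    """Model-free matched-pair estimate for one pair: per-day difference
    in post counts, as absolute posts and as percent of the control
    day's volume.

    treatment_counts, control_counts: daily post counts for the
    treatment and control weeks (index 0 = publication day).
    """
    effects = []
    for t, c in zip(treatment_counts, control_counts):
        effects.append({"absolute": t - c, "percent": 100.0 * (t - c) / c})
    return effects

# Illustrative counts only, not the study's data.
treat = [27000, 24500, 23000, 22500, 22200, 22050, 22000]
ctrl = [22600, 22400, 22300, 22100, 22150, 22000, 21950]
for day, e in enumerate(daily_effects(treat, ctrl)):
    print(f"day {day}: {e['absolute']:+d} posts ({e['percent']:+.1f}%)")
```

Summing the absolute daily differences and dividing by an average day's control volume yields a weekly total on the same percent scale as the reported 62.7% figure.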

In Fig. 3, we estimate the effect of our intervention on different subgroups expressing themselves in a broad policy area. The subgroups we were able to define include political party (Democrats, Republicans, unknown), gender (men, women), region (Northeast, Midwest, West, South), and degree of influence on Twitter (high and low). [The party, gender, and region of social media posts are based on Twitter metadata, supplemented by analyses of Twitter bios and follower structures; influence is based on numbers of followers and the social graph (43).] As a reference, we add to each graph a red line representing all posts (taken from Fig. 2), but we omit the model-free estimates for graphical clarity. The interesting result from this analysis is the lack of a result: The difference between any pair of subgroups within a panel is always small (and never statistically distinguishable from zero). Apparently, the national conversation really is one conversation, at least among those able to participate in social media; even if they do not interact with each other, the evidence indicates that they are being influenced in similar ways by the news media.

Fig. 3 Causal effect of the news media on the percent change in social media posts by political party, gender, region, and influence on Twitter. Axes are defined as in Fig. 2A.

The outcome variable in Fig. 2 is based on the total number of posts in a broad policy area, and is designed to measure the national conversation and how it is affected by our randomized treatment. We present another observable implication of media effects in Fig. 4, counting only the daily number of unique authors of posts rather than the total number of posts. This figure demonstrates that more Americans were engaged by the articles in this policy area (rather than the same people posting more). The causal effect of our intervention on the first day was an increase of 23.9% in the number of unique authors (accounting for 3287 more individuals) participating in the national conversation in a broad policy area; this effect did not drop to zero until the fifth day. This result also makes it less likely that bots account for our findings (43, 48).
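The difference between the outcome in Fig. 2 (total posts) and the outcome in Fig. 4 (unique authors) amounts to a de-duplication step per day. A toy sketch, with hypothetical author identifiers standing in for Twitter metadata:

```python
def daily_unique_authors(posts):
    """posts: list of (day, author_id) tuples for one week.
    Returns {day: number of distinct authors posting that day}."""
    by_day = {}
    for day, author in posts:
        by_day.setdefault(day, set()).add(author)
    return {day: len(authors) for day, authors in sorted(by_day.items())}

# Author "a" posts twice on day 0, so day 0 has 3 posts but 2 authors.
posts = [(0, "a"), (0, "b"), (0, "a"), (1, "b"), (1, "c")]
print(daily_unique_authors(posts))  # {0: 2, 1: 2}
```

Treatment-minus-control differences in these author counts, rather than post counts, give the quantities plotted in Fig. 4.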

Fig. 4 Causal effect of randomized treatment on the number of unique authors expressing themselves in the same policy area as the intervention. Effects are shown in terms of percent change (left) and absolute numbers of authors (right) for each day (red dot) and total overall (black square).

Our news media intervention also changed the composition of opinion expressed in the national conversation by 2.3 percentage points in the ideological direction conveyed by our published articles; individuals may or may not have been persuaded to change their views, but the overall testimony given publicly changed noticeably (43). Effects on other observable implications, including effects on website pageviews and on discussion of the specific subject of the articles, are described in (43). Overall, our experiments revealed large news media effects on the content of the national conversation across 11 important areas of public policy, political party, gender, region, and level of social influence. Positive media effects have long been suspected in the literature, but the large size of these effects approaches even some of the long-standing speculations (and accusations) of media critics.

We place these effect sizes in context and then discuss their implications. First, the subjects of the articles in our treatments are limited to those that journalists are willing to write about, and their outlets are willing to publish, at randomly determined times, days or weeks after they were conceived. Additionally, searching for weeks to constitute good matched pairs, in the service of reducing bias and inefficiency, typically led us to select news periods predicted to be relatively “quiet” [predictions that turned out to be relatively accurate (43)]. The media effects during other weeks, such as when outlets publish stories to ride a viral social media wave or to satisfy the intense interest of a major breaking story, may of course have effect sizes different from those we reported. The effect sizes and baseline volumes for our study are small relative to huge entertainment events (e.g., they are about one-hundredth the size of the Twitter frenzy generated by a new episode of the television series Scandal; j.mp/SCandal). Still, they represent important and substantial increases in national policy discussions on important issues, and they indicate that the media are causing many more people to express themselves publicly (and more frequently) on such issues than would otherwise be the case.

The intervention we studied was the result of only two to five small- to medium-sized outlets. To glean what our effects might have been if we had recruited larger outlets, we conducted informal observational analyses where randomization or a large n was infeasible. We searched for unanticipated New York Times stories on topics where Times reporters scooped other outlets or reported on surprise events during periods with few other stories. For example, we found a news story about a previously embargoed scholarly article about fracking affecting drinking water, at a time when little else in the policy area was being discussed (j.mp/frackH2O). We observed a 1-day spike in discussion in the broad policy area of water quality and related issues of more than 300% (versus a 19% effect size in our study). Numerous public policy issues have far higher visibility than fracking, many with far more impactful “interventions.” Although further research is needed to confirm this large effect, it appears that some published articles may have a multiple of the already large effect size we found.

Our results should remind us of the importance of the ongoing and interconnected national conversation Americans have around major issues of public policy. This conversation is a fundamental characteristic of modern large-scale government, the content of which has important implications for the behavior of officeholders and public policies. We also find—among those who participate in social media—that the effects of the news media are approximately the same across citizens of different political parties, genders, regions, and influence in social media, further supporting the idea that the conversation is truly national. Given the tremendous power of media outlets to set the agenda for public discussion, the ideological and policy perspectives of those who own media outlets have considerable importance for the nature of American democracy and public policy. The ideological balance across the news media ecosystem, among the owners of media outlets, needs considerable attention as well (49). The ability of the media to powerfully influence our national conversation also suggests profound implications for future research on “fake news” potentially having similar effect sizes (50) or “filter bubbles” potentially reducing or directing these effects (51).

Social scientists have long been interested in estimating the impact of the news media on how Americans participate in the national conversation on important public policy issues, but other important issues, such as media effects on individual attitude formation and persuasion, also need to be subjected to randomized experiments. Similarly, further research is needed in areas beyond the 11 policy areas we studied. Studies should also be conducted with outcome variables beyond our specific measures, beyond social media, and with media outlets with different ideological perspectives. Finally, although we have been able to estimate the causal effect of some of the news media, we have not measured how often actual media outlets make efforts to move different populations of Americans to express themselves about specific policy areas. We encourage future researchers to take up these measurement challenges and the numerous other topics that may shed light on the formation, development, and changes in the effect of the media on citizen engagement in the national conversation.

Supplementary Materials
www.sciencemag.org/content/358/6364/776/suppl/DC1
Materials and Methods
Figs. S1 to S7
Table S1
References (53–77)