A study released on February 6th 2018 by the University of Oxford as part of the Computational Propaganda Research Project was widely used as the source for news articles with headlines like these:

Many of the claims in these headlines could very well be true (we're not making any judgement about that here), but they are not actually reported or proven by the study in question. The full study, titled "Polarization, Partisanship and Junk News Consumption over Social Media in the US", can be downloaded here as a PDF file, and you can find the abstract here.

The abstract reads:

What kinds of social media users read junk news? We examine the distribution of the most significant sources of junk news in the three months before President Donald Trump's first State of the Union Address. Drawing on a list of sources that consistently publish political news and information that is extremist, sensationalist, conspiratorial, masked commentary, fake news and other forms of junk news, we find that the distribution of such content is unevenly spread across the ideological spectrum. We demonstrate that (1) on Twitter, a network of Trump supporters shares the widest range of known junk news sources and circulates more junk news than all the other groups put together; (2) on Facebook, extreme hard right pages--distinct from Republican pages--share the widest range of known junk news sources and circulate more junk news than all the other audiences put together; (3) on average, the audiences for junk news on Twitter share a wider range of known junk news sources than audiences on Facebook's public pages.

It is immediately clear that neither the abstract nor the full study mentions "belief" or "believing" at all. So right from the start the Newsweek headline is making claims about the study that are not supported by the facts. All the study does is look at who is sharing and posting certain types of content.

The study also consistently speaks about "junk news" throughout, not "fake news". So all the headlines making generalizations about "fake news" based on the study are definitely misleading their readers.

A closer look at the study shows it uses a very broad definition and even some circular reasoning when defining exactly what a source of junk news is:

Sources of junk news deliberately publish misleading, deceptive or incorrect information purporting to be real news about politics, economics or culture. This content includes various forms of extremist, sensationalist, conspiratorial, masked commentary, fake news and other forms of junk news.

So "sources of junk news" also publish ... "other forms of junk news"? And "fake news" is included as a part of the "junk news" here (without the study specifying how big of a share it makes up or if it is possible for something to be "fake news" without also being "junk news" or vice versa). So all the headlines broadly drawing conclusions about "fake news" based on a study explicitly done on "junk news" seem to be making an unsafe generalization here.

Which is kind of ironic, given that the study goes on to explain the five factors considered when labeling a source as "junk news", and one of those factors is the use of "unsafe generalizations":

For a source to be labeled as junk news it must fall in at least three of the following five domains:

• Professionalism: These outlets do not employ the standards and best practices of professional journalism. They refrain from providing clear information about real authors, editors, publishers and owners. They lack transparency, accountability, and do not publish corrections on debunked information.

• Style: These outlets use emotionally driven language with emotive expressions, hyperbole, ad hominem attacks, misleading headlines, excessive capitalization, unsafe generalizations and fallacies, moving images, graphic pictures and mobilizing memes.

• Credibility: These outlets rely on false information and conspiracy theories, which they often employ strategically. They report without consulting multiple sources and do not employ fact-checking methods. Their sources are often untrustworthy and their standards of news production lack credibility.

• Bias: Reporting in these outlets is highly biased and ideologically skewed, which is otherwise described as hyper-partisan reporting. These outlets frequently present opinion and commentary essays as news.

• Counterfeit: These outlets mimic professional news media. They counterfeit fonts, branding and stylistic content strategies. Commentary and junk content is stylistically disguised as news with references to news agencies, and credible sources, and headlines written in a news tone, with bylines, date, time and location stamps.

(Oh, and Newsweek, the same goes for "excessive capitalization", that's strike two...)
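The three-of-five threshold quoted above is simple enough to express directly. Here is a minimal sketch of that labeling rule in Python — our own reconstruction for illustration only; the study does not publish code, and only the domain names are taken from the list above:

```python
# Hypothetical reconstruction of the study's labeling rule, not the authors' code.
# A source is labeled "junk news" when it fails at least three of five domains.

DOMAINS = ("professionalism", "style", "credibility", "bias", "counterfeit")

def is_junk_news(flags: dict) -> bool:
    """flags maps a domain name to True if the source fails that domain."""
    return sum(bool(flags.get(d, False)) for d in DOMAINS) >= 3

# A source failing style, credibility and bias meets the threshold:
print(is_junk_news({"style": True, "credibility": True, "bias": True}))  # prints True
```

Note that under such a rule a heavily biased but otherwise professional outlet would not qualify, while a site failing any three domains would — which is why the breadth of each domain's definition matters so much.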

So which sources did the study consider to be "junk news" exactly, and how did they arrive at the list?

For this study, a seed of known propaganda websites across the political spectrum was used, drawing from a sample of 22,117,221 tweets collected during the US election, between November 1-11, 2016.

Links found in tweets from an eleven-day period in 2016 were apparently used and then coded. It is not clear exactly how many links they considered or what the selection criteria were (we emailed the authors for clarification), but then:

Sources of junk news were evaluated and reevaluated in a rigorously iterative coding process. A team of 12 trained coders, familiar with the US political and media landscape, labeled sources of news and information based on a grounded typology. The Krippendorff's alpha value for inter-coder reliability among three executive coders, who developed the grounded typology, was 0.805. The 91 sources of political news and information, which we identified over the course of several years of research and monitoring, produce content that includes various forms of propaganda and ideologically extreme, hyper-partisan, and conspiratorial political information.
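For context, the Krippendorff's alpha quoted here (0.805) is a standard measure of inter-coder reliability: agreement among coders corrected for the agreement expected by chance, with 1.0 meaning perfect agreement. A minimal sketch of the computation for nominal labels with no missing data — our own illustration, not the authors' code:

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels, no missing data.

    units: list of lists; each inner list holds the labels the coders
    assigned to one item (e.g. one news source)."""
    o = Counter()    # coincidence counts for ordered label pairs
    n_c = Counter()  # how often each label occurs overall
    n = 0            # total number of pairable labels
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # a single-coder item carries no reliability information
        n += m
        n_c.update(labels)
        # every ordered pair of labels within an item contributes 1/(m - 1)
        for i, j in permutations(range(m), 2):
            o[(labels[i], labels[j])] += 1.0 / (m - 1)
    d_o = sum(v for (a, b), v in o.items() if a != b) / n  # observed disagreement
    d_e = sum(n_c[a] * n_c[b] for a in n_c for b in n_c if a != b) / (n * (n - 1))  # expected by chance
    return 1.0 - d_o / d_e

# Two coders rating four sources, agreeing on three of them:
print(krippendorff_alpha_nominal(
    [["junk", "junk"], ["news", "news"], ["junk", "news"], ["news", "news"]]))  # ≈ 0.533
```

An alpha of 0.805 among the three executive coders indicates substantial but not perfect agreement; values of 0.8 and above are conventionally treated as acceptable for reliability claims.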

Out of all of this data they distilled 91 sources. The paragraph also seems to imply the researchers had already been looking at these sources before the study, since it is impossible to fit "several years of research and monitoring" between November 2016, when the data was collected, and February 6th 2018, when the study was published. Also note that the definition here speaks about "content that includes...", meaning these sources could also carry content that is not actually "various forms of propaganda and ideologically extreme, hyper-partisan, and conspiratorial political information".

This seems to explain why relatively mainstream news websites like drudgereport.com and nydailynews.com find themselves on the list, alongside pastebin.com (a site that allows people to quickly paste a snippet of text for easy sharing), which was apparently included because somebody pasted a conspiracy theory article to it.

The full list of sites is available in a downloadable supplement to the study, along with an Excel document that contains the "seed list" of sites and (in Sheet2) what looks like some information about the judgements assigned to the sites by the coders. Unfortunately there is no explanation of the various codes used in the Excel spreadsheet; we have reached out to the authors of the study for more information.

Bizarrely, pollster Rasmussen Reports (rasmussenreports.com) is also listed, and the sample "junk news" link provided for it seems to be an article they published reporting that Hillary Clinton was edging ahead in the final polling before the election. The sample "junk news" article for the New York Daily News appears to be a news article titled "California woman accusing Donald Trump of raping her when she was 13 cancels press conference amid threats" (something The Guardian also reported).

The sample link for the Drudge Report is simply a link to the site's main page (drudgereport.com), which changes constantly throughout the day, usually links out to dozens of other news websites and blogs, and seldom publishes original articles. It is unclear whether the study considers any site linked to by the Drudge Report to be "junk news" for its purposes, or just the main page of the Drudge Report itself. We have reached out to the authors for clarification.

The list also includes several right-leaning weblogs and opinion websites like hannity.com, hotair.com, redstate.com, nationalreview.com, thegatewaypundit.com and newsbusters.org, which has already published a story in response claiming the study "smears conservative news sites".

We are not making any judgement on that claim. But stating that "Trump supporters", "right wingers" or "the hard right" definitely "share and believe" more "fake news" based on this very limited study, which only looked at who is sharing and posting content from a questionable list of just 91 websites? That seems like junk news to us.

(Even though it might be true: again, we are not making a judgement on that, we're just saying this particular study seems to be wildly misinterpreted by many news outlets.)