We are living through a time in which our patterns of news consumption are changing radically. We receive information almost constantly, and some of it is maliciously tailored to manipulate our prejudices and our unconscious biases. At Storyful, analysing misinformation and disinformation has increasingly become central to our work. It’s often a point at which data meets journalism, a point at which our reporters and tech teams collaborate.

At last year’s Web Summit, I attended a talk titled “Can we halt the rise of fake news?”, moderated by Financial Times global editor Matthew Garrahan.

What piqued my interest in that discussion, as an engineer, was the idea that dealing with misinformation was something that concerns journalists, developers, and engineers collectively. David Pemsel, CEO of the Guardian Media Group, said he did not believe it would be difficult to find “a technical solution” to determine “what is good versus what is bad.”

Perhaps he’s right, but our experience at Storyful has shown just how tangled the misinformation web is, and how necessary a collaborative approach will be.

Can we halt the rise of misinformation?

The human aspect of the work we do day-to-day at Storyful has never been more critical. Even with a suite of purpose-built tools, it takes human skill to probe, scrutinize, and identify the narrative around an event and its impact, particularly when a disinformation campaign originates on the social web.

While aspects of verification can be programmatically automated and made easier for journalists and analysts to understand and dissect, we have to ask: how can we codify truth? At a surface level it’s easy to screen for veracity by checking whether an article originates in an established, reputable newspaper. But misinformation often bubbles up from small, obscure platforms, or deliberately deceptive sources, and the solutions are constantly playing catch-up as falsifiers adapt and devise new strategies to deceive.
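To see why that surface-level screen falls short, consider a minimal sketch of one. The domain list and function names here are purely illustrative, not anything Storyful uses: the check is trivial to write, but it can only vouch for known outlets, and says nothing about content from everywhere else — which is exactly where misinformation bubbles up.

```python
from urllib.parse import urlparse

# Hypothetical allowlist — the kind of "surface level" screen described above.
ESTABLISHED_OUTLETS = {"ft.com", "theguardian.com", "nytimes.com"}

def surface_level_screen(url: str) -> bool:
    """Return True only if the article's domain is on a known-reputable list.

    Note the asymmetry: False does not mean "misinformation", it just means
    "unknown source" — this check cannot catch campaigns that originate on
    small, obscure, or deliberately deceptive platforms.
    """
    domain = urlparse(url).netloc.lower()
    if domain.startswith("www."):
        domain = domain[len("www."):]
    return domain in ESTABLISHED_OUTLETS

surface_level_screen("https://www.ft.com/content/some-story")    # True
surface_level_screen("https://obscure-board.example/thread/123") # False, but not proof of falsehood
```

The limitation is structural, not a matter of growing the list: any allowlist is one step behind sources that did not exist yesterday.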

At Storyful we’re focussed on addressing this problem. As part of this we’ve built proprietary tools to analyse interactions on some of the most notorious playgrounds for misinformation, including 4chan, 8chan and Endchan.

How we distinguish between the good, the bad, and the ugly

To take an example: in the run-up to the US midterms in November 2018, Storyful journalists became aware of a campaign to discourage male Democratic voters from showing up at the polls. It presented as the hashtag #NoMenMidterms on Twitter, and analysis quickly confirmed its origins on the politics board of a fringe network, along with instructions on how the hashtag should be propagated. It was a completely manufactured “viral” event, designed to damage the Democratic candidates in the election while presenting as an organic movement from within that constituency.

Given the anonymous nature of fringe social networks, it’s particularly difficult to identify the creators of a campaign such as this. Yet to determine an intent to misinform, we need to discover the original piece of content that started it.

The image above shows Cosmos, our network-analysis tool, mapping conversations on social media around the #NoMenMidterms hashtag. The application creates a 3D visual plot of interactions, and attaches a probability score to each node so that automated actors and bots can be filtered in or out as necessary. This greatly reduces the complexity of the data and allows a team of journalists to follow the evolution of the event graphically, pursuing it to its origins by eliminating automated activity until key actors and sources can be identified. In this instance the campaign originated in a thread on the politics board of the anonymous fringe network 4chan.
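The per-node filtering step can be sketched in a few lines. Cosmos’s internals aren’t public, so the data model, field names, and threshold below are assumptions for illustration only — the point is simply that once each node carries a bot-probability score, splitting the graph into likely-human and likely-automated activity is a single pass:

```python
from dataclasses import dataclass

@dataclass
class Node:
    """One account in the interaction graph (illustrative model, not Cosmos's)."""
    handle: str
    bot_probability: float  # 0.0 = almost certainly human, 1.0 = almost certainly automated

def filter_automated(nodes, threshold=0.8, keep_bots=False):
    """Split nodes by their bot-probability score.

    With keep_bots=False, likely-automated accounts are stripped out so the
    remaining graph shows organic activity; with keep_bots=True, only the
    amplification network is kept. The 0.8 threshold is an arbitrary example.
    """
    if keep_bots:
        return [n for n in nodes if n.bot_probability >= threshold]
    return [n for n in nodes if n.bot_probability < threshold]

nodes = [
    Node("organic_user", 0.10),
    Node("amplifier_01", 0.95),
    Node("borderline_acct", 0.70),
]
likely_human = filter_automated(nodes)                 # organic_user, borderline_acct
likely_bots = filter_automated(nodes, keep_bots=True)  # amplifier_01
```

Iteratively raising or lowering the threshold and re-plotting is what lets journalists peel away automated amplification layer by layer until the originating accounts stand out.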

As the image above shows, participants in the campaign were furnished not just with a hashtag, but with imagery and guidelines too. With the help of tools like Cosmos, Storyful’s journalists can identify and reverse engineer a misinformation campaign such as this with greater speed.

A Joint Approach

We know that the problems of misinformation and disinformation cannot be solved by a technological solution alone, however much we can automate. We can use technology to attach a probability of truth to established and trusted sources, but for everything else context is simply too important to ignore. That means that while we can filter and simplify the data, use visualisations and pattern recognition to make it intuitive and easier to analyse, it’s much harder to discern context, hyperbole and sarcasm from deliberate falsehood.

In other words, as long as the production of disinformation involves human beings, the solution to combat it will too.