Data collection and analysis have changed drastically over the last decade, but when it comes to predicting political violence, data science also means looking to more traditional methods of collecting data.

Traditional practices involve descriptive and exploratory methods, often interviewing sources in person, reading trustworthy news sources and conducting research with the help of organizations.

Modern data analytics, however, collects information from sources ranging from machine logs to social media accounts and online media outlets, and uses artificial intelligence to generate automated data sets.

These automated data sets, according to professor Clionadh Raleigh, Executive Director of the Armed Conflict Location and Event Data Project (ACLED), can be problematic.

‘Big data is nonsense’

“They are just media-counting apparatus,” Raleigh told The Sociable, explaining that they can be particularly untrustworthy and misleading for researchers seeking information about countries where media coverage is not carried out in English.

In the case of Latin America, for instance, sources whose information has been taken from Spanish and translated into English become unreliable, Raleigh argues.

“There’s such a tidal wave of nonsense that we are now inundated with,” Raleigh said, explaining why she believes English-language media is now a particularly poor way to find information about Spanish- or Portuguese-speaking countries.

For Raleigh, big tech companies such as Facebook and Google — which circulate the majority of this information and have been a significant part of the data revolution — are partly responsible for conducting “poor data analysis.”

“I believe that big data is nonsense,” she claimed. “It’s just this trolling — it’s incredibly noisy and incorrect, and messy and geographically illiterate.”

‘Triangulating’ the data

ACLED’s methodology uses a three-tier system built on researcher-led data coding. The coded data is then fed into prediction models that forecast whether trends in political violence and protest will increase or remain unchanged.

The project classifies political violence as violent actions perpetrated to achieve political goals, whether between two armed groups, between armed groups and the state, or between armed groups and civilians.

ACLED obtains data from local partnerships with organizations such as universities and conflict observatories, from local, on-the-ground sources who speak the relevant native language, and from new “verified” media, which must pass through the company’s bias regulator. This data is then coded, in English, into the ACLED system.

“We have to make sure that we can build a jigsaw of sourcing so that the data is thorough and reliable,” Raleigh said.

According to ACLED, a data collection, analysis, and crisis mapping project, there were over 40,000 political violence and disorder events in Latin America in 2019, far more than the team had expected.

Human sources for human phenomena

However, Raleigh insists that the methodology her team at ACLED used to compile this data was the only way it could have been gathered.

“When you’re talking about a human phenomenon, for example, protest or revolution, you should use human sources, because it’s a lot more insightful than anything else,” she said.

Rates of political violence in Latin America are not just high but consistently high, Raleigh explained, unlike in other parts of the world, where violence can be more seasonal.

The nature of violence in Latin America therefore means that trends in political violence are especially difficult to predict in the long term.

For this reason, ACLED focuses more of its attention on the actions of a few actors driving violence in Latin America, some of whom have taken control of certain areas from the state, and strives to find out what is predictable about their behavior.

This is why Raleigh believes that short-term predictions of conflict trends are more useful than long-term ones.

“If you’re predicting something, it has to be within the bounds of what we can do about it,” she said.

And these evidence-based assessments are essential for journalists, governments, and militaries alike.

“Without rigorously, systematically collected information that people can stand behind and feel is reliable, a lot of discussions come down to hearsay,” said Raleigh, underscoring the importance of collecting such information accurately in 2020.