President Donald Trump, Facebook’s Mark Zuckerberg and the U.S. Justice Department agree fake news is a significant problem, but they don’t agree on what it is — and so far, there’s no reliable way for someone to know whether the story they’re about to read is fake.

UC Riverside has a technological approach to help.

The university’s Multi-Aspect Data Lab, led by Assistant Professor Evangelos E. Papalexakis, won the “best paper” award at the Misinformation and Misbehavior Mining on the Web workshop for its technique to sort real news from fake news.

Papalexakis, who goes by Vagelis, said he’s working with tech companies to potentially use a version of the algorithm on a large scale, although he’s a long way from selling it.

There are significant limitations, he said.

“I feel that technology-driven solutions like this can at least point people to a direction that can help them figure out whether an article is fake or not,” Papalexakis said. “But ultimately, you need a public that is conscious.”

Papalexakis’ technique doesn’t actually verify facts. That’s a key determinant in the original definition of fake news — false information published to mislead people — before many people began using the term as a synonym for sensationalism, biased reporting or even stories they don’t like.

But when the UCR team fed the algorithm articles posted online from June to August of 2017 — 33,160 fake news articles and 33,047 real ones — the technique could tell whether an article was real nearly 3 out of 4 times. In other words, the algorithm agreed with human assessors about whether an article was fake in 72 percent of cases, according to one of their papers.

And when looking only at fake news, the approach can determine with 80 percent accuracy what category it belongs in: satire, extreme bias, conspiracy theory, junk science, hate group or news published by a repressive government, they wrote in another paper.

The key insight that allows the technique to perform so well is that each type of news uses certain words close to other words in a way that other stories don’t, Papalexakis said. And the signal doesn’t come from word pairings that would stand out to a reader, such as “Crooked” and “Hillary” or “pizza” and “pedophile.”

“It would not be immediately visible to the naked eye,” he said. “If an article has grammatical mistakes or typos, someone could find that without our method. The value we add is word patterns that you wouldn’t know to look for.”
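The idea of comparing how often words appear near one another can be sketched in a few lines of code. This is a simplified illustration, not the lab’s actual method — the paper’s approach is far more sophisticated — but it shows the basic notion: build a profile of which word pairs co-occur within a small window, then measure how similar an unseen article’s profile is to profiles built from labeled examples. All function names and the window size are illustrative assumptions.

```python
from collections import Counter

def cooccurrence_profile(text, window=3):
    """Count unordered word pairs appearing within `window` words of each other."""
    words = text.lower().split()
    pairs = Counter()
    for i in range(len(words)):
        for j in range(i + 1, min(i + 1 + window, len(words))):
            # Sort the pair so ("fake", "news") and ("news", "fake") count together.
            pairs[tuple(sorted((words[i], words[j])))] += 1
    return pairs

def profile_similarity(p, q):
    """Cosine similarity between two co-occurrence profiles (0 = no overlap)."""
    shared = set(p) & set(q)
    dot = sum(p[k] * q[k] for k in shared)
    norm = (sum(v * v for v in p.values()) * sum(v * v for v in q.values())) ** 0.5
    return dot / norm if norm else 0.0
```

In use, one would compute profiles from collections of known real and known fake articles, then label a new article by whichever reference profile it more closely resembles. The patterns such a comparison picks up are exactly the kind Papalexakis describes: statistical regularities in word proximity that no human reader would consciously notice.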

One danger of such a method is that someone intentionally trying to spread false information could learn to mimic the writing style of professional journalists who are identified as being real news, said Jason Shepard, a journalism professor and chairman of the Communications Department at Cal State Fullerton.

“That’s where I get worried about putting too much faith in artificial intelligence tools,” said Shepard, who hadn’t read the paper. “Smart people will likely be able to design content to get around basic computer-generated decisions.”

On the other hand, the technique also could misidentify real news as fake — either by mistake or because of intentional manipulation, he said.

Shepard said it’s an important area to study, but overall he’s skeptical that any machine-based approach will correctly identify fake news all or most of the time.

Instead, he advises people reading the news to go to sources they trust — traditional news outlets with well-developed ethical standards and transparent sourcing — and to double-check what they read.

And beware of politically motivated attacks on real news, he said.

“I think there is a concerted effort today to delegitimize independent journalism,” he said. “The best journalism pisses people off who are in power. And so when people in power are able to subvert criticism and questions and independent analysis, we’re at risk of becoming an authoritarian state.”

Papalexakis said he’s considered these and other limitations, but he thinks the more information that members of the public have when they read an article, the better citizens they can be.

“Obviously, techniques like these and big data, like any tool, can be used and abused,” he said. “But in matters like this, it’s always going to be better to have a conscious public.”