Robo-journalism: How a computer describes a sports match

By Stephen Beckett

BBC Click TV, published 11 September 2015

Much of the promise of artificial intelligence is yet to be realised, but in some areas it's already proving its worth. Meet the robot journalists that one day might steal my job.

Robo-journalism is the process of automatically writing complete and complex news stories without any human intervention. Here are two "robo"-written articles. The first was penned by a program called Wordsmith, created by US company Automated Insights.

News organisation the Associated Press plans to use Wordsmith to write thousands of sports reports, like the one below. But how does a robot journalist work? The short articles below have been chopped up, with key bits highlighted and annotations under each snippet to explain the workings.

Sports reporting

Everything from the headline to the text of this basketball match report was written by a computer program. It has to know how to fit the format - saving characters here by using the abbreviation "UNC", short for University of North Carolina.
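Fitting a headline to a character limit can be pictured with a few lines of Python. This is only a sketch of the idea, not Wordsmith's code: the `ABBREVIATIONS` table, the `fit_headline` function and the team "Example State" are all invented for illustration.

```python
# Hypothetical table of short forms; a real system would carry far more entries.
ABBREVIATIONS = {"University of North Carolina": "UNC"}


def fit_headline(headline: str, max_chars: int) -> str:
    """Swap long names for abbreviations until the headline fits the limit."""
    for long_name, short_name in ABBREVIATIONS.items():
        if len(headline) <= max_chars:
            break  # already short enough, leave the remaining names spelled out
        headline = headline.replace(long_name, short_name)
    return headline


# "Example State" is a made-up opponent used only for this demonstration.
print(fit_headline("University of North Carolina beats Example State", 30))
```

The full name is kept whenever there is room, so the abbreviation only appears when the format demands it.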

Each story starts out as a collection of data, logged during matches. That means things like tables, graphs and lists that might be hard to digest unless you're an expert. The system has a record of who did what and when, down to the second.

The software scours its trove of data looking for "insights" - facts that it can figure out from the data. Like a human journalist, it's trying to answer the questions: who won, by how much, and why? Here it has understood the concept of a "comeback" and has recognised that it's exciting for the reader that the points were scored with only a few seconds on the clock.
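To make the hunt for "insights" concrete, here is a rough Python sketch that replays a log of scoring events and answers those questions. It is an illustration under stated assumptions rather than how Wordsmith actually works: the `ScoringEvent` record, the `find_insights` function, the thresholds and the sample game are all made up, and tied games are not handled.

```python
from dataclasses import dataclass


@dataclass
class ScoringEvent:
    seconds_left: int  # seconds remaining on the clock when the points were scored
    team: str
    points: int


def find_insights(events, teams, comeback_margin=5, late_seconds=10):
    """Work out who won, by how much, and whether it was a late comeback."""
    first, second = teams
    scores = {first: 0, second: 0}
    biggest_deficit = {first: 0, second: 0}
    go_ahead = None  # most recent score that put its team in front

    # Replay the game in order, from the most time left to the least.
    for ev in sorted(events, key=lambda e: e.seconds_left, reverse=True):
        rival = second if ev.team == first else first
        was_ahead = scores[ev.team] > scores[rival]
        scores[ev.team] += ev.points
        if not was_ahead and scores[ev.team] > scores[rival]:
            go_ahead = ev
        # Only the rival can have fallen further behind after this score.
        gap = scores[ev.team] - scores[rival]
        biggest_deficit[rival] = max(biggest_deficit[rival], gap)

    winner = max(teams, key=scores.get)
    loser = second if winner == first else first
    return {
        "winner": winner,
        "margin": scores[winner] - scores[loser],
        # A "comeback" here means the winner once trailed by the chosen margin.
        "comeback": biggest_deficit[winner] >= comeback_margin,
        # A "late winner" means the decisive lead change came near the buzzer.
        "late_winner": (go_ahead is not None and go_ahead.team == winner
                        and go_ahead.seconds_left <= late_seconds),
    }


# Invented sample log: "Home" trails all game, then scores with 8 seconds left.
sample = [
    ScoringEvent(1200, "Away", 3),
    ScoringEvent(900, "Away", 3),
    ScoringEvent(600, "Home", 2),
    ScoringEvent(300, "Home", 3),
    ScoringEvent(8, "Home", 2),
]
print(find_insights(sample, ("Home", "Away")))
```

The flags that come out of a routine like this are the raw material for the story: a report generator could choose to lead with the comeback only when both "comeback" and "late_winner" are true.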

To make the article sound natural, it has to know the lingo. Each type of story, from finance to sport, has its own vocabulary and style. It also has to match the house rules of the news organisation - an article written for AP might be different to one for Forbes.
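One simple way to picture house rules is as a lookup table that changes the wording of the same fact. The sketch below is purely illustrative: the `HOUSE_STYLES` table, the outlet names "wire" and "tabloid" and the `render_result` function are invented, and none of the wording is taken from AP or Forbes.

```python
# Invented style sheets for two imaginary outlets.
HOUSE_STYLES = {
    "wire": {"verb": "beat", "score_sep": "-"},
    "tabloid": {"verb": "stunned", "score_sep": " to "},
}


def render_result(winner, loser, winner_pts, loser_pts, outlet):
    """Phrase the same result according to the chosen outlet's style sheet."""
    style = HOUSE_STYLES[outlet]
    score = f"{winner_pts}{style['score_sep']}{loser_pts}"
    return f"{winner} {style['verb']} {loser} {score}."


print(render_result("Home", "Away", 72, 71, "wire"))
print(render_result("Home", "Away", 72, 71, "tabloid"))
```

The data stays the same in both calls; only the vocabulary and score format change with the outlet.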

To figure out how to structure an article, Wordsmith uses a virtual "tree". Each branch of the tree is a possible way to tell the story. By comparing the data, it can decide which branch it should follow. This sentence was only included because it decided the reserves scored particularly well.
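The branching idea can be sketched as a chain of comparisons, each of which decides whether a sentence makes it into the story. This is a toy version under assumptions, not Wordsmith's real tree: the `build_story` function, the field names, the thresholds and the sample numbers are all hypothetical.

```python
def build_story(stats, bench_share_threshold=0.35):
    """Pick sentences by following branches that depend on the match data."""
    margin = stats["winner_points"] - stats["loser_points"]
    sentences = [
        f"{stats['winner']} beat {stats['loser']} "
        f"{stats['winner_points']}-{stats['loser_points']}."
    ]

    # Branch on how close the game was.
    if margin <= 3:
        sentences.append("It was a narrow win.")
    elif margin >= 15:
        sentences.append("It was a comfortable win.")

    # Branch on the reserves: only mention them if they scored unusually well.
    bench_share = stats["bench_points"] / stats["winner_points"]
    if bench_share >= bench_share_threshold:
        sentences.append(f"The reserves chipped in {stats['bench_points']} points.")

    return " ".join(sentences)


# Made-up box score in which the bench supplies 40% of the winner's points.
print(build_story({
    "winner": "Home", "loser": "Away",
    "winner_points": 80, "loser_points": 78,
    "bench_points": 32,
}))
```

With a lower bench total the last branch is skipped and the sentence about the reserves never appears, which mirrors the behaviour described in the annotation.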

The same game was also covered by human journalists. Compare the automated effort to their reports: ESPN, FOX10TV and CBS Sports.

While the facts in the articles are largely the same, ESPN's story opens lyrically: "Marcus Paige ignored the pain in his twice-injured right foot, put his head down and drove toward the rim." Storytelling like this may take computers a while to imitate.

The same article also includes the quote: "'I said jokingly to my teammates that I was back,' Paige said." There's still some way to go before we can expect computers to source and write quotes like this. Fully understanding natural language is one of the biggest challenges in artificial intelligence.

It's not all about sports though. Narrative Science, another company working on robo-journalism tools, can also write convincing articles automatically with its Quill system.

The excerpts below are taken from a Quill-written report on the performance of a stock portfolio.

Financial reporting

This article has a completely different language and style. It may not make for enthralling reading, but that's because it's been intentionally designed to match the look of similar human-written reports. In this case, Quill tries to explain why the portfolio performed the way it did by highlighting trends and other interesting or important data it finds.
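A simple way to "explain" a portfolio is to work out how much each holding added to or subtracted from the overall return, then name the biggest helper and the biggest drag. The Python below is a minimal sketch of that arithmetic, not Quill's method: the `explain_portfolio` function, the fund names and the figures are invented, and it assumes the weights add up to 1.

```python
def explain_portfolio(holdings):
    """holdings: list of (name, weight, period_return), weights summing to 1."""
    # Each holding's contribution is its weight multiplied by its own return.
    contributions = {name: weight * ret for name, weight, ret in holdings}
    total = sum(contributions.values())
    best = max(contributions, key=contributions.get)
    worst = min(contributions, key=contributions.get)
    direction = "gained" if total >= 0 else "lost"
    return (
        f"The portfolio {direction} {abs(total):.1%} over the period. "
        f"{best} contributed the most ({contributions[best]:+.1%}), "
        f"while {worst} was the weakest holding ({contributions[worst]:+.1%})."
    )


# Invented holdings: half the money in Fund A, the rest split between B and C.
print(explain_portfolio([
    ("Fund A", 0.5, 0.10),
    ("Fund B", 0.3, 0.02),
    ("Fund C", 0.2, -0.05),
]))
```

The flat, formal sentence it produces is deliberate, in keeping with the style of the human-written reports such a system is meant to match.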