Story comprehension

The robots of Westworld are not programmed solely by software developers. The bulk of the work is done by professional writers, who give each character a unique backstory. These stories give them the memories and depth they need to seem real to the park guests. When asked who they are, what they’ve done or why they feel a certain way, they can consult their backstory to find out the answer.

Being able to answer questions about stories is a fundamental requirement for being able to pass the Turing test, which the show tells us started to happen “after the first year of building the park”. But Turing proposed his test as a kind of thought experiment, not as a useful yardstick for measuring progress in AI. A machine either passes or fails and that’s not very useful for figuring out how close we are.

To fix this, in 2015 Facebook’s AI lab introduced the bAbI tests in a paper called “Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks”. Quoting from the paper’s abstract:

To measure progress towards [building an intelligent dialogue agent], we argue for the usefulness of a set of proxy tasks that evaluate reading comprehension via question answering. Our tasks measure understanding in several ways: whether a system is able to answer questions via chaining facts, simple induction, deduction and many more. The tasks are designed to be prerequisites for any system that aims to be capable of conversing with a human.

In other words, before you can hope to pass the Turing test you must learn to pass bAbI.

The test is a large, auto-generated series of simple stories and questions that tests 20 different kinds of mental skill. Here’s one that checks the machine isn’t distracted by irrelevant facts:

Mary went to the bathroom. John moved to the hallway. Mary travelled to the office. Where is Mary? Answer: office

Here’s a harder one that tests basic logical induction:

Lily is a swan. Lily is white. Bernhard is green. Greg is a swan. What color is Greg? Answer: white

The bAbI tests come in English, Hindi and a scrambled form where the English words are randomly shuffled so the tasks can no longer be understood by humans. To pass the test a machine should get equivalent results on all three: the idea is to learn everything, including the language itself, simply by reading. Programs specifically designed to handle bAbI can obtain near-perfect scores, but what about general AIs that are given only the words and nothing else?

The best result so far comes from Facebook AI Research. Their December 2016 paper, “Tracking the world state with recurrent entity networks”, reports an AI that can solve all 20 tasks.
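For contrast, here is roughly what a program “specifically designed to handle bAbI” might look like for the Mary example above. It is only a minimal Python sketch: the function name `answer_where` and the list of movement verbs are my own illustrative assumptions, not anything taken from the bAbI code or the papers.

```python
import re

# Verbs that signal a change of location. This list is an illustrative
# assumption, not the actual bAbI vocabulary.
MOVE_VERBS = {"went", "moved", "travelled", "journeyed"}

def answer_where(story, person):
    """Return the last place `person` moved to, or None if they never moved."""
    location = None
    for sentence in re.split(r"[.?]\s*", story):
        words = sentence.split()
        # Only handles sentences shaped like "<name> <verb> to the <place>".
        if len(words) >= 3 and words[0] == person and words[1] in MOVE_VERBS:
            location = words[-1]
    return location

story = ("Mary went to the bathroom. John moved to the hallway. "
         "Mary travelled to the office.")
print(answer_where(story, "Mary"))  # office
```

All the knowledge here is hard-coded by a human: which words are verbs, where the name sits in the sentence, what counts as a place. Scramble the language and the program stops working, which is exactly why a general AI that learns from the words alone is the more interesting result.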

You can of course train a neural network both on these tasks and on large question/answer databases too, which yields an AI that can talk about the story using learned real-world knowledge:

Fred went to the kitchen. Fred picked up the milk. Fred travelled to the office. Where is the milk? A: office

Where does milk come from? A: milk come from cow

What is a cow a type of? A: cow be female of cattle

Where are cattle found? A: cattle farm become widespread in brazil

What does milk taste like? A: milk taste like milk

What does milk go well with? A: milk go with coffee

Where was Fred before the office? A: kitchen

Similar algorithms have proven able to read — I kid you not — the Daily Mail, which turns out to be ideal for AI research because the stories come with bullet point summaries of the text (see the DeepMind paper, “Teaching Machines to Read and Comprehend”).

In this task an anonymised news story is presented and the goal is to correctly fill in the X. The answer is “ent23”. The heat map shows which parts of the text the neural network paid the most attention to when working out the answer. The names are randomised to stop AIs from answering questions like “can fish oils cure X?” as “X = cancer” without even reading the document, simply by knowing that cancer is a very commonly cured thing in the Daily Mail.
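The randomising step is simple enough to sketch in a few lines of Python. This is a rough illustration and not DeepMind’s actual pipeline: it assumes the entity names have already been found (here the list is supplied by hand), the example sentence is made up, and the `ent<N>` marker format simply mirrors the “ent23” answer above.

```python
import random
import re

def anonymise(text, entities, seed=None):
    """Replace each known entity with a marker such as 'ent1'.

    The numbering is shuffled for every document, so a marker carries
    no meaning outside the text it appears in.
    """
    rng = random.Random(seed)
    ids = list(range(len(entities)))
    rng.shuffle(ids)
    mapping = {name: f"ent{i}" for name, i in zip(entities, ids)}
    # Replace longer names first so a short name cannot clobber part
    # of a longer one.
    for name in sorted(mapping, key=len, reverse=True):
        text = re.sub(rf"\b{re.escape(name)}\b", mapping[name], text)
    return text, mapping

# Made-up example sentence; the entity list is supplied by hand.
doc = "Can fish oils cure cancer? Experts say fish oils will not cure cancer."
anon, mapping = anonymise(doc, ["fish oils", "cancer"])
print(anon)
print(mapping)
```

Because the markers are reshuffled each time, the only way to work out which one fills in the X is to read the story in front of you.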

Remember, this kind of learning works even when the questions are written in a randomised language. It’s real understanding derived from nothing at all except studying raw text.

That’s important because a machine that can learn to answer questions given nothing but words can eventually — if it scales up — learn about the world, and about humanity, by reading books. That’s the next goal for DeepMind, a British AI lab owned by Google that has also done research into story comprehension. And once it’s read the entire contents of Google Books it can go ahead and read a book you wrote just for it: the book that creates its character.

What’s important to understand is that there’s no reason a neural network trained by reading books and backstories would know it is a robot. When it queries its memory with a question like “what am I?” it would retrieve whatever it was trained on. And as books are typically written from the perspective of a human, rather than a robot, that’s the perspective it would access.