At TED, in early 2018, the futurist and inventor Ray Kurzweil, currently a director of engineering at Google, announced his latest project, “Google Talk to Books,” which claimed to use natural language understanding to “provide an entirely new way to explore books.” Quartz dutifully hyped it as “Google’s astounding new search tool [that] will answer any question by reading thousands of books.”

If such a tool actually existed and worked robustly, it would be amazing. But so far it doesn’t. If we could give computers one capability that they don’t already have, it would be the ability to genuinely understand language. In medicine, for example, several thousand papers are published every day; no doctor or researcher can possibly read them all. Drug discovery gets delayed because information is locked up in unread literature. New treatments don’t get applied, because doctors don’t have time to discover them. AI programs that could synthesize the medical literature—or even just reliably scan your email for things to add to your to-do list—would be a revolution.

This article is adapted from Rebooting AI: Building Artificial Intelligence We Can Trust, by Gary Marcus and Ernest Davis (Pantheon). Marcus is founder and CEO of Robust.AI and a professor emeritus at NYU. Davis is a professor of computer science at NYU.

But drill down into tools like Google Talk to Books (GTB) and you quickly realize we are nowhere near genuine machine reading yet. When we asked GTB, “Where did Harry Potter meet Hermione Granger?” only six of the 20 answers were even about Harry Potter; most of the rest were about other people named Harry or on completely unrelated topics. Only one mentioned Hermione, and none answered the question. When we asked GTB, “Who was the oldest Supreme Court justice in 1980?” we got another fail. Any reasonably bright human could go to Wikipedia’s list of Supreme Court justices and figure out that it was William Brennan. GTB couldn’t; no sentence in any book that it had digested spelled out the answer in full, and it had no way to make inferences beyond what was directly spelled out.

The most telling problem, though, was that we got totally different answers depending on how we asked the question. When we asked GTB, “Who betrayed his teacher for 30 pieces of silver?” referring to a famous incident in a famous story, only three of the 20 results correctly identified Judas. Things got even worse as we strayed from the exact wording of “pieces of silver.” When we asked a slightly less specific question, “Who betrayed his teacher for 30 coins?” Judas turned up in only one of the top 20 answers; and when we asked “Who sold out his teacher for 30 coins?” Judas disappeared from the top 20 results altogether.

To get a sense for why robust machine reading is still such a distant prospect, it helps to appreciate—in detail—what is required even to comprehend a children’s story.

Suppose that you read the following passage from Farmer Boy, a children’s book by Laura Ingalls Wilder. Almanzo, a 9-year-old boy, finds a wallet (then called a “pocketbook”) full of money dropped in the street. Almanzo’s father guesses that the pocketbook might belong to Mr. Thompson, and Almanzo finds Mr. Thompson at one of the stores in town.

Almanzo turned to Mr. Thompson and asked, “Did you lose a pocketbook?” Mr. Thompson jumped. He slapped a hand to his pocket, and fairly shouted.

“Yes, I have! Fifteen hundred dollars in it, too! What about it? What do you know about it?”

“Is this it?” Almanzo asked.

“Yes, yes, that’s it!” Mr. Thompson said, snatching the pocketbook. He opened it and hurriedly counted the money. He counted all the bills over twice. … Then he breathed a long sigh of relief and said, “Well, this durn boy didn’t steal any of it.”