Moving Beyond The Turing Test To Judge Artificial Intelligence

A computer program known as "Eugene Goostman" passed the Turing Test by convincing a group of people, via chat, that it was actually a 13-year-old boy. Cognitive scientist Gary Marcus argues that the Turing Test needs an update for the 21st century.

ARUN RATH, HOST:

It's ALL THINGS CONSIDERED from NPR West. I'm Arun Rath. The code-breaking skills of mathematician Alan Turing helped the Allies win World War II. He also devised the Turing Test, a measure of artificial intelligence. Last week, a computer program pretending to be a 13-year-old boy named Eugene Goostman was the first to pass the test - meaning the age of artificial intelligence has begun - maybe. Gary Marcus is a professor of cognitive science at New York University. I asked him to explain how the test works.

GARY MARCUS: It was devised in the teletype era. So those of you who don't remember what a teletype is can think of it as text messaging back and forth. So you text message to a computer or a person, you don't know which. And you try to decide. And that's basically what actually happened, is somebody managed to build a computer program that was good enough to fool a third of the judges.

RATH: Siri kind of does that, right? You ask your phone questions, and you can even throw in some abstract ones and it will come up with some creative responses, on occasion.

MARCUS: Siri, like this program Eugene Goostman, interacts with you. You ask questions, and it gives answers. But most people aren't fooled into thinking that Siri is an actual person.

RATH: People who are into science, I think, especially, get really excited when we see this headline, computer passes the Turing Test. But you think, maybe not such a big deal.

MARCUS: Well, it's a small sign of progress. But it's not really progress towards the larger goal of having machines that really understand us. It turns out that you can do a lot of misdirection, answer sarcastically, and evade the fact that you are a computer. So all it really shows is you can fool humans for a short period of time, about five minutes - not all of the humans, but maybe more than you might've expected - by having these sort of personality twitches.

RATH: Well, let's give an example of that. You had an exchange with an earlier version of this program. Well, why don't we read that? I will read the part of Eugene Goostman.

MARCUS: I will play the role of myself as best I can. (Reading) Do you read the New Yorker?

RATH: (As Eugene Goostman, reading) I read a lot of books. So many, I don't even remember which ones.

MARCUS: (Reading) You remind me of Sarah Palin.

RATH: (As Eugene Goostman, reading) If I'm not mistaken, Sarah is a robot, just as many other, quote-unquote, "people." We must destroy the plans of these talking trash cans.

MARCUS: Now, see, your first impression, if you had that line in isolation, would be, hey, that's clever. It sounds like a 13-year-old boy, which is what the program's supposed to sound like. But if you talk to it for a long time, you would see that it uses the same laugh lines over and over again. If you asked it questions about common sense, you would find out very quickly that it doesn't really understand the world, that it's just a lot of preprogrammed responses, and pattern recognition and so forth, without any real there, there.

RATH: You know, I have to say, from all of the exchanges that I've read with people who are talking with this machine, none of them seem very convincing. Thirty percent of people need to be fooled for it to be considered to pass?

MARCUS: Based on a very, I think, foolish reading of Alan Turing's original essay, people decided that in order to, quote, "pass this test," close quote, you would have to fool 30 percent of the judges. And this program fooled 33 percent of the judges on this particular occasion. Turing was saying, I'm guessing that this is how far along we will be by the year 2000. He wasn't saying, and when we get there, that means that machines are necessarily intelligent. He was kind of opening the question. But people have taken his offhand suggestion as if it was the definition of intelligence. And I think that part's silly. And we might actually want to think about updating the Turing Test for a modern era.

RATH: Yeah, 'cause it seems, again, looking at these exchanges, maybe just 33 percent of people are more gullible, I hate to say.

MARCUS: Yeah, it is kind of a gullibility test. What I suggested, instead, is have a machine watch a YouTube video or a TV broadcast, and see if you can ask questions about it. So in order for a machine to understand an ongoing television program, an episode of "Breaking Bad," the program would have to build up ideas over time - well, this character does this. This is their motivation. This is the conflict. If we could get machines towards that, I think that would be real progress in AI. The Turing Test itself, not so much.

RATH: So if this test doesn't really serve the purpose of determining whether we've broken through a barrier in artificial intelligence, is there anything valuable that can come out of it?

MARCUS: It gets people thinking about, what will it be like when we do have intelligent machines that we really can interact with at that level, that follow us around - sort of like in that movie, "Her"? I don't think that's going to happen in the next 20 years. It might take 40 or 50. But we will reach a society where we have more and more of these things around.

RATH: Gary Marcus is a professor of cognitive science at New York University. He is the author of "Guitar Zero," and his forthcoming book is called, "The Future Of The Brain." Gary, thank you.

MARCUS: Thanks a lot.

Copyright © 2014 NPR. All rights reserved. Visit our website terms of use and permissions pages at www.npr.org for further information.
