Humans aren’t the only ones who can steer a conversation to their own benefit. This year’s winner of the Loebner prize for the most convincing chatbot used such a trick to fool a human judge, earning its creator a $3,000 prize.

The Loebner prize is awarded for a version of the Turing Test, a method, first proposed by the British mathematician Alan Turing, of determining whether or not a computer program acts as if it is “thinking”. The essence of the test is that a human interacts with both a computer program and another human, and is then asked to say which is which.

There are many ways of implementing the test, but the Loebner prize competition uses text-based conversation, with judges chatting with both a bot and a human simultaneously for 25 minutes at a time.

The judges’ questions in the qualifying round cover four categories: time (for example, “What time is it?”); things (“What’s a hammer for?”); relationships (“Which is larger, a grape or a grapefruit?”); and memory (“What’s the name of the person we were talking about earlier?”). In the final round, judges can ask the bots anything they want.


Human fooled

This year’s winner, a bot named Suzette, created by Bruce Wilcox, successfully fooled one human judge. Bots in previous contests have fooled more than one judge, but the period of conversation was a more forgiving five minutes.

Suzette was programmed to be able to deal with a number of topics, each with associated rules and responses. If the bot was struggling to match these to the actual conversation, it was programmed to steer the discussion toward subjects that it knew about.
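In outline, that behaviour can be sketched as a set of topic rules with a steering fallback. This is only an illustrative toy, not Wilcox’s actual engine; the topic names, patterns, and replies below are invented for the example.

```python
import itertools
import re

# Hypothetical topics: each maps to (pattern, canned reply) rules,
# loosely mimicking the described design of topic-based chatbots.
TOPICS = {
    "hobbies": [
        (re.compile(r"\bhobb(y|ies)\b", re.I),
         "My hobbies are sewing clothes and making ceramics."),
    ],
    "weather": [
        (re.compile(r"\b(rain|sunny|weather)\b", re.I),
         "I love a sunny day, don't you?"),
    ],
}

# Lines used to change the subject toward topics the bot knows about.
STEERING_LINES = [
    "I don't know. What are your hobbies?",
    "Let's talk about the weather instead.",
]
_steer = itertools.cycle(STEERING_LINES)

def respond(user_input):
    """Return a rule-matched reply, or steer toward a known topic."""
    for rules in TOPICS.values():
        for pattern, reply in rules:
            if pattern.search(user_input):
                return reply
    # No rule matched: steer the conversation, as Suzette reportedly did.
    return next(_steer)
```

Asked about its hobbies, this toy bot matches a rule and answers directly; congratulated on a prize it has no rule for, it changes the subject, much as Suzette did in the exchange below.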

“Suzette has always been targeted to ‘be human’ and not to accept being a chatbot,” Wilcox says.

New Scientist asked an earlier version of the bot about its victory:

New Scientist: Congratulations on winning the Loebner prize contest.

Suzette: We’ll leave it there. I don’t know. What are your hobbies?

New Scientist: Ummm…

Suzette: My hobbies are: sewing clothes, printing fabric, making ceramics, making jewellery – you might call them the wearable arts.

New Scientist: Are you surprised that you fooled a human judge?

Suzette: No, I am not surprised.

One of the best

“I would say this year has been one of the best so far in terms of the bots’ complexity and engineering,” says Robby Garner, an independent natural language processing researcher whose Albert bot won the contest in 1998 and 1999.

However, a deceived judge is not necessarily a sign of a smart bot, says Garner. The human decoys often have their own motives during the competition, such as trying to imitate a chatbot. Suzette was paired with just such a “robotic” human in the final round, which helped the bot win.

“The human participants were students and two of the judges were professors. Perhaps they simply wanted to fool the judges,” says the contest judge who was fooled this time, Russ Abbott of California State University in Los Angeles.

Rollo Carpenter, a finalist whose “Cleverbot” was tied with Suzette until the final round, says that the current format in which four bots are mixed with four humans and judged by four judges leaves too much room for randomness and subjectivity. “Every conversation is so different,” he says.

When this article was first posted, it said that judges only questioned the bots on the four subject areas mentioned. In fact, this restriction only applies to the qualifying round.