The possibility that advanced artificial intelligence (AI) might one day turn against its human creators has been repeatedly raised of late. Renowned physicist Stephen Hawking, for instance, surprised by the ability of his newly-upgraded speech synthesis system to anticipate what he was trying to say, has suggested that, in the future, AI could surpass human intelligence and ultimately bring about the end of humankind.

Hawking is not alone in worrying about superintelligent AI. A growing number of futurologists, philosophers and AI researchers have expressed concerns that artificial intelligence could leave humans outsmarted and outmanoeuvred. My view is that this is unlikely, as humans will always use an improved AI to improve themselves. A malevolent AI will have to outwit not only raw human brainpower but the combination of humans and whatever loyal AI-tech we are able to command – a combination that will best either on their own.

There are many examples already: Clive Thompson, in his book Smarter Than You Think, describes how in world championship chess, where AIs surpassed human grandmasters some time ago, the best chess players in the world are not humans or AIs working alone, but human-computer teams.

While I don’t believe that surpassing raw (unaided) human intelligence will be the trigger for an apocalypse, it does provide an interesting benchmark. Unfortunately, there is no agreement on how we would know when this point has been reached.

Beyond the Turing Test

An established benchmark for AI is the Turing Test, developed from a thought experiment described by the late, great mathematician and AI pioneer Alan Turing. Turing’s practical solution to the question: “Can a machine think?” was an imitation game, where the challenge is for a machine to converse on any topic sufficiently convincingly that a human cannot tell whether they are communicating with man or machine.

In 1991 the inventor Hugh Loebner instituted an annual competition, the Loebner Prize, to create an AI – or what we would now call a chatbot – that could pass Turing’s test. One of the judges at this year’s competition, Ian Hocking, reported in his blog that if the competition entrants represent our best shot at human-like intelligence, then success is still decades away; AI can only match the tip of the human intelligence iceberg.

I’m not overly impressed either by the University of Reading’s recent claim to have matched the conversational capability of a 13-year-old Ukrainian boy speaking English. Imitating child-like intelligence, and the linguistic capacity of a non-native speaker, falls well short of meeting the full Turing Test requirements.

Indeed, AI systems equipped with pattern-matching, rather than language understanding, algorithms have been able to superficially emulate human conversation for decades. For instance, in the 1960s the Eliza program was able to give a passable impression of a psychotherapist. Eliza showed that you can fool some people some of the time, but the fact that Loebner’s US$25,000 prize has never been won demonstrates that, performed correctly, the Turing test is a demanding measure of human-level intelligence.
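To make the distinction between pattern matching and language understanding concrete, here is a minimal sketch of an Eliza-style responder in Python. The rules, the reply templates and the respond function are all hypothetical, invented for illustration; they are not taken from the original Eliza script. The program looks for a surface pattern and slots the matched words into a canned reply, with no model of what the words mean.

```python
import re

# Hypothetical rules invented for this sketch; not the original Eliza script.
# Each rule pairs a surface pattern with a reply template.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]

# Used when no pattern matches, which keeps the conversation moving.
DEFAULT_REPLY = "Please go on."


def respond(utterance: str) -> str:
    """Return a canned reply built from the first matching pattern."""
    # Drop surrounding whitespace and trailing punctuation before matching.
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Echo the user's own words back inside the template.
            return template.format(*match.groups())
    return DEFAULT_REPLY


if __name__ == "__main__":
    print(respond("I feel anxious about machines."))
    print(respond("The weather is nice"))
```

A handful of rules like these can sustain a short exchange, which is roughly why such programs fool some people some of the time, yet any question that falls outside the patterns gets only the stock reply.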

Measuring artificial creativity

So if the Turing test cannot yet be passed, are there aspects of human intelligence that AI can recreate more convincingly? One recent proposal from Mark Riedl, at Georgia Tech in the USA, is to test AI’s capacity for creativity.

Riedl’s Lovelace 2.0 test requires the AI to create an artifact matching a plausible, but arbitrarily complex, set of design constraints. The constraints, set by an evaluator who also judges its success, should be chosen so that meeting them would be deemed as evidence of creative thinking in a person, and so by extension in an AI.

For example, the evaluator might ask the machine to (as per Riedl’s example): “create a story in which a boy falls in love with a girl, aliens abduct the boy and the girl saves the world with the help of a talking cat”. A crucial difference from the Turing test is that we are not testing the output of the machine against that of a person. Creativity, and by implication intelligence, is judged by experts. Riedl suggests we leave aside aesthetics, judging only whether the output meets the constraints. So, if the machine constructs a suitable science fiction tale in which Jack, Jill and Bagpuss repel ET and save Earth, then that’s a pass – even though the result is somewhat unoriginal as a work of children’s fiction.

I like the idea of testing creativity – there are talents that underlie human inventiveness that AI developers have not even begun to fathom. But the essence of Riedl’s test appears to be constraint satisfaction – problem solving. Challenging, perhaps, but not everyone’s idea of creativity. And by dropping the competitive element of Turing’s verbal tennis match, judging Lovelace 2.0 is left too much in the eye of the beholder.

Surprises to come

Ada Lovelace, the friend of Charles Babbage who had a hand in inventing the computer, and for whom Riedl named his test, famously said that “the Analytical Engine [Babbage’s computer] has no pretensions to originate anything. It can do whatever we know how to order it to perform”. This comment reflects a view, still widely held, that the behaviour of computer programs is entirely predictable and that only human intelligence is capable of doing things that are surprising and hence creative.

However, in the past 50 years we have learned that complex computer programs often show “emergent” properties unintended by their creators. So doing something unexpected in the context of Riedl’s test may not be enough to indicate original thinking. Human creativity shows other hallmarks that reflect our ability to discover relationships between ideas, where previously we had seen none. This may happen by translating images into words then back into images, ruminating over ideas for long periods where they are subject to subconscious processes, or shuffling thoughts from one person’s brain to another’s through conversation in a way that can inspire concepts to take on new forms. We are far from being able to do most of these things in AI.

For now, I believe AI will be most successful when working alongside humans, combining our ability to think imaginatively with the computer’s capacity for memory, precision and speed. Monitoring the progress of AI is worthwhile, but it will be a long time before these tests demonstrate anything other than how far machine intelligence still has to go before we will have met our match.

All things considered, I don’t think we need to hit the panic button just yet.