I laugh at your puny test (Image: Konstantin Inozemtsev/Getty)

Are you human? It just got a lot harder for websites to tell. An artificial intelligence system has cracked the most widely used test of whether a computer user is a bot. And according to its designers, it is more than a curiosity – it is a step on the way to human-like artificial intelligence.

Asking people to read distorted text is a common way for websites to determine whether or not a user is human. These CAPTCHAs – the name stands for Completely Automated Public Turing test to tell Computers and Humans Apart – can theoretically take any form, but the text version has proven effective at stopping spam and malicious software bots.

That’s because software has trouble deciphering text when letters are warped, overlapping or obfuscated by random lines, dots and colours. Humans, on the other hand, can recognise nearly endless variations of a letter after having only seen it a few times.

Vicarious, a start-up firm in Union City, California, announced this week that it has built an algorithm that can defeat any text-based CAPTCHA – a goal that has long eluded security researchers. It can pass Google’s reCAPTCHA, regarded as the most difficult, 90 per cent of the time, says Dileep George, co-founder of the firm. And it does even better against CAPTCHAs from Yahoo, PayPal and CAPTCHA.com.

Virtual neurons

George says the result isn’t as important as the methods, which he and CEO Scott Phoenix hope will lead to more human-like AI. Their program uses virtual neurons connected in a network modelled on the human brain. The network starts with nodes that detect input from the real world, such as whether a specific pixel in an image is black or white. Nodes in the next layer “fire” only if they detect a particular arrangement of pixels. Nodes in a third layer fire only if they recognise arrangements of pixels that form whole or partial shapes. This process repeats across between three and eight levels of nodes, with signals passing between as many as 8 million nodes. The network eventually settles on a best guess for which letters are contained in the image.
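Vicarious has not published its architecture, but the layered firing scheme described above can be sketched, very loosely, as a feedforward network in which each node outputs 1 only when its input pattern is strong enough. Everything here – the layer sizes, weights and thresholds – is a hypothetical illustration, not the firm's actual system:

```python
import numpy as np

rng = np.random.default_rng(0)

def fire(layer_input, weights, threshold=0.5):
    # A node "fires" (outputs 1) only if its weighted input exceeds a
    # threshold -- a crude stand-in for nodes that respond only to a
    # particular arrangement of pixels or shapes.
    return (weights @ layer_input > threshold).astype(float)

# Hypothetical sizes: a 10x10 black/white image feeding three node layers.
image = rng.integers(0, 2, size=100).astype(float)  # pixel detectors
w1 = rng.normal(size=(64, 100)) / 10  # pixel arrangements
w2 = rng.normal(size=(32, 64)) / 8    # whole or partial shapes
w3 = rng.normal(size=(26, 32)) / 6    # one output score per letter

h1 = fire(image, w1)
h2 = fire(h1, w2)
scores = w3 @ h2
best_guess = chr(ord("a") + int(np.argmax(scores)))
print(best_guess)  # the network's best guess at the letter
```

A real system would stack more such levels (the article says three to eight) and use learned, not random, connection strengths.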

The strength of each neural connection is determined by training the network with solved CAPTCHAs and videos of moving letters. This allows the system to develop its own representation of, say, the letter “a”, instead of cross-referencing against a database of instances of the letter. “We are solving it in a general way, similar to how humans solve it,” says George.
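The firm has not described its learning rule, but the general idea of adjusting connection strengths from labelled examples can be illustrated with a textbook perceptron-style update on toy data – a hypothetical sketch, not Vicarious's method:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "solved" examples: 20-pixel binary images with a known label,
# standing in for CAPTCHAs whose answers are already known.
X = rng.integers(0, 2, size=(50, 20)).astype(float)
y = (X[:, :10].sum(axis=1) > X[:, 10:].sum(axis=1)).astype(float)

w = np.zeros(20)  # connection strengths, shaped entirely by training
lr = 0.1
for _ in range(20):  # repeated passes over the solved examples
    for xi, yi in zip(X, y):
        pred = float(w @ xi > 0)
        w += lr * (yi - pred) * xi  # strengthen or weaken connections

accuracy = np.mean([float(w @ xi > 0) == yi for xi, yi in zip(X, y)])
print(f"training accuracy: {accuracy:.2f}")
```

The point of the sketch is that the weights start at zero and end up encoding the training set's structure – the system's own representation of the pattern, rather than a lookup against stored instances.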

Yann LeCun, an AI researcher at New York University, says neural network-based systems are widely deployed. He thinks it is hard to know whether Vicarious’s system represents a technological leap, because the company hasn’t revealed details about it.

If Vicarious’s claims pan out, it would be very significant, says Selmer Bringsjord, a computer scientist at Rensselaer Polytechnic Institute in Troy, New York. He says breaking text-based CAPTCHAs requires a high-level understanding of what letters are.

Rather than bringing a product to market, Vicarious will pit its tool against more Turing tests. The aim is for it to tell what is happening in complex scenes or to work out how to adapt a simple task so it works somewhere else, says Phoenix (see “More than words”, below). This kind of intelligence might enable things like robotic butlers, which can function in messy, human environments.

“Our focus is to solve the fundamental problems,” says Phoenix. “We’re working on artificial intelligence, and we happened to solve CAPTCHA along the way.”

More than words

A CAPTCHA doesn’t have to involve text – it can be any automated test that sorts humans from software. Vicarious in Union City, California, has a system that can read distorted text, but the firm has greater ambitions for artificial intelligence. Next up will be coping with optical illusions. Dileep George, one of the firm’s co-founders, thinks more training could help the algorithm with tasks such as recognising three-dimensional symbols in a two-dimensional image. After that, the challenge might be to identify an object in a clean or distorted image. Eventually, it would have to work out what is happening in an image, rather than just recognise the objects in a picture.

This article will appear in print under the headline “CAPTCHAs cracked”