Text-based CAPTCHAs remain one of the most visible and widely used website security mechanisms. Acting as online gatekeepers that distinguish humans from bots, these small solvable image fields have critical commercial applications in blocking automated spam and preventing e-transfer fraud, and they can also stop bots from spreading fraudulent information.

CAPTCHAs (an acronym for “Completely Automated Public Turing test to tell Computers and Humans Apart”) have done their job fairly well for almost 20 years. But they are obvious targets for researchers and hackers alike, and have become vulnerable. Recent techniques for beating CAPTCHAs have, however, been labour-intensive.

A new artificial intelligence technique from researchers at the U.K.’s Lancaster University and China’s Northwest University and Peking University can outsmart these little security veterans more effectively than ever, and with far less manually labelled training data.

The team’s text CAPTCHA solver uses an AI-based image recognition method built on generative adversarial networks (GANs). It significantly outperforms four state-of-the-art text CAPTCHA solvers on more than 33 different CAPTCHA schemes, including schemes used by industry leaders such as Microsoft, eBay and Wikipedia. Using only the computing power of a regular desktop machine, the solver cracked CAPTCHAs in as little as 0.05 seconds.

Researchers built the CAPTCHA solver in four steps:

CAPTCHA synthesis: Use a GAN-based generator to create synthetic images that resemble the target CAPTCHAs.

Preprocessing: Use the GAN-based Pix2Pix model to remove security features and standardize the font style.

Training the base solver: Train a convolutional neural network (CNN) solver on the synthetic dataset built in the steps above.

Fine-tuning the base solver: Use transfer learning to improve the solver's performance with a small set of manually labelled CAPTCHAs from the target website.
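The last two steps, training a base solver on plentiful synthetic data and then fine-tuning it on a handful of real samples, can be illustrated with a toy sketch in plain Python. This is not the researchers' implementation: a tiny two-layer network stands in for the CNN, four-number vectors stand in for CAPTCHA images, and all names (`make_data`, `train`, `freeze_features`) and settings are hypothetical. It shows one common form of transfer learning, in which the feature layer is frozen and only the output layer is retrained. The article does not say whether the team freezes layers in this way.

```python
import math
import random


def forward(x, feat_w, head):
    """Return (hidden activations, probability of class 1) for one input."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in feat_w]
    logit = head[-1] + sum(h * a for h, a in zip(head, hidden))  # head[-1] is the bias
    return hidden, 1.0 / (1.0 + math.exp(-logit))


def mean_loss(data, feat_w, head):
    """Mean binary cross-entropy over a labelled dataset."""
    total = 0.0
    for x, y in data:
        _, p = forward(x, feat_w, head)
        p = min(max(p, 1e-12), 1 - 1e-12)  # guard against log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)


def train(data, feat_w, head, lr, epochs, freeze_features):
    """Full-batch gradient descent; with freeze_features=True only the head moves."""
    feat_w = [row[:] for row in feat_w]  # work on copies, leave inputs untouched
    head = head[:]
    n = len(data)
    for _ in range(epochs):
        g_head = [0.0] * len(head)
        g_feat = [[0.0] * len(row) for row in feat_w]
        for x, y in data:
            hidden, p = forward(x, feat_w, head)
            err = (p - y) / n  # gradient of the mean loss w.r.t. the logit
            for j, a in enumerate(hidden):
                g_head[j] += err * a
                if not freeze_features:
                    back = err * head[j] * (1 - a * a)  # backprop through tanh
                    for i, xi in enumerate(x):
                        g_feat[j][i] += back * xi
            g_head[-1] += err  # bias gradient
        head = [h - lr * g for h, g in zip(head, g_head)]
        if not freeze_features:
            feat_w = [[w - lr * g for w, g in zip(row, g_row)]
                      for row, g_row in zip(feat_w, g_feat)]
    return feat_w, head


def make_data(n, offset, rng):
    """Toy stand-in for labelled glyphs: 4 'pixel' values and a binary label.

    `offset` shifts every pixel, mimicking a style gap between synthetic
    and real images (an assumption made purely for illustration).
    """
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1) for _ in range(4)]
        y = 1 if x[0] + x[1] > 0 else 0
        data.append(([xi + offset for xi in x], y))
    return data


rng = random.Random(0)
synthetic = make_data(300, 0.0, rng)  # large, cheap synthetic set
real = make_data(20, 0.4, rng)        # small hand-labelled "target site" set

init_feat = [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(3)]
init_head = [0.0] * 4  # 3 weights + 1 bias

# Base solver: train every layer on synthetic data.
base_feat, base_head = train(synthetic, init_feat, init_head,
                             lr=0.5, epochs=200, freeze_features=False)

# Fine-tuning: keep the learned features, adapt only the head on real data.
tuned_feat, tuned_head = train(real, base_feat, base_head,
                               lr=0.1, epochs=200, freeze_features=True)

print("loss on real set before fine-tuning:", mean_loss(real, base_feat, base_head))
print("loss on real set after fine-tuning: ", mean_loss(real, tuned_feat, tuned_head))
```

Freezing the feature layer means the 20 real samples only have to adjust four parameters, which is why a small labelled set can be enough in a setup like this one.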

Figure: Success rates of the CAPTCHA solver on CAPTCHA schemes used by different websites.

Ultimately, the white-hat hackers who carry out such attacks on text CAPTCHAs stress that they do so in the hope that their methods will yield insights that help security experts improve modern text CAPTCHA schemes.

The paper is titled “Yet Another Text Captcha Solver: A Generative Adversarial Network Based Approach”.