Black Hat Asia Google's and Facebook's CAPTCHA services have been defeated in research that successfully designed an automated system to solve the "are-you-human?" verification challenges.

CAPTCHAS are designed to make life easier for trusted users and painful for bots, by presenting challenges that are difficult for software to crack.

Suphannee Sivakorn ... Image: Darren Pauli, The Register

Low-risk users are presented with noCAPTCHA where no challenge is required other than to tick a box, while surfers with higher risk scores must solve the image and text reCATPCHA challenges common throughout web registration pages.

The attack system developed by Columbia University trio Suphannee Sivakorn, Jason Polakis, and Angelos D Keromytis was presented at Black Hat Asia in Singapore last week.

It had a 70.78 percent CAPTCHA-cracking success rate in tests against 2235 CAPTCHAs, with an average running time of 19.2 seconds.

It could also be applied to other CAPTCHA schemes, including those used by Facebook, the trio says, with a higher accuracy of 83.5 per cent.

Google has hardened its CAPTCHA services against the described attack, while Facebook did not respond to the researchers.

The team says in the I'm not a human: Breaking the Google reCAPTCHA [PDF] that its attack service is more accurate and faster at cracking all forms of CAPTCHA than existing cracking services that rely on underpaid humans to solve challenges.

"While the accuracy may increase over time as the human solvers become more accustomed to the image reCaptcha, it is evident that our system is a cost-effective alternative," they say.

"Nonetheless, our completely offline CAPTCHA-breaking system is comparable to a professional solving service in both accuracy and attack duration, with the added benefit of not incurring any cost on the attacker."

It could, at current market rates, make attackers US$110 a day per IP address.

The team found that security controls around Google's CAPTCHAs, designed to ensure correct answer tokens are not altered, can be bypassed.

They were also able to spin up a virtual host that assumed the necessary identifying criteria of a legitimate user, and could then generate clean cookies to solve CAPTCHAs out of the normal bounds.

This allows for the creation of black hat CAPTCHA breaking services, in a proof-of-concept in which the team created from one IP address a whopping 63,000 CAPTCHA-busting cookies in 24 hours without tripping security alarms.

"We are able to create over 63,000 cookies in a single day without triggering any mechanisms or getting blocked, and are only limited by the physical capabilities of the machine," the said.

"This indicates that there is no mechanism to prohibit the creation of cookies from a single IP address."

The peak rate consumed some 2,500 CAPTCHAs per hour, a value that dropped to 1,200 during peak hours and reached 59,000 per day on weekends.

This was thanks to variances in Google's anti-distributed denial-of-service attack defences, which flicked on and off thanks to the "massive" number of CAPTCHA requests the team sent.

Worth a thousand words

The team found flexibility in accepted answers within the image-solving process and built a system that could automate solving.

Further work found reCAPTCHAs pulled images from a small pool. Each pic had a different MD5 hash, but the researchers were able to use perpetual hashes to identify identical images, the most common being seen a dozen times.

An automated attack system pulled hint and tag metadata information plus a variety of deep learning systems to create information about candidate CAPTCHA images, to assist with solving. ®