
reCAPTCHA is one of the most innovative ideas that I’ve ever come across. What is fascinating is that reCAPTCHA solves a real human problem in such a simple and innovative way. To some degree, Luis von Ahn, the inventor of reCAPTCHA, followed the Plan-Do-Check-Act (PDCA) framework in Lean, without even knowing it.


What if we framed the reCAPTCHA solution in the framework of an A3 Report? What would that look like?

Background

A CAPTCHA is a program that can distinguish whether its user is a human or a machine. CAPTCHAs usually appear as colorful images with distorted text, typically found in registration forms or comment forms on blogs. They are used on millions of websites and have become the de facto standard for checking whether a user is human.

About 200 million CAPTCHAs are solved by humans every day. On average, it takes about 10 seconds to complete one. Based on von Ahn’s research, over 150,000 human hours are spent solving CAPTCHA puzzles every day.
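As a quick sanity check on the scale involved, the two inputs quoted above can be multiplied out directly. This is only a back-of-the-envelope sketch; the function name `daily_human_hours` is made up for illustration. Taken at face value, 200 million CAPTCHAs at 10 seconds each come to roughly 555,000 hours per day, which sits comfortably above the "over 150,000" floor.

```python
def daily_human_hours(captchas_per_day, seconds_per_captcha):
    """Convert a daily CAPTCHA count and a per-CAPTCHA time into hours per day."""
    total_seconds = captchas_per_day * seconds_per_captcha
    return total_seconds / 3600  # 3600 seconds in an hour


# Figures quoted in the text above.
hours = daily_human_hours(200_000_000, 10)
print(f"{hours:,.0f} human hours per day")
```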

Problem

Book archiving involves scanning pages of books into electronic form. However, a common defect in Optical Character Recognition (OCR) is jumbled words.

OCR technology is not perfect and, up until now, there was no effective solution. Hence, archiving the world’s books poses a challenge because some of the text is unreadable.

Root Cause Analysis

Why are there jumbled words in books that are scanned?

=> Because pages are sometimes bent.

=> Because printed words are too close to each other.

=> Because the ink on the printed words is too dark or smudged.

Countermeasure

This is where reCAPTCHA comes into the picture. reCAPTCHA reduces the incidence of unreadable OCR words by presenting the words that OCR cannot read as CAPTCHAs for humans to read and type in. Humans are incredibly good at making out words, even if they are jumbled or scrambled. But it doesn’t stop there. That same unreadable OCR word is then given to other users around the world to enter into a CAPTCHA form. Receiving multiple answers for the same word increases the confidence in what that word really is.
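The idea of collecting several answers for the same word and trusting the one most people agree on can be sketched in a few lines of Python. This is a minimal illustration, not reCAPTCHA's actual algorithm: the function name `resolve_word` and the thresholds `min_votes` and `min_agreement` are assumptions chosen for the example.

```python
from collections import Counter


def resolve_word(answers, min_votes=3, min_agreement=0.75):
    """Return the agreed transcription of one word, or None if confidence is too low.

    min_votes and min_agreement are illustrative thresholds, not real reCAPTCHA values.
    """
    # Normalize so that "Morning " and "morning" count as the same answer.
    cleaned = [a.strip().lower() for a in answers if a.strip()]
    if len(cleaned) < min_votes:
        return None  # not enough independent readings yet
    best, count = Counter(cleaned).most_common(1)[0]
    # Accept the most common answer only if enough readers agree on it.
    if count / len(cleaned) >= min_agreement:
        return best
    return None


# Three of four readers agree, so the word is accepted.
print(resolve_word(["morning", "Morning", "morning ", "moming"]))
# Only two answers so far, so the word goes back out to more users.
print(resolve_word(["morning", "moming"]))
```

A word that comes back as `None` would simply be shown to more users until enough of them agree.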

Before Kaizen and After Kaizen

Before, we were spending over 150,000 hours per day on CAPTCHAs, and there was no effective solution for correcting unreadable OCR words. After, those 150,000 hours are still spent filling in CAPTCHA forms, but now that human effort is being used to correct previously unreadable OCR words.

So, while there was no reduction in time, that time is now being used to solve the problem of unreadable OCR words, making the digitization of the world’s books far more feasible.

Conclusion

Well, Google found so much value in reCAPTCHA that they acquired it and are now using it as a key part of Google Books. Clearly, this is innovation that solves a real human problem by turning time the world was already spending on something non-value-add into something value-add.