A little background

Optical character recognition (OCR) usually involves two steps: first, detecting the bounding boxes that contain text; second, interpreting the contents of those bounding boxes as paragraphs, lines, and words.

To simplify this process, we cropped every image to its bounding box so the libraries can focus on recognition rather than detection. We also made sure that each image contains just a single word. Some sample images:

Sample from the dataset used

We iterated over our dataset and passed each image to both libraries. RNTextDetector’s comparison branch exposes the same API for both of them. If text is detected in an image, it’s returned along with the time the detection took.
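The loop described above can be sketched roughly as follows. This is a minimal illustration in Python, not the code used in RNTextDetector: `run_benchmark`, the `Recognizer` type, and the two stand-in recognizers are hypothetical names, and a real run would call the actual OCR libraries instead.

```python
import time
from typing import Callable, Dict, List, Optional

# Hypothetical recognizer signature: takes an image path and returns the
# recognized text, or None when no text is detected in the image.
Recognizer = Callable[[str], Optional[str]]


def run_benchmark(image_paths: List[str],
                  recognizers: Dict[str, Recognizer]) -> List[dict]:
    """Run every recognizer on every image, recording text and elapsed time."""
    results = []
    for path in image_paths:
        row = {"image": path}
        for name, recognize in recognizers.items():
            start = time.perf_counter()
            text = recognize(path)
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            row[name] = {"text": text, "elapsed_ms": elapsed_ms}
        results.append(row)
    return results


# Stand-ins for illustration only; they do not perform any OCR.
def fake_mlkit(path: str) -> Optional[str]:
    return "hello"


def fake_tesseract(path: str) -> Optional[str]:
    return None


demo = run_benchmark(
    ["word_001.png"],
    {"mlkit": fake_mlkit, "tesseract": fake_tesseract},
)
```

Putting both libraries behind one callable signature mirrors the idea of a shared API: the timing and bookkeeping code stays identical, so any difference in the results comes from the libraries themselves.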

Summary of results

ML Kit outperforms Tesseract OCR by a large margin, as the results summary shows. It does consume more CPU time (which might be an issue for real-time detection), but the significant margin in correctness overshadows this drawback.

Even in cases where ML Kit failed to recognize the text correctly, its correctness percentages were much higher than those of Tesseract OCR’s failures.
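The article does not say how the correctness percentage is computed. As one plausible way to score a near miss, the sketch below uses the similarity ratio from Python’s standard `difflib` module; this metric and the `correctness_percent` helper are assumptions for illustration, not the measure behind the numbers reported here.

```python
from difflib import SequenceMatcher


def correctness_percent(expected: str, recognized: str) -> float:
    """Similarity between expected and recognized text, as a percentage.

    Assumption: uses difflib's ratio (2 * matches / total characters),
    which is not necessarily the metric used in this comparison.
    """
    if not expected and not recognized:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * SequenceMatcher(None, expected, recognized).ratio()


perfect = correctness_percent("hello", "hello")
near_miss = correctness_percent("hello", "hallo")
```

A score like this makes it possible to tell a one-character slip apart from an output that shares nothing with the expected word, which is the distinction the paragraph above relies on.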

Diving a little deeper

To get a more complete picture of where each library performs better, we need to break the results down and analyze each library’s performance separately.
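The breakdown that follows sorts each image into one of five outcomes. A minimal sketch of that bucketing is shown here, assuming that "perfect" means an exact string match and that `None` means no text was detected; the `categorize` helper and its category names are hypothetical, and the case where only one library detects text without matching exactly is placed under "failure for both" as a further assumption.

```python
from typing import Optional


def categorize(expected: str,
               mlkit: Optional[str],
               tesseract: Optional[str]) -> str:
    """Assign one image to an outcome bucket based on both libraries' output."""
    # Neither library detected any text in the image.
    if mlkit is None and tesseract is None:
        return "misery_for_both"

    # Assumption: a perfect recognition is an exact match with the label.
    mlkit_ok = mlkit == expected
    tesseract_ok = tesseract == expected

    if mlkit_ok and tesseract_ok:
        return "success_for_both"
    if mlkit_ok:
        return "firebase_better"
    if tesseract_ok:
        return "tesseract_better"
    # At least one library returned text, but neither matched exactly.
    return "failure_for_both"


example = categorize("hello", "hello", "hallo")
```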

Success for both — Total 288

A total of 288 images were recognized perfectly by both libraries. For these matches, Tesseract OCR is much faster than Firebase’s ML Kit.

Failure for both — Total 207

In 207 images, both libraries detected text, but neither was able to recognize it perfectly. ML Kit did better in this case, as it had a higher correctness rate.

Misery for both — Total 15

In 15 images, neither library was able to detect any text.

Firebase was better — Total 316

In 316 images, Firebase’s ML Kit performed perfectly while Tesseract OCR was unable to recognize the text correctly.

Tesseract took the lead — Total 29

In 29 images, Tesseract OCR performed perfectly while Firebase’s ML Kit was unable to recognize the text correctly. Even though ML Kit fell short on these images, its correctness levels on a number of them were higher than those we witnessed in the previous comparison.
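The five totals above can be combined into an overall perfect-recognition rate for each library. The arithmetic below assumes the five categories are mutually exclusive and together cover the whole dataset; the article does not state the dataset size directly, so the total of 855 is derived from the category counts rather than quoted.

```python
# Category totals as reported in the breakdown above.
counts = {
    "success_for_both": 288,
    "failure_for_both": 207,
    "misery_for_both": 15,
    "firebase_better": 316,
    "tesseract_better": 29,
}

# Assumption: the categories do not overlap and cover every image.
total = sum(counts.values())

# Images each library recognized perfectly.
mlkit_perfect = counts["success_for_both"] + counts["firebase_better"]
tesseract_perfect = counts["success_for_both"] + counts["tesseract_better"]

mlkit_rate = 100.0 * mlkit_perfect / total
tesseract_rate = 100.0 * tesseract_perfect / total

print(f"Total images: {total}")
print(f"ML Kit perfect: {mlkit_perfect} ({mlkit_rate:.1f}%)")
print(f"Tesseract perfect: {tesseract_perfect} ({tesseract_rate:.1f}%)")
```

Under that assumption, ML Kit recognizes 604 of 855 images perfectly (about 70.6%) against 317 of 855 for Tesseract OCR (about 37.1%).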