Extracting Text from Images in Android

In this post, you will learn, through an example, how to use Google's text recognizer from Firebase ML Kit to extract text from pictures or documents. The example app displays a list of photos captured by the device camera and lets the user select one image. The text contained in the selected image is then extracted and displayed.

Setup

To set up the text extractor example project, first create a Firebase project and register your Android app with it by following the Firebase setup instructions.

Then download the google-services.json file created in the previous step and save it under the app directory of your Android project.

Then add the Google services plugin to the project-level Gradle build file.

classpath 'com.google.gms:google-services:4.2.0'

Then apply the Google services plugin to your app by adding the line below at the bottom of the module-level Gradle build file.

apply plugin: 'com.google.gms.google-services'

Then add the Firebase core and Firebase ML Kit dependencies to the module-level Gradle build file.

implementation 'com.google.firebase:firebase-core:17.0.0'
implementation 'com.google.firebase:firebase-ml-vision:21.0.0'

Firebase Text Recognizer API

To extract text from an image, first create a FirebaseVisionImage instance by calling the fromBitmap() method of the FirebaseVisionImage class, passing the bitmap of the image from which text needs to be extracted. The image in the bitmap must be upright. You can make an image upright by rotating it as required using its EXIF data; see the Making Image Upright section for details.

FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);

Then create a FirebaseVisionTextRecognizer object. To use the on-device text recognizer, call the getOnDeviceTextRecognizer() method on the FirebaseVision object.

FirebaseVisionTextRecognizer textRecognizer = FirebaseVision.getInstance()
        .getOnDeviceTextRecognizer();

To use the cloud-based text recognizer instead, call the getCloudTextRecognizer() method on the FirebaseVision object.

FirebaseVisionTextRecognizer textRecognizer = FirebaseVision.getInstance()
        .getCloudTextRecognizer();

Then call the processImage() method on the FirebaseVisionTextRecognizer object, passing the FirebaseVisionImage object to it. The method runs asynchronously and returns a Task object. Add success and failure listeners to that Task to receive the extracted text or to handle errors.

Task<FirebaseVisionText> result = textRecognizer.processImage(image)
        .addOnSuccessListener(new OnSuccessListener<FirebaseVisionText>() {
            @Override
            public void onSuccess(FirebaseVisionText firebaseVisionText) {
                // process success
            }
        })
        .addOnFailureListener(new OnFailureListener() {
            @Override
            public void onFailure(@NonNull Exception e) {
                // process failure
            }
        });

From the FirebaseVisionText object, the complete text of the image can be obtained by calling the getText() method, and the individual blocks of text can be obtained by calling the getTextBlocks() method.

// complete text from the image
String fullText = firebaseVisionText.getText();

// read the text block by block
for (FirebaseVisionText.TextBlock tb : firebaseVisionText.getTextBlocks()) {
    // text in a block
    String blockText = tb.getText();

    // read the block line by line
    for (FirebaseVisionText.Line l : tb.getLines()) {
        String lineText = l.getText();
    }
}
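
Once getText() has returned the full string, it usually needs some light cleanup before it is displayed or stored. The following is a minimal sketch in plain Java, assuming the recognized text arrives as a single String whose lines are separated by line breaks. The class and method names are hypothetical and are not part of the Firebase API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Firebase API.
final class RecognizedTextCleaner {

    private RecognizedTextCleaner() {
    }

    // Splits recognized text into trimmed lines and drops the empty ones.
    static List<String> nonEmptyLines(String text) {
        List<String> lines = new ArrayList<>();
        if (text == null) {
            return lines;
        }
        // "\\R" matches any line break sequence (\n, \r\n, \r, ...)
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                lines.add(trimmed);
            }
        }
        return lines;
    }
}
```

In the success listener, such a helper could be called as RecognizedTextCleaner.nonEmptyLines(firebaseVisionText.getText()).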

There is a separate API for extracting text from documents. To use it, create a FirebaseVisionDocumentTextRecognizer object as shown below.

FirebaseVisionDocumentTextRecognizer textRecognizerD = FirebaseVision.getInstance()
        .getCloudDocumentTextRecognizer();

textRecognizerD.processImage(firebaseVisionImg)
        .addOnSuccessListener(new OnSuccessListener<FirebaseVisionDocumentText>() {
            @Override
            public void onSuccess(FirebaseVisionDocumentText result) {
                // capture text
            }
        })
        .addOnFailureListener(new OnFailureListener() {
            @Override
            public void onFailure(@NonNull Exception e) {
                // do something
            }
        });

Making Image Upright

Before an image is passed to the Firebase ML text recognizer API, you need to make sure that it is upright. The EXIF data of an image records its orientation, and depending on that orientation the image needs to be rotated to make it upright. You can check the code in the getUprightImage method of the ImageTextReader class provided below.
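
The mapping from EXIF orientation to rotation angle can be sketched in plain Java, independent of any Android class. This is only an illustration: the class name is hypothetical, and the constants mirror the standard EXIF orientation tag values (1 for normal, 3, 6 and 8 for the three rotated cases). Mirrored orientations (2, 4, 5 and 7) are treated as upright here, which is a simplification.

```java
// Hypothetical helper for illustration; constants mirror EXIF orientation tag values.
final class ExifRotation {

    static final int ORIENTATION_NORMAL = 1;
    static final int ORIENTATION_ROTATE_180 = 3;
    static final int ORIENTATION_ROTATE_90 = 6;
    static final int ORIENTATION_ROTATE_270 = 8;

    private ExifRotation() {
    }

    // Returns the clockwise rotation, in degrees, needed to make the image upright.
    static int rotationDegrees(int exifOrientation) {
        switch (exifOrientation) {
            case ORIENTATION_ROTATE_90:
                return 90;
            case ORIENTATION_ROTATE_180:
                return 180;
            case ORIENTATION_ROTATE_270:
                return 270;
            default:
                // normal, mirrored, or unknown values: leave the image as it is
                return 0;
        }
    }
}
```

The returned angle is what would be passed to Matrix.postRotate() before the rotated bitmap is created.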

Example

The image text extractor example contains two screens. The main screen displays a list of camera images in a recycler view.

Clicking an image in the recycler view triggers text extraction for that image, and the extracted text is displayed on the next screen.

Activity

The main activity first collects the file paths of the camera images into a list and passes it to the recycler view adapter. The adapter loads images as required and displays them on the screen.

Since the images are stored in the external storage directory /DCIM/Camera, the user needs to grant the android.permission.READ_EXTERNAL_STORAGE permission before the example app can access camera images.

import android.content.Context;
import android.content.pm.PackageManager;
import android.database.Cursor;
import android.os.Bundle;
import android.os.Environment;
import android.provider.MediaStore;

import androidx.appcompat.app.AppCompatActivity;
import androidx.core.app.ActivityCompat;
import androidx.core.content.ContextCompat;
import androidx.recyclerview.widget.DividerItemDecoration;
import androidx.recyclerview.widget.LinearLayoutManager;
import androidx.recyclerview.widget.RecyclerView;

import java.util.ArrayList;
import java.util.List;

public class MainActivity extends AppCompatActivity {

    private static final String READ_EXTERNAL_STORAGE =
            "android.permission.READ_EXTERNAL_STORAGE";
    private static final int STORAGE_PERMISSION_REQUEST = 2;

    private RecyclerView recyclerView;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        recyclerView = findViewById(R.id.cam_images);
        LinearLayoutManager recyclerLayoutManager = new LinearLayoutManager(this);
        recyclerView.setLayoutManager(recyclerLayoutManager);

        DividerItemDecoration dividerItemDecoration =
                new DividerItemDecoration(this, recyclerLayoutManager.getOrientation());
        dividerItemDecoration.setDrawable(getResources()
                .getDrawable(android.R.drawable.divider_horizontal_dark, null));
        recyclerView.addItemDecoration(dividerItemDecoration);

        // camera photos can only be read once READ_EXTERNAL_STORAGE is granted
        if (ContextCompat.checkSelfPermission(this, READ_EXTERNAL_STORAGE)
                != PackageManager.PERMISSION_GRANTED) {
            ActivityCompat.requestPermissions(this,
                    new String[]{READ_EXTERNAL_STORAGE}, STORAGE_PERMISSION_REQUEST);
        } else {
            displayPhotos();
        }
    }

    private void displayPhotos() {
        ImagesRecyclerViewAdapter recyclerViewAdapter =
                new ImagesRecyclerViewAdapter(getCameraImages(this), this);
        recyclerView.setAdapter(recyclerViewAdapter);
    }

    @Override
    public void onRequestPermissionsResult(int requestCode,
            String[] permissions, int[] grantResults) {
        super.onRequestPermissionsResult(requestCode, permissions, grantResults);
        if (requestCode == STORAGE_PERMISSION_REQUEST
                && grantResults.length > 0
                && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
            displayPhotos();
        }
    }

    // get the list of camera photo paths
    public static List<String> getCameraImages(Context context) {
        final String CAMERA_IMAGES = Environment
                .getExternalStorageDirectory().toString() + "/DCIM/Camera";
        final String CAMERA_IMAGES_ID = String.valueOf(
                CAMERA_IMAGES.toLowerCase().hashCode());

        final String[] projection = {MediaStore.Images.Media.DATA};
        final String selection = MediaStore.Images.Media.BUCKET_ID + " = ?";
        final String[] selectionArgs = {CAMERA_IMAGES_ID};

        final Cursor cursor = context.getContentResolver()
                .query(MediaStore.Images.Media.EXTERNAL_CONTENT_URI,
                        projection, selection, selectionArgs, null);

        ArrayList<String> result = new ArrayList<String>();
        if (cursor == null) {
            // the query failed, so there is nothing to show
            return result;
        }
        if (cursor.moveToFirst()) {
            final int dataColumn =
                    cursor.getColumnIndexOrThrow(MediaStore.Images.Media.DATA);
            do {
                final String data = cursor.getString(dataColumn);
                result.add(data);
            } while (cursor.moveToNext());
        }
        cursor.close();
        return result;
    }
}
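
The query in getCameraImages() filters on BUCKET_ID, which the activity derives from the hash code of the lower-cased camera directory path. That derivation can be sketched in plain Java as follows. The class name and the sample path are hypothetical (on a device the path comes from Environment.getExternalStorageDirectory()), and the sketch uses Locale.ROOT for lower-casing, a small deliberate difference from the activity code that avoids locale-dependent results.

```java
import java.util.Locale;

// Hypothetical helper for illustration; mirrors how the activity builds its bucket id.
final class BucketIds {

    private BucketIds() {
    }

    // Bucket id = hash code of the lower-cased directory path, as a decimal string.
    static String bucketIdFor(String directoryPath) {
        return String.valueOf(directoryPath.toLowerCase(Locale.ROOT).hashCode());
    }
}
```

For example, BucketIds.bucketIdFor("/storage/emulated/0/DCIM/Camera") would produce the value used as the selection argument, assuming that is where the device stores camera photos.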

Activity Layout

The activity layout defines the recycler view.

<?xml version="1.0" encoding="utf-8"?>
<androidx.constraintlayout.widget.ConstraintLayout
    xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    tools:context=".MainActivity">

    <androidx.recyclerview.widget.RecyclerView
        android:id="@+id/cam_images"
        android:scrollbars="vertical"
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintLeft_toLeftOf="parent"
        app:layout_constraintRight_toRightOf="parent"
        app:layout_constraintTop_toTopOf="parent" />

</androidx.constraintlayout.widget.ConstraintLayout>

RecyclerView Adapter

The recycler view adapter provides resized camera images for the recycler view to display on the screen.

import android.content.Context;
import android.content.Intent;
import android.graphics.Bitmap;
import android.view.LayoutInflater;
import android.view.View;
import android.view.ViewGroup;
import android.widget.ImageView;

import androidx.recyclerview.widget.RecyclerView;

import java.util.List;

public class ImagesRecyclerViewAdapter extends
        RecyclerView.Adapter<ImagesRecyclerViewAdapter.ViewHolder> {

    private List<String> imageLst;
    private Context context;

    public ImagesRecyclerViewAdapter(List<String> list, Context ctx) {
        imageLst = list;
        context = ctx;
    }

    @Override
    public int getItemCount() {
        return imageLst.size();
    }

    @Override
    public ImagesRecyclerViewAdapter.ViewHolder onCreateViewHolder(ViewGroup parent,
            int viewType) {
        View view = LayoutInflater.from(parent.getContext())
                .inflate(R.layout.photo_item, parent, false);
        return new ImagesRecyclerViewAdapter.ViewHolder(view);
    }

    @Override
    public void onBindViewHolder(ImagesRecyclerViewAdapter.ViewHolder holder,
            int position) {
        final String imagePath = imageLst.get(position);

        // show an upright copy of the photo, resized to the device width
        Bitmap bitmap = ImageTextReader.getUprightImage(imagePath);
        holder.image.setImageBitmap(ImageTextReader.resizeImage(bitmap, context));

        holder.image.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View view) {
                // process selected image
                startReadImageTextActivity(imagePath);
            }
        });
    }

    public class ViewHolder extends RecyclerView.ViewHolder {

        public ImageView image;

        public ViewHolder(View view) {
            super(view);
            image = (ImageView) view.findViewById(R.id.camera_image);
        }
    }

    public void startReadImageTextActivity(String imagePath) {
        Intent intent = new Intent(context, ImageTextActivity.class);
        intent.putExtra("IMAGE_PATH", imagePath);
        context.startActivity(intent);
    }
}

RecyclerView Item Layout

<?xml version="1.0" encoding="utf-8"?>
<androidx.constraintlayout.widget.ConstraintLayout
    xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    android:layout_marginTop="8dp"
    android:layout_width="wrap_content"
    android:layout_height="wrap_content">

    <ImageView
        android:id="@+id/camera_image"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        app:layout_constraintLeft_toLeftOf="parent"
        app:layout_constraintRight_toRightOf="parent"
        app:layout_constraintTop_toTopOf="parent" />

</androidx.constraintlayout.widget.ConstraintLayout>

Text Recognizer Utility Class

The utility class contains code to resize images, make images upright, and extract text from images using the Firebase ML text recognizer API.

import android.app.Activity;
import android.content.Context;
import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import android.graphics.Matrix;
import android.media.ExifInterface;
import android.util.DisplayMetrics;
import android.widget.TextView;

import androidx.annotation.NonNull;

import com.google.android.gms.tasks.OnFailureListener;
import com.google.android.gms.tasks.OnSuccessListener;
import com.google.android.gms.tasks.Task;
import com.google.firebase.ml.vision.FirebaseVision;
import com.google.firebase.ml.vision.common.FirebaseVisionImage;
import com.google.firebase.ml.vision.text.FirebaseVisionText;
import com.google.firebase.ml.vision.text.FirebaseVisionTextRecognizer;

import java.io.IOException;

public class ImageTextReader {

    // get the orientation of an image from its EXIF data
    // and rotate the image as required to make it upright
    public static Bitmap getUprightImage(String imgUrl) {
        int orientation = ExifInterface.ORIENTATION_NORMAL;
        try {
            ExifInterface exif = new ExifInterface(imgUrl);
            orientation = exif.getAttributeInt(ExifInterface.TAG_ORIENTATION,
                    ExifInterface.ORIENTATION_NORMAL);
        } catch (IOException e) {
            // EXIF data could not be read; treat the image as already upright
        }

        int rotation = 0;
        switch (orientation) {
            case ExifInterface.ORIENTATION_ROTATE_180:
                rotation = 180;
                break;
            case ExifInterface.ORIENTATION_ROTATE_90:
                rotation = 90;
                break;
            case ExifInterface.ORIENTATION_ROTATE_270:
                rotation = 270;
                break;
        }

        Matrix matrix = new Matrix();
        matrix.postRotate(rotation);

        Bitmap bitmap = BitmapFactory.decodeFile(imgUrl);

        // rotate image
        bitmap = Bitmap.createBitmap(bitmap, 0, 0,
                bitmap.getWidth(), bitmap.getHeight(), matrix, true);
        return bitmap;
    }

    // resize image to device width
    public static Bitmap resizeImage(Bitmap bitmap, Context ctx) {
        DisplayMetrics displayMetrics = new DisplayMetrics();
        ((Activity) ctx).getWindowManager()
                .getDefaultDisplay()
                .getMetrics(displayMetrics);

        int width = displayMetrics.widthPixels;
        return Bitmap.createScaledBitmap(bitmap, width, width, true);
    }

    // read text from an image using the Firebase ML Kit on-device API
    public static void readTextFromImage(Bitmap bitmap, final TextView textView) {
        FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);

        FirebaseVisionTextRecognizer textRecognizer = FirebaseVision.getInstance()
                .getOnDeviceTextRecognizer();

        Task<FirebaseVisionText> result = textRecognizer.processImage(image)
                .addOnSuccessListener(new OnSuccessListener<FirebaseVisionText>() {
                    @Override
                    public void onSuccess(FirebaseVisionText firebaseVisionText) {
                        textView.setText(firebaseVisionText.getText());
                    }
                })
                .addOnFailureListener(new OnFailureListener() {
                    @Override
                    public void onFailure(@NonNull Exception e) {
                        // recognition failed; the text view is left unchanged
                    }
                });
    }
}
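
Note that resizeImage() scales every image to a square whose side is the device width, which stretches or squashes photos that are not square. If preserving the aspect ratio is preferred, the target height can be computed first and then passed to Bitmap.createScaledBitmap() in place of the second width argument. The following is a minimal sketch of that calculation in plain Java; the class and method names are hypothetical.

```java
// Hypothetical helper for illustration; computes a height that keeps the aspect ratio.
final class ScaledSize {

    private ScaledSize() {
    }

    // Returns the height matching targetWidth for an image of srcWidth x srcHeight.
    static int heightForWidth(int srcWidth, int srcHeight, int targetWidth) {
        if (srcWidth <= 0 || srcHeight <= 0 || targetWidth <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        long scaled = Math.round((double) srcHeight * targetWidth / srcWidth);
        // a bitmap needs at least one pixel of height
        return (int) Math.max(1, scaled);
    }
}
```

With such a helper, the last line of resizeImage() could become Bitmap.createScaledBitmap(bitmap, width, ScaledSize.heightForWidth(bitmap.getWidth(), bitmap.getHeight(), width), true).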

Image Text Reader Activity

This activity is started when an image in the recycler view is clicked. The path of the selected image is passed to it in the intent.

import android.graphics.Bitmap;
import android.os.Bundle;
import android.widget.TextView;

import androidx.appcompat.app.AppCompatActivity;

public class ImageTextActivity extends AppCompatActivity {

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_image_text);

        TextView textView = findViewById(R.id.image_text);

        // get the selected image path from the intent
        String imagePath = getIntent().getStringExtra("IMAGE_PATH");

        // get upright image
        Bitmap bitmap = ImageTextReader.getUprightImage(imagePath);

        // extract text from the image using the Firebase ML Kit on-device API
        ImageTextReader.readTextFromImage(bitmap, textView);
    }
}

Image Text Reader Activity Layout