Artificial Intelligence (AI) and Machine Learning are among the hottest topics in business today. Google, a leader in this field, has developed a set of tools for developers that allow them to create new user experiences with endless possibilities. Today we will explore one of these tools and see how to integrate it into our Android apps: the Google Cloud Vision API.

Google Cloud Vision API

The Cloud Vision API is an interesting API that allows developers to analyze the content and contextual data of images, leveraging a pre-trained machine learning model in continuous evolution, all behind a simple REST API. Thanks to this API, we can obtain contextual information about a given image and classify images into categories and sub-categories, reaching a deep level of detail.

Let’s take this image for example:

The Vision API is pretty amazing: not only can it understand the main subject of this photo (an animal), it can also recognize what kind it is (a dog) along with its breed (a beagle). Moreover, you can obtain additional data about the grass and the mountains in the background.

Let’s see all the features of Google Cloud Vision API:

Label Detection: Detect a set of categories within an image (the example above)

Explicit Content Detection: Detect whether there is explicit (adult/violent) content within an image

Logo Detection: Detect popular logos within an image

Landmark Detection: Detect natural and man-made structures within an image

Optical Character Recognition: Detect and extract text within an image; it even detects the language of the text

Face Detection: Detect multiple faces within an image, along with attributes like the emotional state or whether the person is wearing headwear

Image Attributes: Detect general attributes of the image, such as dominant colors and appropriate crop hints

In our example, we will use two of these features: Label Detection and Optical Character Recognition.

Now that we have covered the basics of the Vision API, let's see how to integrate it into an Android app. We will create a sample project that lets the user select an image from the gallery and receive information about it through these APIs. Let's get into action!

Enabling Google Cloud API

To use this API, we must first enable it in the Google Cloud Developer Console. Let's see how:

1. Create a project in the Google Cloud Console, or use an existing one
2. Enable Billing for the project. If it is the first time you access the Google Cloud Console, you can start a free trial. It might ask you to add your credit card information to validate your identity, but it will not charge you
3. Enable the Google Cloud Vision API using this link
4. Go to the Credentials section from the side menu on the left
5. Click on the credentials drop-down menu and select OAuth Client ID:

Select application type as Android

Type an appropriate name (ex: Android Cloud Vision API)

Enter your SHA1 fingerprint (if you don’t have it, or don’t know how to generate it, run this command in the terminal: keytool -exportcert -keystore path-of-your-keystore -list -v )

Enter the package name of your app: it must be the same as the one declared in the build.gradle of your app, under the key applicationId . In my case, it is com.lpirro.cloudvision

Done! We are ready to start. Let’s start coding.

Cloud Vision API in Action

Create a new project in Android Studio (keep in mind: the package name must match the one in the Google Cloud Developer Console project). Once done, open the build.gradle and add the Vision API dependencies.
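As a hedged sketch, the dependencies for the Java client of the Vision API looked roughly like this at the time of writing (the version numbers are illustrative; check the latest releases before copying them):

```groovy
dependencies {
    // Google API Client for Android (handles the OAuth plumbing)
    compile 'com.google.api-client:google-api-client-android:1.23.0'
    // Generated Java client for the Cloud Vision API
    compile 'com.google.apis:google-api-services-vision:v1-rev16-1.22.0'
}
```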


Now, open the AndroidManifest.xml and add the permissions necessary to make network calls and to access account information (required for the OAuth request).
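As a sketch, the manifest entries in question would look like this (READ_EXTERNAL_STORAGE is included on the assumption that we also pick an image from the gallery):

```xml
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.GET_ACCOUNTS" />
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
```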

We can now create the activity that will allow us to select an image from the gallery and call the Cloud Vision services to obtain information about the image.

The layout file of our activity is very simple: we have one ImageView to display the image selected from the gallery, two TextViews to show the results, and one Button to pick an image from the gallery.

The layout file of our activity:
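A minimal sketch of such a layout (the view IDs are hypothetical) could be:

```xml
<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:orientation="vertical">

    <ImageView
        android:id="@+id/imageView"
        android:layout_width="match_parent"
        android:layout_height="0dp"
        android:layout_weight="1" />

    <TextView
        android:id="@+id/labelsTextView"
        android:layout_width="match_parent"
        android:layout_height="wrap_content" />

    <TextView
        android:id="@+id/ocrTextView"
        android:layout_width="match_parent"
        android:layout_height="wrap_content" />

    <Button
        android:id="@+id/pickImageButton"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="Pick image from gallery" />
</LinearLayout>
```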

With our layout defined, let's focus on the Activity. In this example we will use the Google API Client library for Java and, because we are making an OAuth request, we first need to obtain the auth token from Google. So, let's define a class that will allow us to get this token.

Note: for simplicity we will use an AsyncTask for network operations, but if you use this API in a real project, use a library like Retrofit, perhaps along with RxJava ;)

Now we have all the necessary information to call the Cloud Vision API from our app and receive the results.

With the method setType() we define the type of feature we want to use; in our case we use LABEL_DETECTION and TEXT_DETECTION . The image passed to the API is in Base64 format. Once the results are retrieved, they are passed to the getDetectedText() method, which formats the string and filters the information; then we can finally display it on the UI.
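To make the request shape concrete, here is a hedged plain-Java sketch of the JSON body that the REST endpoint (images:annotate) expects, with the image bytes Base64-encoded as described above. LABEL_DETECTION and TEXT_DETECTION are the real feature type names; the class name, helper method, and fake image bytes are illustrative, not the original post's code.

```java
import java.util.Base64;

public class VisionRequestDemo {

    // Build the JSON body the images:annotate REST endpoint expects.
    // The image content must be the Base64 encoding of the raw bytes.
    static String buildRequestBody(byte[] imageBytes) {
        String content = Base64.getEncoder().encodeToString(imageBytes);
        return "{\n"
            + "  \"requests\": [{\n"
            + "    \"image\": { \"content\": \"" + content + "\" },\n"
            + "    \"features\": [\n"
            + "      { \"type\": \"LABEL_DETECTION\", \"maxResults\": 10 },\n"
            + "      { \"type\": \"TEXT_DETECTION\",  \"maxResults\": 10 }\n"
            + "    ]\n"
            + "  }]\n"
            + "}";
    }

    public static void main(String[] args) {
        // Fake bytes standing in for a real image file.
        byte[] fakePng = { (byte) 0x89, 'P', 'N', 'G' };
        System.out.println(buildRequestBody(fakePng));
    }
}
```

The generated client library builds this same structure for you through its request objects; seeing the raw JSON makes it clear what setType() ultimately controls.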

Both AI and Machine Learning are quickly becoming the foundation of the digital transformations of the coming years; with the introduction of the Cloud Vision API, Google now offers a premier tool to integrate both into everyday workflows, both as users and as developers. The same technology we have seen above is already part of Google's main products like Photos, where it is used as an aid to organize and classify our collection of memories. With the general availability of these tools, thousands of products will be able to integrate this amazing technology to make our everyday experience with technology more seamless and fun.

You can also check the full example on my GitHub repo below.