Neat, isn’t it?

Plus, you can also use this to detect human emotions such as happiness, sadness, and anger, which is even better.

Enough talk, show me the code!!!!!

Alright, alright, but let’s quickly go through some basics first.

ML Kit has 5 APIs in total so far: Text Recognition, Face Detection, Barcode Scanning, Image Labelling (the one we are going to use), and Landmark Recognition.

While each of these APIs has its own use case, we will be using the Image Labelling API in this article.

This API gives us a list of the entities recognized in the image: people, things, places, activities, and so on.

This API comes in two variants. The first is the On Device API, which runs the labelling on the device itself. It is free and covers 400+ different labels.

Second is the Cloud API which runs on Google Cloud and covers 10,000+ different labels.

It’s paid but the first 1,000 requests per month are free.

In this article we will cover the On Device API since it won’t involve setting up your billing for Google Cloud.

But the sample code I will provide contains the code for both of them.

So without any further ado, let’s get started.

Set up Firebase in your project and add the vision dependency

This one is simple: set up Firebase in your project. You can find a good tutorial here.

In order to use this API, you also need to add the relevant dependencies.
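As a rough sketch, the dependency block could look like the following in a Gradle Kotlin DSL build file. The artifact names are the ones ML Kit for Firebase published for vision and on-device image labelling; the version numbers are assumptions matching the API names used in this article, so check them against the current Firebase release notes.

```kotlin
// app/build.gradle.kts (a sketch; the version numbers are assumptions)
dependencies {
    // Core ML Kit vision APIs (FirebaseVision, FirebaseVisionImage, ...)
    implementation("com.google.firebase:firebase-ml-vision:16.0.0")
    // On-device image labelling model
    implementation("com.google.firebase:firebase-ml-vision-image-label-model:15.0.0")
}
```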

Implement Camera functionality in your app

The Vision API needs an image to extract the data from, so either create an app that lets you upload images from the gallery or create an app that uses the Camera APIs to click a picture and use it instead.

I found this library to be pretty handy and easier to use than the framework Camera APIs, so this is what I ended up using.
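If you would rather stay with the framework, a minimal sketch of the capture step using the stock camera intent might look like this. Note that this is not the library mentioned above. `CaptureActivity`, `REQUEST_IMAGE_CAPTURE` and `onBitmapReady` are hypothetical names, and this intent only hands back a small thumbnail Bitmap, which is enough for experimenting.

```kotlin
import android.app.Activity
import android.content.ActivityNotFoundException
import android.content.Intent
import android.graphics.Bitmap
import android.os.Bundle
import android.provider.MediaStore

// Hypothetical request code used to match the camera result.
private const val REQUEST_IMAGE_CAPTURE = 1

class CaptureActivity : Activity() {

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        // Launch the camera only on first creation, not after a rotation.
        if (savedInstanceState == null) {
            try {
                startActivityForResult(
                    Intent(MediaStore.ACTION_IMAGE_CAPTURE),
                    REQUEST_IMAGE_CAPTURE
                )
            } catch (e: ActivityNotFoundException) {
                // No camera app is available on this device.
                finish()
            }
        }
    }

    override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
        super.onActivityResult(requestCode, resultCode, data)
        if (requestCode == REQUEST_IMAGE_CAPTURE && resultCode == RESULT_OK) {
            // The camera app returns a thumbnail under the "data" extra.
            val thumbnail = data?.extras?.get("data") as? Bitmap
            if (thumbnail != null) onBitmapReady(thumbnail)
        }
    }

    // Hypothetical hook: pass the Bitmap on to the labelling step.
    private fun onBitmapReady(bitmap: Bitmap) {
    }
}
```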

Use the Bitmap to make the API call

If you used the library above, it directly provides you with a Bitmap of the captured image which you can use to make an API call.
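A minimal sketch of that call, assuming the 16.x `firebase-ml-vision` API that the class names in this article belong to, could look like this. `labelImage` and `TAG` are hypothetical names, and `bitmap` is whatever Bitmap you captured in the previous step.

```kotlin
import android.graphics.Bitmap
import android.util.Log
import com.google.firebase.ml.vision.FirebaseVision
import com.google.firebase.ml.vision.common.FirebaseVisionImage

// Hypothetical log tag.
private const val TAG = "ImageLabelling"

// Hypothetical helper that labels a captured Bitmap on the device.
fun labelImage(bitmap: Bitmap) {
    // Wrap the Bitmap in the image type the detector expects.
    val image = FirebaseVisionImage.fromBitmap(bitmap)

    // On-device label detector with default options.
    val detector = FirebaseVision.getInstance().visionLabelDetector

    // detectInImage() is asynchronous and reports back through callbacks.
    detector.detectInImage(image)
        .addOnSuccessListener { labels ->
            Log.d(TAG, "Detected ${labels.size} labels")
        }
        .addOnFailureListener { e ->
            Log.e(TAG, "Image labelling failed", e)
        }
}
```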

In the above code snippet, we first create a FirebaseVisionImage from the bitmap.

Then we create an instance of the FirebaseVisionLabelDetector which goes through the FirebaseVisionImage and finds the appropriate FirebaseVisionLabels (objects) it notices in the supplied image.

Lastly, we pass the image to the detectInImage() method and let the detector label it.

We have a success and a failure callback, which receive a list of labels and an exception respectively.

You can go ahead and loop through the list to get the Name, Confidence and Entity ID for every label that was detected in the image.
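To make that loop concrete, here is a small sketch that runs off-device. `DetectedLabel` is a hypothetical stand-in that mirrors the three values a FirebaseVisionLabel exposes, `describeLabels` and its `minConfidence` cutoff are illustrative names of my own, and the entity IDs are placeholders rather than real ones.

```kotlin
// Hypothetical stand-in for FirebaseVisionLabel so the sketch runs without Android.
data class DetectedLabel(val label: String, val confidence: Float, val entityId: String)

// Keeps labels at or above minConfidence, sorts them from most to least
// confident, and formats each one as "name (entityId): NN%".
fun describeLabels(labels: List<DetectedLabel>, minConfidence: Float = 0.5f): List<String> =
    labels
        .filter { it.confidence >= minConfidence }
        .sortedByDescending { it.confidence }
        .map { "${it.label} (${it.entityId}): ${(it.confidence * 100).toInt()}%" }

fun main() {
    // Placeholder data; the entity IDs are made up for illustration.
    val labels = listOf(
        DetectedLabel("Fun", 0.5f, "/m/example1"),
        DetectedLabel("Dog", 0.25f, "/m/example2"),
        DetectedLabel("Smile", 0.75f, "/m/example3")
    )
    describeLabels(labels).forEach(::println)
}
```

In the app itself you would read `label.label`, `label.confidence` and `label.entityId` from each item in the success callback in the same way.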

As mentioned earlier, this API can also be used to detect human emotions in the image, as the screenshots below show: