Step 2: Installing the required dependencies

Before we go ahead and write any code, it’s important that we first have all the required dependencies installed on our development machine.

For the current example, these are the dependencies we’ll need:

tensorflow==1.13.1

pathlib

opencv-python

We can use pip to install all three with a single command:

pip install tensorflow==1.13.1 pathlib opencv-python

Note: While not mandatory, it’s strongly suggested that you always use a virtual environment for testing out new projects. You can read more about how to set up and activate one in the link here:

Step 3: Loading the model and studying its input and output

Now that we have the model and our development environment ready, the next step is to create a Python snippet that allows us to load this model and perform inference with it.

Here’s what such a snippet might look like:
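A minimal sketch of such a snippet is below. The filename detect.tflite is an assumption — point it at whichever model file you downloaded — and the summarize() helper is mine, added only to show what the detail dicts contain; a guard is included so the script degrades gracefully if the model file isn't present yet.

```python
import os

MODEL_PATH = "detect.tflite"  # hypothetical filename; use the path of the model you downloaded

def summarize(details):
    """Reduce TFLite tensor-detail dicts to just (name, shape, dtype)."""
    return [(d["name"], tuple(d["shape"]), d["dtype"]) for d in details]

def load_interpreter(model_path):
    # The TensorFlow import is kept local so summarize() stays usable on its own.
    import tensorflow as tf
    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    return interpreter

# Guard so the script degrades gracefully when the model file isn't present yet.
if os.path.exists(MODEL_PATH):
    interpreter = load_interpreter(MODEL_PATH)
    print(interpreter.get_input_details())   # input tensor details
    print(interpreter.get_output_details())  # output tensor details
```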

Here, we first load the downloaded model and then get the input and output tensors from the loaded model.

Up next, we print the input and output tensor details we obtained earlier.

If you run the code, this is what the output might look like:

[{'name': 'normalized_input_image_tensor', 'index': 596, 'shape': array([ 1, 512, 512, 3], dtype=int32), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.0078125, 128)}]

[{'name': 'TFLite_Detection_PostProcess', 'index': 512, 'shape': array([], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0)}, {'name': 'TFLite_Detection_PostProcess:1', 'index': 513, 'shape': array([], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0)}, {'name': 'TFLite_Detection_PostProcess:2', 'index': 514, 'shape': array([], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0)}, {'name': 'TFLite_Detection_PostProcess:3', 'index': 515, 'shape': array([], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0)}]

Unlike the output of the classification model, this one is a lot to process! But let’s go through it nevertheless.

Looking at the input tensor, we see that it has a single entry, which takes an RGB image of size 512 x 512 as its input at index 596.

Conversely, the output tensor has 4 entries, which means that unlike the previous case where we got a single-element array, here we have 4 elements in the output array.

Two of these 4 elements hold the bounding boxes for the objects we need, along with their confidence scores. Typically, the array of rectangles comes first, followed by the array of scores for those rectangles.

After some trial and error, I found that the element named TFLite_Detection_PostProcess contains my rectangles, and the element named TFLite_Detection_PostProcess:2 contains the scores of those rectangles.

The element named TFLite_Detection_PostProcess:3 contains the total number of detected items and the element TFLite_Detection_PostProcess:1 contains the classes for the detected elements.
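Putting the four elements together, unpacking them might look roughly like this. The arrays below are mock values shaped like real model output (TFLite detection boxes come back as [ymin, xmin, ymax, xmax] in normalized coordinates) — the actual numbers your model produces will differ.

```python
import numpy as np

# Mock values standing in for the four TFLite_Detection_PostProcess outputs.
boxes = np.array([[[0.10, 0.10, 0.50, 0.50],        # TFLite_Detection_PostProcess
                   [0.20, 0.30, 0.60, 0.90]]])
classes = np.array([[0.0, 0.0]])                    # TFLite_Detection_PostProcess:1
scores = np.array([[0.92, 0.31]])                   # TFLite_Detection_PostProcess:2
count = np.array([2.0])                             # TFLite_Detection_PostProcess:3

for i in range(int(count[0])):
    ymin, xmin, ymax, xmax = boxes[0][i]
    print("class=%d score=%.2f box=(%.2f, %.2f, %.2f, %.2f)"
          % (classes[0][i], scores[0][i], ymin, xmin, ymax, xmax))
```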

In our current case, printing the output of TFLite_Detection_PostProcess:1 should print an array of zeros.

However, if you have trained an object detection model to detect multiple objects, this element might have different outputs for you.

For example, here’s a sample output of this node for an object detection model trained to detect 2 objects:

[[0. 0. 0. 1. 1. 0. 0. 0. 0. 1. 1. 1. 0. 0. 0. 0. 0. 0. 0. 0.]]

Here, if a particular index has the value 0, the box and score at that index belong to the first object; if it has the value 1, they belong to the second object.
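This index matching is easy to do with numpy. A small sketch using the sample class array above:

```python
import numpy as np

# Class values from the sample output above: 0 = first object, 1 = second object.
classes = np.array([0., 0., 0., 1., 1., 0., 0., 0., 0., 1.,
                    1., 1., 0., 0., 0., 0., 0., 0., 0., 0.])

first_object = np.where(classes == 0)[0]   # indices whose boxes/scores belong to the first object
second_object = np.where(classes == 1)[0]  # ...and to the second object

print(len(first_object), len(second_object))  # prints 15 5
```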

If you have trained your model to detect even more objects, higher class values will appear here as well.

In the next step, we’ll pass an image to the model and see these outputs in action.