Image Processing

For the first part of this article, I will cover how to set up a Raspberry Pi 3 with the camera module and how to use it to follow a target in real time.

Hardware required:

Raspberry Pi 3

Raspberry Pi Camera (I have version 1.3)

Configuration

First, we need to install OpenCV in order to apply the image-processing algorithms we need. I followed this guide, which was very helpful for the setup:

http://www.pyimagesearch.com/2016/04/18/install-guide-raspberry-pi-3-raspbian-jessie-opencv-3/

Once you have successfully installed OpenCV and connected the camera to your Raspberry Pi, we can discuss our problem.

Problem

I want to detect whether a marker appears in the camera feed.

Solution

There are many possible solutions, varying in required research and complexity, such as the following:

Motion detection with camera moving

Feature matching in a camera feed

Template matching

As this was part of a semester project, it didn’t require much complexity, so I decided to go with template matching; feature matching would have required more computation from the Pi.

Here I will discuss how to detect a marker (it could be any image) in the camera feed of your Raspberry Pi. To achieve this, I researched template matching, which detects whether one image appears as part of another image.

This method works correctly on a single image, but we need to apply it to a video stream.

On a video stream we have to consider the following:

Each frame of the video stream will be treated as an image.

The marker can appear far from the camera and look smaller, or very close and look bigger, while our reference marker image stays the same size.

Running template matching on every frame is a lot of processing; done excessively, it slows down the video stream.

First, I followed this tutorial, which helped me get the Pi to recognize a marker on a static image.

Once I had this working, I started to analyze each point we have to consider:

If I just send each frame to the existing template matching, it will only find the marker if the frame contains it at exactly the same size as the input marker.

If I scale each frame to different sizes and evaluate the resized frames against the marker, the chances of finding the marker in the video stream increase, but that means resizing each frame of the video X times.

If I’m not careful with the number of scaling options I consider, it can affect the video stream, since each frame is delayed until the calculation is done.

After carefully considering each point, I opted for scaling down the marker instead of the image frame. Scaling down the marker allows me to precompute the different resized images needed for template matching, removing the need to resize any image while evaluating each video frame.

Once the marker is scaled down, I send each video frame through multi-scale template matching against the resized markers and check for a match. Although this removes the cost of resizing images at run time, we still need to be careful not to consider too many scales, since each one requires its own template-matching pass. This is why I chose three main scale options (0.4, 0.7, 1) for this case, obtaining very good results with a threshold of 0.7.

Now we will see how to translate this into code. First, we need to set up our environment variables:
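The setup might look something like the following sketch. The file name marker.jpg and the 0.5 resize factor are placeholders of mine; the threshold and scale options are the values from the article:

```python
import cv2

marker_path = "marker.jpg"   # placeholder path to the marker image
marker = cv2.imread(marker_path, cv2.IMREAD_GRAYSCALE)
if marker is not None:
    # the resize step the article mentions: shrink an oversized marker
    marker = cv2.resize(marker, (0, 0), fx=0.5, fy=0.5)
threshold = 0.7              # minimum match score to accept a detection
scales = (0.4, 0.7, 1.0)     # scale options chosen in the article
storedMarkers = {}           # scale factor -> resized marker image
```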

We can observe that marker_path is the path to the image that I will use as the marker. The resize on line 2 could be avoided, but I needed it because the marker I picked was too big.

We also define the threshold, which helps us determine whether or not the marker is found, and lastly declare storedMarkers, where we will store the resized marker images keyed by their appropriate scaling option.

The storingMarkerScale method takes care of resizing our selected marker image and storing the results in our storedMarkers dictionary.

multiScaleTemplateMatching receives each video frame, converts it to grayscale, then loops over our selected scale options and runs template matching with each resized marker. If one of the resized markers matches, it exits the loop and draws a rectangle over the match.

Finally, the startCamera method takes care of setting the camera options and starting to grab and show each camera frame. Note that we apply our multiScaleTemplateMatching method to every frame.
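A sketch of startCamera using the picamera library; the resolution, framerate, window name, and quit key are choices of mine, not necessarily the article's:

```python
import cv2
from picamera import PiCamera
from picamera.array import PiRGBArray

def startCamera(storedMarkers, threshold=0.7):
    """Configure the Pi camera, then grab frames continuously and
    run the multi-scale matcher on each one."""
    camera = PiCamera()
    camera.resolution = (640, 480)
    camera.framerate = 30
    rawCapture = PiRGBArray(camera, size=(640, 480))
    for capture in camera.capture_continuous(rawCapture, format="bgr",
                                             use_video_port=True):
        frame = capture.array
        # draws a rectangle on the frame if any marker scale matches
        multiScaleTemplateMatching(frame, storedMarkers, threshold)
        cv2.imshow("Frame", frame)
        rawCapture.truncate(0)  # reset the buffer for the next frame
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
```

The two closing lines the article mentions would then be along the lines of storedMarkers = storingMarkerScale(marker) followed by startCamera(storedMarkers). This requires camera hardware, so it is shown untested.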

The last two lines start everything: they precompute the resized marker images and then start the camera video stream. And with that, you can detect a marker in the video stream of your Raspberry Pi.