Let's say, for some reason, you process a huge load of images that contain some sort of logo (or even products), and you want to know which images are related to specific companies. This is a very common problem for AdTech and data mining companies.

Let’s imagine you have user-generated content in your application and you want to know what’s inside the images. If you have ads running (bingo!), you can target a user based on the images they liked, commented on, published, and so on. This can help you build an ad profile for your users, and because the models are hosted behind an API, you don’t have to ship an application that’s over 200MB in size.

Another use case: if you have a search or recommendation feature in your application, you’ll want to get the most out of the content, whether it’s plain text or images. In that case, you need an idea of what each image contains—hence, classification.

Here’s what the final result will look like:

Final result

Transfer learning

Training a convolutional neural network from scratch is very expensive: the more layers that stack up, the higher the number of convolutions and parameters there are to optimize. The computer must be able to store several gigabytes of data and perform the calculations efficiently. That’s why hardware manufacturers are stepping up their efforts to provide high-performance graphics processing units (GPUs) that can train a deep neural network quickly by parallelizing the calculations.

Transfer learning allows you to do deep learning without having to spend a month doing compute-intensive calculations. The core idea is to reuse the knowledge a neural network acquired while solving one problem in order to solve another, more or less similar problem. This is a transfer of knowledge, hence the name.

In addition to speeding up training, transfer learning helps you avoid overfitting. When the input image collection is small, training the neural network from scratch (that is to say, from a random initialization) is strongly discouraged: the number of parameters to learn ends up being much higher than the number of images, so the risk of overfitting is huge.

Transfer learning is widely used in practice and simple to implement. It requires a neural network that’s already been trained, preferably on a problem close to the one we want to solve. Nowadays, we can easily retrieve one from the internet, especially through deep learning libraries such as Keras or PyTorch, which we’ll be using in this tutorial.

We can exploit the pre-trained neural network in a number of ways, depending on the size of the input dataset and its similarity to what’s used in pre-training.
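The most common way to exploit a pre-trained network is to freeze its convolutional base and train only a new classification head. Here’s a minimal sketch of that pattern in Keras; the choice of MobileNetV2 as the base, the 224×224 input size, and the hyperparameters are assumptions for illustration, not the tutorial’s final configuration:

```python
import tensorflow as tf
from tensorflow import keras

# Load a convolutional base pre-trained on ImageNet, dropping its
# original 1000-class classification layer (include_top=False).
base = keras.applications.MobileNetV2(
    input_shape=(224, 224, 3),
    include_top=False,
    weights="imagenet",
)

# Freeze the base: its filters already encode generic visual
# features, so only the new head below will be trained.
base.trainable = False

# Add a small classification head for our 32 logo classes.
model = keras.Sequential([
    base,
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(32, activation="softmax"),
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```

If the new dataset is large enough, a common follow-up is to unfreeze the top few layers of the base and fine-tune them with a much lower learning rate.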

Returning to our logo recognition project: The dataset

We’ll be using the open source FlickrLogos-32 dataset, built from Flickr images, which covers 32 distinct company logos.

There are 320 logo images for training, 960 logo images for validation, 3960 images for testing, and 3000 no-logo images.

There’s no particular pre-processing needed for the images—they’re already pretty clean. Some would argue that this alone isn’t enough to build a robust image classifier, and I agree. That’s why we’ll be using the transfer learning techniques discussed above.
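To feed these splits into a model, one convenient approach is to sort the images into one folder per brand and let Keras infer the labels from the directory structure. The helper below is a sketch; the `load_split` name, the directory layout, and the 224×224 image size are assumptions for illustration:

```python
import tensorflow as tf

# Assumed on-disk layout, one sub-directory per brand, e.g.:
#   flickrlogos/train/adidas/*.jpg
#   flickrlogos/train/nike/*.jpg
#   ...
def load_split(directory, image_size=(224, 224), batch_size=32):
    """Build a batched tf.data pipeline from a class-per-folder directory."""
    return tf.keras.utils.image_dataset_from_directory(
        directory,
        label_mode="int",        # integer class ids, for a sparse loss
        image_size=image_size,   # resize each image on the fly
        batch_size=batch_size,
    )
```

You’d then call `load_split` once per split (train, validation, test) and pass the resulting datasets straight to `model.fit`.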