August 07, 2018

There is an important technique in deep learning called Transfer Learning. It allows one to fine-tune a pretrained network on new data and repurpose it, which greatly reduces both computation cost and data requirements. Consider a business problem where you need a machine to recognize SpongeBob SquarePants characters. How could you quickly tackle this pressing issue?

VGG19

VGG19 is a very deep convolutional network for image recognition. It is a 19-layer network that was trained for the 2014 ImageNet Challenge by the Visual Geometry Group at the University of Oxford. This network can classify 1000 different objects, so it's a perfect baseline for our task. The following is a diagram of VGG19's architecture:

Let's see how well it does at recognizing an image of SpongeBob SquarePants.

The network guessed the image to be a hook, with 17.6656% confidence. As expected, it fails, because it wasn't trained on SpongeBob SquarePants.
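For reference, here is a minimal sketch of that sanity check using Keras (the framework the references below cover). The image filename is a placeholder for whatever screenshot you test with:

```python
import numpy as np
from keras.applications.vgg19 import VGG19, preprocess_input, decode_predictions
from keras.preprocessing import image

model = VGG19(weights='imagenet')  # full network, with the ImageNet classifier head

# Load and preprocess the test screenshot; VGG19 expects 224x224 RGB input.
img = image.load_img('spongebob.jpg', target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))

preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])  # top-3 ImageNet guesses with confidences
```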

Methodology

To apply transfer learning, we need to perform the following steps:

Build a dataset

Load a pretrained network

Freeze a number of layers

Add new layers

Train the model

You can find my code and data for this on my GitHub.

Build a dataset

I built a very small dataset for this task. It consisted of 3 characters (classes):

SpongeBob SquarePants

Sandy Cheeks

Patrick Star

For each class, I had 31 images: 27 for training, 3 for validation, and 1 for the final prediction.
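With so few images, the simplest way to feed them to Keras is one subdirectory per class. A minimal sketch, assuming a data/train and data/validation layout (the directory names here are mine, not necessarily what the repository uses):

```python
from keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values to [0, 1]; expects one subdirectory per class, e.g.
# data/train/spongebob/, data/train/sandy/, data/train/patrick/.
train_datagen = ImageDataGenerator(rescale=1.0 / 255)
val_datagen = ImageDataGenerator(rescale=1.0 / 255)

train_generator = train_datagen.flow_from_directory(
    'data/train', target_size=(224, 224), batch_size=16, class_mode='categorical')
validation_generator = val_datagen.flow_from_directory(
    'data/validation', target_size=(224, 224), batch_size=16, class_mode='categorical')
```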

Load a pretrained network

I used the VGG19 network for this task. The objects this network was trained to recognize are real objects, not cartoons, and I wanted to see whether it could generalize to screenshots of cartoon characters.
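In Keras, the pretrained network can be loaded without its ImageNet classifier head, which we will replace later. A minimal sketch, assuming 224x224 inputs:

```python
from keras.applications.vgg19 import VGG19

# include_top=False drops the 1000-class ImageNet head so we can attach our own.
base_model = VGG19(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
```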

Freeze a number of layers

Freezing a layer means that its weights will not be updated during training. We may freeze many of the layers if we don't have sufficient data, or if those layers are already well trained on a useful set of features. It took a few tries to find the right number of layers to freeze (the mechanics are shown in the sketch after this list). My combinations were:

freeze all layers

freeze the first layer

freeze the first five layers

freeze the first ten layers

freeze all but the last ten layers

freeze all but the last five layers

freeze all but the last layer

My best result was freezing the first five layers.
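In Keras, freezing is just a matter of toggling each layer's trainable flag before compiling. A minimal sketch for that winning combination, reusing base_model from above (exactly which layers count as the "first five", e.g. whether the input layer is included, is my assumption):

```python
# Freeze the first five layers so they keep their ImageNet weights;
# everything after them stays trainable and will be fine-tuned.
for layer in base_model.layers[:5]:
    layer.trainable = False
for layer in base_model.layers[5:]:
    layer.trainable = True
```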

Add new layers

The last few layers of the VGG19 network are used to classify images into the 1000 ImageNet classes. We need to rip these out and add our own. Mine were as follows:

fully connected layer outputting 1024 neurons

50% dropout layer

fully connected layer outputting 1024 neurons

fully connected layer outputting 3 neurons, one per class
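Putting that together with the Keras functional API might look like the following. The Flatten layer, the ReLU activations, and the final softmax are my assumptions; the list above only specifies the layer sizes:

```python
from keras.layers import Dense, Dropout, Flatten
from keras.models import Model

x = Flatten()(base_model.output)                  # flatten the conv feature maps
x = Dense(1024, activation='relu')(x)             # fully connected, 1024 neurons
x = Dropout(0.5)(x)                               # 50% dropout
x = Dense(1024, activation='relu')(x)             # fully connected, 1024 neurons
predictions = Dense(3, activation='softmax')(x)   # one neuron per character

model = Model(inputs=base_model.input, outputs=predictions)
```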

Train the model

Training the model was very easy. Because most of the work was already done, I was able to train all of the freezing combinations above in under 30 minutes. I used a batch size of 16 and trained until accuracy stopped increasing.
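"Train until accuracy stops increasing" maps naturally onto early stopping. A sketch of the training step, assuming the generators and model defined above (the optimizer, epoch cap, and patience are my choices; the post only fixes the batch size at 16):

```python
from keras.callbacks import EarlyStopping

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Stop once validation accuracy has not improved for a few epochs.
early_stop = EarlyStopping(monitor='val_acc', patience=3)

model.fit_generator(train_generator,
                    validation_data=validation_generator,
                    epochs=50,
                    callbacks=[early_stop])
```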

Predictions

The following are my results for each freezing combination; a sketch of the prediction step itself follows the list.

Freeze all layers

Freeze the first layer

Freeze the first five layers

Freeze the first ten layers

Freeze all but the last ten layers

Freeze all but the last five layers

Freeze all but the last layer
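For completeness, here is how a single held-out image can be run through the fine-tuned model. The filename is a placeholder, and the class-name lookup relies on the train_generator defined earlier:

```python
import numpy as np
from keras.preprocessing import image

# Recover class names in index order from the training generator.
class_names = sorted(train_generator.class_indices,
                     key=train_generator.class_indices.get)

# 'test.jpg' stands in for one of the held-out prediction images.
img = image.load_img('test.jpg', target_size=(224, 224))
x = np.expand_dims(image.img_to_array(img) / 255.0, axis=0)  # match the 1/255 rescale

probs = model.predict(x)[0]
print(class_names[int(np.argmax(probs))], probs.max())
```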

References

Transfer Learning using Keras

Keras Tutorial: Fine-tuning using pre-trained models

Keras

deeplearning.ai