In our previous tutorial, we learned how to use models which were trained for Image Classification on the ILSVRC data. In this tutorial, we will discuss how to use those models as a Feature Extractor and train a new model for a different classification task.

Suppose you want to make a household robot which can cook food. The first step would be to identify different vegetables. We will try to build a model which identifies Tomato, Watermelon, and Pumpkin for this tutorial. In the previous tutorial, we saw the pre-trained models were not able to identify them because these categories were not learned by the models.

Transfer Learning vs Fine-tuning

The pre-trained models were trained on very large-scale image classification problems. The convolutional layers act as a feature extractor, and the fully connected layers act as the classifier.

Since these models are very large and have seen a huge number of images, they tend to learn very good, discriminative features. We can either use the convolutional layers merely as a feature extractor or we can tweak the already trained convolutional layers to suit our problem at hand. The former approach is known as Transfer Learning and the latter as Fine-tuning.

As a rule of thumb, when we have a small training set and our problem is similar to the task for which the pre-trained models were trained, we can use transfer learning. If we have enough data, we can try and tweak the convolutional layers so that they learn more robust features relevant to our problem. You can get a detailed overview of Fine-tuning and transfer learning here. We will discuss Transfer Learning in Keras in this post.

ImageNet Jargon

ImageNet is based on WordNet, which groups words into sets of synonyms (synsets). Each synset is assigned a “wnid” (WordNet ID). Note that a general category can contain many subcategories, each belonging to a different synset. For example, Working Dog (synset = n02103406), Guide Dog (synset = n02109150), and Police Dog (synset = n02106854) are three different synsets.

The wnids of the three object classes we are considering are given below:

n07734017 -> Tomato

n07735510 -> Pumpkin

n07756951 -> Watermelon
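The mapping above can be kept in a small lookup table so that later steps (downloading, naming folders) share one source of truth. The following is a minimal sketch; the names `WNID_TO_LABEL` and `is_valid_wnid` are hypothetical helpers, not part of any library, and the check assumes that a noun wnid is the letter `n` followed by eight digits, as in the IDs listed here.

```python
import re

# wnid -> class label, copied from the list above.
WNID_TO_LABEL = {
    "n07734017": "tomato",
    "n07735510": "pumpkin",
    "n07756951": "watermelon",
}

# Assumption: a noun wnid is the letter "n" followed by exactly eight digits.
_WNID_PATTERN = re.compile(r"n\d{8}")


def is_valid_wnid(wnid):
    """Return True if the string looks like a noun synset ID."""
    return _WNID_PATTERN.fullmatch(wnid) is not None


for wnid, label in WNID_TO_LABEL.items():
    status = "valid" if is_valid_wnid(wnid) else "invalid"
    print(wnid, "->", label, "({})".format(status))
```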

Download and prepare Data

For downloading ImageNet images by wnid, there is a nice code repository written by Tzuta Lin which is available on GitHub. You can visit the GitHub page and follow the instructions there to download the images for any wnid.

However, if you are just starting out and do not want to download full-size images, you can use another Python library available through pip: imagenetscraper. It is easy to use and also provides resizing options. Installation and usage instructions are provided below. Note that it works with Python 3 only.


# Install imagenetscraper
pip3 install imagenetscraper

# Download the images for the three wnids and keep them in separate folders.
imagenetscraper n07756951 watermelon
imagenetscraper n07734017 tomato
imagenetscraper n07735510 pumpkin

I found that the data is very noisy, i.e. there is a lot of clutter, the objects are occluded etc. So, I shortlisted around 250 images for each class. We need to create two directories namely “train” and “validation” so that we can use the Keras functions for loading images in batches.
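One way to create the “train” and “validation” directories is to copy each class folder into the two splits with a fixed random seed. This is only a sketch under stated assumptions: `split_class_folder` is a hypothetical helper, the 80:20 default mirrors the split used later in this post, and the source folder is assumed to contain only the shortlisted image files.

```python
import random
import shutil
from pathlib import Path


def split_class_folder(src_dir, dst_root, class_name, train_fraction=0.8, seed=0):
    """Copy files from src_dir into dst_root/train/<class> and dst_root/validation/<class>.

    Returns the number of files copied to (train, validation).
    """
    files = sorted(p for p in Path(src_dir).iterdir() if p.is_file())
    random.Random(seed).shuffle(files)  # fixed seed keeps the split reproducible
    n_train = int(len(files) * train_fraction)
    splits = {"train": files[:n_train], "validation": files[n_train:]}
    for split_name, split_files in splits.items():
        out_dir = Path(dst_root) / split_name / class_name
        out_dir.mkdir(parents=True, exist_ok=True)
        for f in split_files:
            shutil.copy2(f, out_dir / f.name)
    return len(splits["train"]), len(splits["validation"])


# Example call (the folder names are assumptions based on the commands above):
# split_class_folder("tomato", "clean-dataset", "tomato")
```

Copying rather than moving leaves the downloaded folders untouched, so the split can be redone with a different seed or ratio.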

Load the pre-trained model

from tensorflow.keras.applications import vgg16

vgg_conv = vgg16.VGG16(weights='imagenet',
                       include_top=False,
                       input_shape=(224, 224, 3))



In the above code, we load the VGG16 model along with the ImageNet weights, as in our previous tutorial. There is, however, one change: include_top=False. With this setting, the fully connected layers at the top of the network, which act as the classifier, are not loaded; we load only the convolutional layers. Note that the output of the last layer has a shape of 7 x 7 x 512.
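The 7 x 7 x 512 shape follows from the VGG16 architecture: its 3x3 convolutions preserve the spatial size, and each of its five convolutional blocks ends with a 2x2 max-pooling layer that halves it. The short sketch below reproduces that arithmetic without loading the model; `vgg16_conv_output_shape` is a hypothetical helper written for this illustration.

```python
def vgg16_conv_output_shape(height=224, width=224):
    """Output shape of VGG16 without its top, derived from its pooling layers."""
    n_pool_layers = 5   # one 2x2, stride-2 max-pool at the end of each conv block
    n_channels = 512    # number of filters in the last conv block
    for _ in range(n_pool_layers):
        height //= 2    # the padded 3x3 convs keep the size; pooling halves it
        width //= 2
    return height, width, n_channels


shape = vgg16_conv_output_shape()
print(shape)                            # (7, 7, 512)
print(shape[0] * shape[1] * shape[2])   # 25088 values per image once flattened
```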

Extract Features

The data is split in an 80:20 ratio and kept in separate train and validation folders. Each folder should contain three subfolders, one for each class. You can change the directory paths to match your system.

# each folder contains three subfolders, one per class
train_dir = './clean-dataset/train'
validation_dir = './clean-dataset/validation'

# the images are split 80:20 between training and validation
nTrain = 600
nVal = 150
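The values of nTrain and nVal are hard-coded above. If your shortlist differs, the counts can be derived from the folders instead. A minimal sketch follows, where `count_images_per_class` and `IMAGE_EXTENSIONS` are hypothetical names and the extension list is an assumption about your files.

```python
from pathlib import Path

# Assumption: adjust to the file types present in your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images_per_class(split_dir):
    """Return {class_name: image_count} for the class subfolders of split_dir."""
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts


# Example (paths as defined above):
# nTrain = sum(count_images_per_class('./clean-dataset/train').values())
# nVal = sum(count_images_per_class('./clean-dataset/validation').values())
```

Printing the per-class counts is also a quick check that the three classes are roughly balanced.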

We will use the ImageDataGenerator class to load the images and its flow_from_directory method to generate batches of images and labels.

import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# load the images and rescale pixel values to [0, 1]
datagen = ImageDataGenerator(rescale=1./255)

# define the batch size
batch_size = 20

# the defined shape is equal to the network output tensor shape
train_features = np.zeros(shape=(nTrain, 7, 7, 512))
train_labels = np.zeros(shape=(nTrain, 3))

# generate batches of train images and labels
train_generator = datagen.flow_from_directory(
    train_dir,
    target_size=(224, 224),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=True)

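Then we use the model.predict() function to pass the images through the network, which gives a 7 x 7 x 512 tensor for each image. We reshape each tensor into a vector. The validation features and labels (validation_features_vec and validation_labels) are computed in the same way from a validation_generator. Because the evaluation code later compares the predictions with validation_generator.classes, the validation generator should be created with shuffle=False so that the order of the features matches the order of the files.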

# iterate through the batches of train images and labels
for i, (inputs_batch, labels_batch) in enumerate(train_generator):
    if i * batch_size >= nTrain:
        break
    # pass the images through the network
    features_batch = vgg_conv.predict(inputs_batch)
    train_features[i * batch_size : (i + 1) * batch_size] = features_batch
    train_labels[i * batch_size : (i + 1) * batch_size] = labels_batch

# reshape train_features into vector
train_features_vec = np.reshape(train_features, (nTrain, 7 * 7 * 512))
print("Train features: {}".format(train_features_vec.shape))
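The loop above relies on the generator yielding batches indefinitely and stops once nTrain samples have been seen. The same bookkeeping can be sketched on plain Python lists, tracking how many samples are still needed and trimming a batch that would overshoot. The names `collect_samples` and `fake_generator` are hypothetical stand-ins used only for this illustration; no Keras objects are involved.

```python
from itertools import cycle


def collect_samples(batch_iterator, n_samples):
    """Collect exactly n_samples items from an endless iterator of batches."""
    collected = []
    for batch in batch_iterator:
        remaining = n_samples - len(collected)
        if remaining <= 0:
            break
        collected.extend(batch[:remaining])  # trim a batch that would overshoot
    return collected


# Stand-in for a generator that loops forever over batches of at most 4 items.
fake_generator = cycle([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]])
print(collect_samples(fake_generator, 10))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```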

Output:

Train features: (600, 25088)

Create your own model

We will create a simple feedforward network with a softmax output layer having 3 classes.

from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras import Sequential, optimizers

model = Sequential()
model.add(Dense(512, activation='relu', input_dim=7 * 7 * 512))
model.add(Dropout(0.5))
model.add(Dense(3, activation='softmax'))
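To get a feel for the size of this classifier, the number of trainable parameters can be worked out by hand: a fully connected layer has one weight per input-unit pair plus one bias per unit, and Dropout adds none. The helper `dense_params` below is a hypothetical name used for this arithmetic sketch.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


n_features = 7 * 7 * 512                 # length of each flattened feature vector
hidden = dense_params(n_features, 512)   # Dense(512, activation='relu')
output = dense_params(512, 3)            # Dense(3, activation='softmax')
print(hidden, output, hidden + output)   # 12845568 1539 12847107
```

Almost all of the parameters sit in the first Dense layer, which is one reason a Dropout layer is useful with only a few hundred training images.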

Train the model

Training a network in Keras is as simple as calling the model.fit() function, as we have seen in our earlier tutorials.

# configure the model for training
# (recent Keras versions use learning_rate; the older lr argument is deprecated)
model.compile(optimizer=optimizers.RMSprop(learning_rate=2e-4),
              loss='categorical_crossentropy',
              metrics=['acc'])

# use the train and validation feature vectors
history = model.fit(train_features_vec,
                    train_labels,
                    epochs=20,
                    batch_size=batch_size,
                    validation_data=(validation_features_vec, validation_labels))
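The categorical_crossentropy loss passed to model.compile() penalizes the model according to how little probability it assigns to the correct class. The sketch below computes it for a single sample in plain Python; it is an illustration of the formula rather than Keras's implementation, and the small `eps` clamp is an assumption added to avoid taking the logarithm of zero.

```python
import math


def categorical_crossentropy(one_hot_label, predicted_probs, eps=1e-7):
    """Cross-entropy for one sample: -sum(y_i * log(p_i))."""
    return -sum(
        y * math.log(max(p, eps))
        for y, p in zip(one_hot_label, predicted_probs)
    )


label = [0, 1, 0]                  # one-hot label; index 1 is the true class
confident = [0.05, 0.90, 0.05]     # softmax output that favours the true class
unsure = [0.30, 0.40, 0.30]        # softmax output that is barely decided

print(round(categorical_crossentropy(label, confident), 3))  # 0.105
print(round(categorical_crossentropy(label, unsure), 3))     # 0.916
```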

Check Performance

We would like to visualize which images were wrongly classified.

# get the list of all validation file names
fnames = validation_generator.filenames

# get the list of the corresponding classes
ground_truth = validation_generator.classes

# get the dictionary of classes
label2index = validation_generator.class_indices

# obtain the list of classes
idx2label = list(label2index.keys())
print("The list of classes: ", idx2label)

Output:

The list of classes: ['pumpkin', 'tomato', 'watermelon']
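Building idx2label from the dictionary keys works when the keys happen to be stored in index order. A slightly more defensive variant sorts the labels by their index explicitly. The sketch below uses a hand-written dictionary whose values are assumed to match the output shown above; `index_to_label` is a hypothetical helper.

```python
def index_to_label(label2index):
    """Return a list where position i holds the label of class index i."""
    return [label for label, _ in sorted(label2index.items(), key=lambda kv: kv[1])]


# Assumed contents of class_indices, deliberately written out of order.
label2index = {"watermelon": 2, "pumpkin": 0, "tomato": 1}
print(index_to_label(label2index))  # ['pumpkin', 'tomato', 'watermelon']
```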

Let’s evaluate the number of incorrect predictions:

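# predicted class probabilities for each validation image
prob = model.predict(validation_features_vec)
# the predicted class is the one with the highest probability
# (model.predict_classes() was removed in recent TensorFlow releases)
predictions = np.argmax(prob, axis=1)
# indices of the misclassified validation images
errors = np.where(predictions != ground_truth)[0]
print("Number of errors = {}/{}".format(len(errors), nVal))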

In our run, 14 of the 150 validation images were classified incorrectly:

Number of errors = 14/150
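The error count alone does not say which classes get confused with each other. A confusion matrix gives that breakdown. The sketch below builds one in plain Python from toy labels that are made up for illustration and are not the results of this tutorial; `confusion_matrix` here is a hypothetical helper, not a library function.

```python
def confusion_matrix(ground_truth, predictions, n_classes):
    """matrix[t][p] counts samples of true class t that were predicted as class p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true_idx, pred_idx in zip(ground_truth, predictions):
        matrix[true_idx][pred_idx] += 1
    return matrix


# Toy labels for illustration only.
truth = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
for row in confusion_matrix(truth, preds, n_classes=3):
    print(row)
# [1, 1, 0]
# [0, 2, 0]
# [1, 0, 1]
```

With the variables from this post, the call would take ground_truth and predictions as arguments; the off-diagonal entries then show which pairs of classes account for the errors.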

Let us see which images were predicted wrongly:

import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing.image import load_img

for i in range(len(errors)):
    pred_class = np.argmax(prob[errors[i]])
    pred_label = idx2label[pred_class]

    print('Original label:{}, Prediction :{}, confidence : {:.3f}'.format(
        fnames[errors[i]].split('/')[0],
        pred_label,
        prob[errors[i]][pred_class]))

    # load and display the misclassified image
    original = load_img('{}/{}'.format(validation_dir, fnames[errors[i]]))
    plt.axis('off')
    plt.imshow(original)
    plt.show()
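The code above recovers the original label by splitting the file name on '/'. On Windows the relative paths may use backslashes instead, so a path-aware split is a little safer. The sketch below shows the idea with pathlib's pure path classes so that both cases can be demonstrated on any system; `label_from_relative_path` and its `windows` flag are hypothetical and exist only for this illustration. In real code, `pathlib.Path(fname).parts[0]` would pick the right flavour for the current system.

```python
from pathlib import PurePosixPath, PureWindowsPath


def label_from_relative_path(relative_path, windows=False):
    """Return the first path component, i.e. the class subfolder name."""
    path_cls = PureWindowsPath if windows else PurePosixPath
    return path_cls(relative_path).parts[0]


print(label_from_relative_path("tomato/img_001.jpg"))                 # tomato
print(label_from_relative_path("tomato\\img_001.jpg", windows=True))  # tomato
```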

For each misclassified image, the code above prints the original label, the predicted label, and the confidence of the prediction, and then displays the image.

We will try to improve on the limitations of transfer learning by using another approach called Fine-tuning in our next post.

