Introduction

My son is 8 years old and he has shown a lot of interest in cars, which is strange because I have zero interest in cars. But he is driving me crazy when we have a car ride: “dad is that an Peugeot?“, “dad, that is an Audi” and “that is a BMW, right?“, “That is another cool BMW, why don’t we have a BMW?“. He is pretty accurate, close to 100%! I was curious how accurate a very simple model could get. Just a re-use of a pre-trained image model approach on my laptop without any GPU’s.

Image Data

There is a nice python package google-images-download to help you download certain images.

from google_images_download import google_images_download response = google_images_download.googleimagesdownload() arguments = { "keywords" : "BMW,PEUGEOT" , "print_urls" : False, "suffix_keywords" : "car" , "output_directory" : "TMP" , "format" : "png" } response.download( arguments )

The above code will get you images of BMW’s and Peugeots, the problem though is that not all images are actually cars. You’ll see scooters, navigation systems and garages. Moreover, some downloaded files do not open at all.

So first, we can use a pre-trained resnet50 or vgg16 image classifier and run the downloaded files through this classifier and keep only the images that keras can open and were classified as car or wagon. Then the images are organized in the following folder structure

├── training │ ├── bmw (150 images) │ └── peugeot(150 images) └── validation ├── bmw (50 images) └── peugeot(50 images)

Predictive model

I am using the most simple approach, both in terms of modeling and computational effort. It is described in section 5.3 of this fantastic book “Deep Learning in R” by François Chollet and J. J. Allaire.

Take a pretrained network, say VGG16, remove the top so that you only have a convolutional base.

Now run your images trough this base so that each image is a tensor.

Treat these tensors as input for a complete separate neural network classifier. For example a simple one hidden fully connected layer with 256 neurons, shown in the code snippet below.

model <- keras_model_sequential() %>% layer_dense( units = 256, activation = "relu" , input_shape = 4 * 4 * 512 ) %>% layer_dropout(rate = 0.5) %>%

layer_dense(units = 1, activation = "sigmoid" )

The nice thing is that once you have put your images in the proper folder structure you can just ‘shamelessly’ copy/paste the code from the accompanying markdown of the book and start training a BMW-Peugeot model.

Conclusion

After 15 epochs or so the accuracy on the validation images flattens of to around 80% which is not super good and not even close to what my son can achieve 🙂 But it is not too bad either for just 30 minutes of work in R, mostly copy pasting code….. Cheers, Longhow.