Overview

To find out which celebrity is most similar to yourself, you have to perform the following steps. These steps will be explained in more detail throughout the article:

1. Load an image (a picture of yourself or a random image).
2. Extract the face from the image.
3. Preprocess the face image so that it can be recognized.
4. Feed the image to the deep neural network "ResNet-50".
5. Extract the results.

Face Recognition

Before we get into the coding, we need to have a high level introduction of some concepts.

Face recognition is the process of identifying people from images. Thanks to recent developments in the machine learning field, we are now able to let the model itself learn which features it should extract from the images.

Convolutional neural network

Deep convolutional neural networks currently dominate the image classification domain. The trend in recent years is to build deeper neural networks to solve ever more complex image classification tasks. Since the scope of this article is limited, we will just explain the intuition behind them.

The process goes as follows: we feed the network a face image, which is passed through multiple layers (the convolutional base). The first layers detect basic features: edges, corners, and so on. The middle layers detect parts of objects; in our case, they might learn to detect eyes, noses, and ears. The last layers learn to recognize full objects, in different shapes and positions. Based on all these detected features, the classifier then makes a prediction.

Process of a convolution neural network
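To make the idea of early layers acting as edge detectors concrete, here is a minimal NumPy sketch (separate from the article's pipeline) of a single convolution filter responding to a vertical edge:

```python
import numpy as np

# A tiny 4x4 "image": dark on the left, bright on the right
image = np.array([
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
], dtype=float)

# A 3x3 vertical-edge filter (a Sobel-like kernel)
kernel = np.array([
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
], dtype=float)

def convolve2d(img, k):
    """Valid 2D convolution (no padding, stride 1)."""
    kh, kw = k.shape
    out_h = img.shape[0] - kh + 1
    out_w = img.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * k)
    return out

response = convolve2d(image, kernel)
print(response)  # large positive values where the dark-to-bright edge sits
```

In a real network the kernel values are not hand-picked like this: they are learned from the data during training, which is exactly what lets the model decide for itself which features to extract.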

Resnet50

ResNet-50 is a deep convolutional neural network that achieved state-of-the-art results on standard face recognition datasets. It was trained on the VGGFace2 dataset, a large-scale face dataset containing over 3.31 million images of 9,131 subjects. The main difference with a standard convolutional neural network is that it uses residual learning: instead of learning the desired mapping directly, each block learns the residual, the difference between that mapping and its input. A more extensive explanation can be found here.
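The residual idea can be sketched in a few lines of NumPy. This is a conceptual illustration only; the real ResNet-50 block also contains convolutions and batch normalization:

```python
import numpy as np

def plain_block(x, transform):
    # A standard layer stack learns the full mapping H(x)
    return transform(x)

def residual_block(x, transform):
    # A residual block learns only F(x) = H(x) - x,
    # then adds the input back through a skip connection
    return transform(x) + x

x = np.array([1.0, 2.0, 3.0])

# If the ideal mapping is close to the identity, the residual
# branch only needs to learn a small correction:
small_correction = lambda v: 0.1 * v
print(residual_block(x, small_correction))  # [1.1 2.2 3.3]
```

Because the skip connection passes the input through unchanged, very deep networks built from such blocks remain trainable: a block that has nothing useful to add can simply learn a residual near zero.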

Let’s get started!

Step 0: Install & load the necessary packages

First, we need to install the necessary packages. To be certain of getting the same results as me, I suggest you run this code in the Google Colaboratory environment. A Colaboratory notebook executes the code on Google's cloud servers, meaning you can leverage the power of Google hardware regardless of the power of your machine. Best of all, it is completely free!

mtcnn package for detecting the faces from the images (source)

keras_vggface for the ResNet-50 CNN model (source)

tensorflow (source)

keras (source)

opencv (source)

!pip install mtcnn
!pip install keras_vggface
!pip install tensorflow
!pip install keras
!pip install opencv-python
!pip install Pillow

Note that the pip package names for OpenCV and PIL are opencv-python and Pillow.

Import the installed packages as follows:

import mtcnn
from mtcnn.mtcnn import MTCNN
from keras_vggface.vggface import VGGFace
from keras_vggface.utils import preprocess_input
from keras_vggface.utils import decode_predictions
import PIL
import os
from urllib import request
import numpy as np
import cv2
# Import this one if you are working in the google colab environment
from google.colab.patches import cv2_imshow

Step 1 : Image loading

For the sake of this exercise, we import a picture of Channing Tatum. This makes the example easily reproducible for everyone and also demonstrates the accuracy of the model. Replace the URL with an image link of yourself (hint: you could use the image URL of your LinkedIn profile).

# Give the image link
url = "https://upload.wikimedia.org/wikipedia/commons/thumb/8/8d/Channing_Tatum_by_Gage_Skidmore_3.jpg/330px-Channing_Tatum_by_Gage_Skidmore_3.jpg"
# Open the link and save the image to res
res = request.urlopen(url)
# Read the res object and convert it to an array
img = np.asarray(bytearray(res.read()), dtype='uint8')
# Decode the array into a color image
img = cv2.imdecode(img, cv2.IMREAD_COLOR)
# Show the image
cv2_imshow(img)

Desired output of step 1 (img source)

Step 2: Face detection

The image is now loaded. Since our model only needs the face, we have to extract it from the image. For this we use the mtcnn package. MTCNN achieves state-of-the-art results and is capable of detecting various facial features (eyes, mouth, …). More information about MTCNN can be found here.

# Initialize mtcnn detector
detector = MTCNN()

We then determine some face extraction parameters:

Target size: what size should the face image have? ResNet-50 requires images of size (224, 224).

Border_rel: This parameter determines how zoomed in we want the face image to be. We set this to zero for now.

# set face extraction parameters
target_size = (224, 224)  # output image size
border_rel = 0  # increase or decrease zoom on image

We call the detector to detect the face in the given image. We see that the face was detected with 99.8% probability, and the facial key points are assigned coordinates.

# detect faces in the image
detections = detector.detect_faces(img)
print(detections)

Output of MTCNN detection

By using the box variable, we can determine the coordinates of the face.

x1, y1, width, height = detections[0]['box']
dw = round(width * border_rel)
dh = round(height * border_rel)
x2, y2 = x1 + width + dw, y1 + height + dh
face = img[y1:y2, x1:x2]
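Note that the snippet above only adds the border to the right and bottom of the box. A variant (my own helper, not from the article) that expands the crop symmetrically and clips to the image bounds could look like this:

```python
import numpy as np

def crop_face(img, box, border_rel=0.0):
    """Crop a face box out of img, optionally expanding it.

    Unlike the article's snippet, this variant grows the box
    symmetrically on all sides and clips to the image bounds.
    """
    x1, y1, width, height = box
    dw = round(width * border_rel)
    dh = round(height * border_rel)
    x1, y1 = max(x1 - dw, 0), max(y1 - dh, 0)
    x2 = min(x1 + width + 2 * dw, img.shape[1])
    y2 = min(y1 + height + 2 * dh, img.shape[0])
    return img[y1:y2, x1:x2]

img = np.zeros((100, 100, 3), dtype=np.uint8)
face = crop_face(img, box=(30, 30, 20, 20), border_rel=0.25)
print(face.shape)  # (30, 30, 3): the 20x20 box grown by 5 px per side
```

With border_rel = 0 both versions return exactly the detected box, so the article's code is fine for this tutorial.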

After selecting the face in the image, we resize it to the format ResNet-50 requires: (224, 224).

# resize pixels to the model size
face = PIL.Image.fromarray(face)
face = face.resize((224, 224))
face = np.asarray(face)
# show face
cv2_imshow(face)

The extracted face of Channing Tatum (img source)

Step 3: Preprocessing

The ResNet-50 model expects a batch of images as input, while we currently have a single image. To fix this mismatch, we add an extra dimension so we have a 1 x 224 x 224 x 3 shape instead of a 224 x 224 x 3 shape.

# convert to float32
face_pp = face.astype('float32')
# add a batch dimension: (1, 224, 224, 3)
face_pp = np.expand_dims(face_pp, axis=0)

Machine learning models need to be fed consistent data. To make sure all our images are consistent, we apply a normalization function. Luckily, we can use the preprocess_input function from Keras to normalize our face images. This function normalizes pixel values in the 0–255 range to make them suitable for deep learning. Don't forget to set the version parameter to 2 to get the preprocessing specific to ResNet-50.

face_pp = preprocess_input(face_pp, version = 2)
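Conceptually, the version-2 preprocessing centers each color channel by subtracting a per-channel mean computed on the VGGFace2 training set. A minimal NumPy sketch of that idea, with illustrative mean values (the exact constants live inside keras_vggface.utils):

```python
import numpy as np

# Illustrative per-channel means; the real constants are defined
# inside keras_vggface.utils for the VGGFace2 training set.
CHANNEL_MEANS = np.array([91.5, 103.9, 131.1])

def center_channels(batch, means=CHANNEL_MEANS):
    """Subtract a per-channel mean from a batch of images."""
    return batch.astype('float32') - means

batch = np.full((1, 224, 224, 3), 128, dtype='uint8')
centered = center_channels(batch)
print(centered.shape)     # (1, 224, 224, 3)
print(centered[0, 0, 0])  # each channel shifted toward zero
```

Centering the inputs this way matches what the network saw during training, which is why using the library's own preprocess_input (rather than an ad-hoc normalization) matters for accuracy.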

Step 4: Predict

All the preparation is now done. Great job!

Here, we initialize our ResNet-50 model by calling the VGGFace constructor. We double-check what inputs the model expects and what outputs it will give.

# Create the resnet50 Model
model = VGGFace(model='resnet50')
# Check what the required input & output of the model is
print('Inputs: {input}'.format(input=model.inputs))
print('Output: {output}'.format(output=model.outputs))

Input & Output for the ResNet-50 model

Next, we let our model predict the preprocessed face.

# predict the face with the input
prediction = model.predict(face_pp)

Step 5: Extract results

The face has now been predicted. We just need to decode the predictions and print them out in a readable manner.

# convert predictions into names & probabilities
results = decode_predictions(prediction)
# Display results
cv2_imshow(img)
for result in results[0]:
    print('%s: %.3f%%' % (result[0], result[1] * 100))

Predictions results for Channing Tatum (img source)

And there you go!

You just performed a prediction with the help of a convolutional neural network. If you ran this with the Channing Tatum image, don't forget to try it out with an image of yourself to see which celebrity you resemble most.

Next steps

As a next step, we could apply transfer learning: re-using a model pre-trained on one problem and applying it to a different problem. We could take this state-of-the-art ResNet-50 model and re-train the classification layer on a different set of people.
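A minimal sketch of that idea, using tf.keras.applications.ResNet50 as a stand-in for the VGGFace model (keras_vggface exposes the same pattern via VGGFace(model='resnet50', include_top=False)); the class count and frozen-base setup here are illustrative assumptions:

```python
import tensorflow as tf

NUM_PEOPLE = 10  # size of your own set of people (assumption)

base = tf.keras.applications.ResNet50(
    include_top=False,     # drop the original classification head
    weights=None,          # use 'imagenet' (or VGGFace weights) in practice
    input_shape=(224, 224, 3),
)
base.trainable = False     # freeze the pre-trained convolutional base

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_PEOPLE, activation='softmax'),  # new head
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
print(model.output_shape)  # (None, 10)
```

Because only the new head is trainable, this setup can learn to recognize a small set of new people from relatively few images, while re-using all the face features the base network already learned.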

Sources

ResNet50 paper

Convolutional neural network in keras

Github

The full code can be found on my GitHub. I would suggest using Google Colab when executing the code.