The original paper implemented a number of different models, including classic ML models with handcrafted features and three deep learning models: AlexNet, ResNet18, and ResNeXt50.

I want to keep my work as simple as possible (I don’t want to implement and train a whole ResNet from scratch), so I’ll fine-tune an existing model that will do the job. Keras has a module called applications, which is a collection of different pre-trained models. One of them is ResNet50. Unfortunately, there’s no ResNet18 or ResNeXt50 in keras.applications, so I won’t be able to reproduce exactly the same work, but I should be close enough with ResNet50.

from keras.applications import ResNet50

ResNet is a deep convolutional network developed by Microsoft that won the 2015 ImageNet competition, an image classification task.

When we instantiate the ResNet50 model in Keras, we create a model with the ResNet50 architecture and also download weights that were trained on the ImageNet dataset.

The authors of the paper didn’t mention how exactly they trained the models, so I’ll try to do my best.

I want to remove the last layer (the “softmax” layer) and add a Dense layer with no activation function to perform regression.

resnet = ResNet50(include_top=False, pooling='avg')

model = Sequential()

model.add(resnet)

model.add(Dense(1))

model.layers[0].trainable = False

print(model.summary())

# Output:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
resnet50 (Model)             (None, 2048)              23587712
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 2049
=================================================================
Total params: 23,589,761

Trainable params: 2,049

Non-trainable params: 23,587,712

You can see that I made the first layer (the resnet model) non-trainable, so I have only 2,049 trainable params instead of 23,589,761.

My plan is to train the final Dense layer, and then, train the whole network with a smaller learning rate.

model.compile(loss='mean_squared_error', optimizer=Adam())

model.fit(x=train_X, y=train_Y, batch_size=32, epochs=30)

After that, I set the first layer back to trainable, recompile, and fit the model for another 30 epochs.
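Putting both phases together, the whole schedule can be sketched like this. The phase-2 learning rate (1e-5) is my guess, since the post doesn’t give one, and the weights=None plus tiny random stand-in arrays are only there to keep the sketch self-contained; the real run uses the ImageNet weights and the photo arrays.

```python
# Sketch of the two-phase fine-tuning schedule described above.
# Assumptions: phase-2 learning rate 1e-5; weights=None and random
# stand-in data just to keep the sketch self-contained and quick.
import numpy as np
from keras.applications import ResNet50
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

train_X = np.random.rand(2, 350, 350, 3).astype('float32')  # stand-in photos
train_Y = np.random.uniform(1, 5, size=2)                   # stand-in scores

resnet = ResNet50(include_top=False, pooling='avg', weights=None)
model = Sequential([resnet, Dense(1)])

# Phase 1: freeze the backbone and train only the regression head.
model.layers[0].trainable = False
model.compile(loss='mean_squared_error', optimizer=Adam())
model.fit(x=train_X, y=train_Y, batch_size=32, epochs=1)  # 30 epochs in the post

# Phase 2: unfreeze, recompile (needed for the flag to take effect),
# and continue with a much smaller learning rate.
model.layers[0].trainable = True
model.compile(loss='mean_squared_error', optimizer=Adam(1e-5))
model.fit(x=train_X, y=train_Y, batch_size=32, epochs=1)  # 30 epochs in the post
```

The recompile between the two phases matters: in Keras, changing a layer’s trainable flag only takes effect the next time the model is compiled.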

Here train_X holds the photos, i.e., numpy arrays of shape (350, 350, 3) each, and train_Y holds the scores the images were tagged with.
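The post doesn’t show how the photos were read into those arrays; a minimal helper, assuming the images are read with Pillow and resized to 350x350, could look like this:

```python
# Hypothetical loader for the photo arrays; the post doesn't show its
# loading code, so reading with Pillow and resizing are assumptions.
import numpy as np
from PIL import Image

def load_image(path, size=(350, 350)):
    """Read one photo as a float32 array of shape (350, 350, 3)."""
    img = Image.open(path).convert('RGB').resize(size)
    return np.asarray(img, dtype=np.float32)
```

Stacking the per-image arrays with np.stack would then give train_X the shape (N, 350, 350, 3) that model.fit expects.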

Results

The paper trained the models using 2 techniques: 5-fold cross validation, and a 60%-40% train test split. They measured their results using Pearson Correlation (PC), Mean Absolute Error (MAE), and Root Mean Square Error (RMSE). These are the results they got using the 5-fold cross validation:

And these are the results they got using the 60%-40% train-test split:

I’ll do an 80%-20% train-test split, which is similar to performing one fold of their cross-validation.
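The post doesn’t say how the split was made; one reproducible way is scikit-learn’s train_test_split (my choice of tool, and the random_state value is arbitrary):

```python
# An 80%-20% split with scikit-learn; the stand-in arrays just make the
# sketch runnable, and random_state=42 is an arbitrary choice.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(10, 350, 350, 3).astype('float32')  # stand-in photo arrays
Y = np.random.uniform(1, 5, size=10)                   # stand-in scores

train_X, test_X, train_Y, test_Y = train_test_split(X, Y, test_size=0.2, random_state=42)
```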

I got the following results:

RMSE: 0.301799791952313

MAE: 0.2333630505619627

PC: 0.9012570266136678
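For reference, all three metrics are easy to compute with numpy and scipy (this is my own implementation, not code from the post):

```python
# RMSE, MAE, and Pearson Correlation for a set of predictions.
import numpy as np
from scipy.stats import pearsonr

def regression_metrics(y_true, y_pred):
    """Return (rmse, mae, pc) for true scores vs. predicted scores."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    pc = float(pearsonr(y_true, y_pred)[0])
    return rmse, mae, pc
```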

Pretty good. Also, it is always nice to look at the scatter plot and the histograms of the scores:

Original scores distribution (normalized):

Predicted scores distribution (normalized):

The results look pretty good. Now let’s see what this Deep Neural Network says about me. I used this photo at first:

I got 2.85, which means that I’m more attractive than 52% of the people in this dataset. I have to say that I’m a bit disappointed; I hoped I’d do better than that. So I tried to improve my situation.

I took a lot of photos and eventually with this one I got a score of 3.15, which means that I’m more attractive than 64% of the people in the dataset.
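The “more attractive than X% of people” figures are just the percentile of a score within the dataset’s labels; assuming that’s how they were computed, a minimal version looks like this:

```python
# Percentile of a score among all dataset scores. The strictly-below
# rule is an assumption; the post doesn't say exactly how it computed
# the percentages.
import numpy as np

def percentile_of(score, all_scores):
    """Percentage of dataset scores strictly below the given score."""
    return 100.0 * float(np.mean(np.asarray(all_scores) < score))
```

With the dataset’s labels loaded into all_scores, percentile_of(3.15, all_scores) should land near the 64% reported above.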

This is much better, though I have to be honest and admit I was still hoping for more :)

One final note: I built and fine-tuned this model using Google Colaboratory, which, in short, gives you a Python notebook that runs on a GPU for free!

Hope you enjoyed the post.