Image Courtesy of Michael Vadon via Flickr and Dreamstime

After completing Lesson One of the Fast.ai online Deep Learning course, I decided to swap out the original “dogs vs. cats” dataset for one containing pictures of two sisters, in this case, Tiffany and Ivanka Trump. My aim was to use the lesson’s sample code, which utilized a Convolutional Neural Network (CNN), to create a binary image classifier.

To me, this project exemplifies one of the beauties of the Fast.ai approach to teaching Deep Learning, which is that most of the lessons feature sample code with “plug-and-play” functionality. You could easily use Lesson One to build a classifier that differentiates between any two classes of images: for example, a classifier that learns to tell you apart from all other redheads, like the one Scott Vogel outlined building in this post.

The project is also indicative of the “Whole Game Approach” teaching methodology, popularized by Harvard Graduate School of Education professor David Perkins, that Fast.ai founders Jeremy Howard and Rachel Thomas have found inspiration in. At its core, the method aims to teach people how each lesson solves a real-world problem, as opposed to risking getting students lost in rabbit holes of theory and minutiae. In line with this approach, part of the teaching methodology of Fast.ai is to take each lesson’s sample code and “make it your own” by altering it and turning it into a project of your own.

I tried to follow a couple of guidelines when downloading the data from Google. First, I aimed to obtain images in which the subject did not face sideways (no profile images). Second, I tried to favor images that had fewer subjects in the frame. So, for example, I gave preference to images with just Ivanka in the frame, as opposed to shots containing both Ivanka and Jared Kushner. That said, I did not wish to spend a lot of time curating images, so I did snag several images with multiple subjects.

Since I was not using facial recognition, I was curious to know what features the Neural Network would home in on for each subject. For Tiffany, this was not obvious during the Exploratory Data Analysis portion of the process; however, I did notice the following with regard to the images of Ivanka.

The Most Correctly Labeled Images of Ivanka

One of Ivanka’s striking features is her relatively long neck. The CNN appears to have homed in on this, as three of the four most correctly labeled images of Ivanka display this feature.

The numbers above each of the four headshots of Ivanka in the image above let us see our binary classifier in action. The activation function used in the last layer of the CNN “smushes” the output for each image onto a scale from zero to one. In our example, correctly labeled Ivankas receive an output between zero and 0.5, while correctly labeled Tiffanys receive an output greater than 0.5 but less than one.
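To make the “smushing” concrete, here is a minimal sketch of how a raw network output could be mapped to a label. The post does not name the activation function, so the sigmoid used here is an assumption, as are the helper names `sigmoid` and `label_from_score` and the choice to send a score of exactly 0.5 to the Tiffany class.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any real number into the open interval (0, 1).

    Assumption: the post only says the final activation maps outputs
    onto a zero-to-one scale; sigmoid is one common choice for that.
    """
    return 1.0 / (1.0 + math.exp(-x))


def label_from_score(score: float, threshold: float = 0.5) -> str:
    """Map a score in (0, 1) to a class name (hypothetical helper).

    Scores below the threshold count as Ivanka, the rest as Tiffany,
    matching the convention described in the text.
    """
    return "ivanka" if score < threshold else "tiffany"


# Illustrative raw outputs, not values taken from the actual model.
for raw_output in (-4.0, -0.5, 0.5, 4.0):
    score = sigmoid(raw_output)
    print(f"raw={raw_output:+.1f}  score={score:.3f}  label={label_from_score(score)}")
```

Under this convention, a score close to zero marks a confident Ivanka, a score close to one marks a confident Tiffany, and scores near 0.5 are the ones the classifier is least sure about.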

Confusion Matrix Representing an Accuracy of 95.7%

Let’s look at the most incorrect Ivankas:

We’ll notice that the image on the right contains Jared Kushner, so the CNN has to deal with the extra noise of having another face in the frame to consider. Why the image on the left was incorrectly classified is a bit more of a mystery.

And now for the Tiffanys:

The image on the left is a bit low resolution and fairly pixelated. The image on the right has both subjects in the same frame, with Ivanka in front of her sister, so it is no surprise that the CNN got confused.

So in our test set, only 4 out of 94 images were incorrectly classified. That is pretty impressive on a number of levels. First off, consider the relatively small dataset we used. The training set contained roughly 190 images of each subject, while our test set contained 47 images of each sister. By comparison, the original Cats vs. Dogs example contained over 25,000 total images!
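The 95.7% figure follows directly from the counts above. As a quick check, here is a small sketch of the arithmetic; the function name `accuracy` is just an illustrative helper, not part of the lesson’s code.

```python
def accuracy(total: int, misclassified: int) -> float:
    """Fraction of images classified correctly."""
    return (total - misclassified) / total


# Counts taken from the text: 47 test images per sister, 4 errors overall.
test_images = 47 + 47
errors = 4

acc = accuracy(test_images, errors)
print(f"{test_images - errors} of {test_images} correct: {acc:.1%}")
```

Since 90 of the 94 test images were classified correctly, the accuracy works out to about 95.7%.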

Secondly, many of the images contained multiple subjects in a single frame. We might improve this project down the line by incorporating Multi-Object Classification, which would allow an image to be identified as containing both a Tiffany and an Ivanka. Considering the fact that we did not employ either Facial Recognition or Multi-Object Classification, the incredibly high level of accuracy achieved (95.7%) is quite impressive.

And all this was accomplished just by completing Fast.ai Lesson One!