Cool story, let’s finally see it in action!

Okay, as I said, I initially failed to solve this task with OpenCV. Now I have a bunch of 150x150 faces of Sheldon, Raj, Lennard, Howard and Stuart sitting here. I will show you how simple it is to use this data to train a face recognizer and to recognize new faces. The code for this example can be found in the repo.

Preparing the data

I have collected roughly 20 faces per character in different poses:

We will use 10 faces each to train the recognizer and the rest to evaluate the accuracy of our recognizer:

The file name of each face image contains the person’s name, so we can simply map our class names:

['sheldon', 'lennard', 'raj', 'howard', 'stuart']

to an array of images per class. You can read an image given the file path with fr.loadImage(fp).
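Here is a minimal sketch of that mapping and of the train/test split. The file names (sheldon_1.png and so on) and the helpers groupByClass and splitTrainTest are made up for illustration; they are not part of the package.

```javascript
const classNames = ['sheldon', 'lennard', 'raj', 'howard', 'stuart']

// Group file names by the class name they contain.
function groupByClass(fileNames, classNames) {
  return classNames.map(name => fileNames.filter(f => f.includes(name)))
}

// Use the first numTrain entries of each class for training, the rest for testing.
function splitTrainTest(itemsByClass, numTrain) {
  return {
    trainFiles: itemsByClass.map(items => items.slice(0, numTrain)),
    testFiles: itemsByClass.map(items => items.slice(numTrain))
  }
}

// Hypothetical listing, e.g. what fs.readdirSync on the data folder might return.
const fileNames = ['sheldon_1.png', 'raj_1.png', 'sheldon_2.png', 'raj_2.png', 'sheldon_3.png']
const filesByClass = groupByClass(fileNames, classNames)
const { trainFiles, testFiles } = splitTrainTest(filesByClass, 2)
console.log(trainFiles[0], testFiles[0])
```

In the real script you would map every file path through fr.loadImage(fp) to end up with arrays of images instead of arrays of file names.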

Detecting the faces

As I said, the faces are already extracted with a size of 150x150 each, which I did with opencv4nodejs beforehand. But you can also detect and extract the faces, save and label them as follows:
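A rough sketch of the detect, extract and save step could look like this. It assumes the package's FaceDetector, detectFaces and saveImage calls work as described in its README. The helper extractAndSaveFaces and the label_index.png naming scheme are hypothetical. The fr module is passed in as a parameter, so the snippet does not depend on how you import it.

```javascript
// `fr` is the face-recognition module, passed in by the caller.
// Detects all faces in the image at imagePath, saves each crop
// (targetSize x targetSize pixels) under a name derived from the label,
// and returns the written file names.
function extractAndSaveFaces(fr, imagePath, label, targetSize = 150) {
  const detector = fr.FaceDetector()
  const image = fr.loadImage(imagePath)
  const faceImages = detector.detectFaces(image, targetSize)
  return faceImages.map((faceImage, i) => {
    const outPath = `${label}_${i}.png` // hypothetical naming scheme
    fr.saveImage(outPath, faceImage)
    return outPath
  })
}
```

Keep in mind that an image showing several people yields several crops, so the label passed in here only makes sense for images containing a single character.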

Training the Recognizer

Now that we have our data in place we can train the recognizer:

Basically, this feeds each face image into the neural net, which outputs a descriptor for the face, and stores all the descriptors for the given class. You can also jitter the training data by specifying numJitters as a third argument, which will apply transformations such as rotation, scaling and mirroring to create different versions of each input face. Increasing the number of jittered versions may increase prediction accuracy but also increases training time.
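A minimal sketch of the training step, assuming the recognizer is created with fr.FaceRecognizer() and fed via addFaces(faces, className, numJitters) as in the package README. The wrapper trainRecognizer is a hypothetical helper, and fr is again passed in as a parameter.

```javascript
// Adds the training faces of every class to a fresh recognizer.
// facesByClass[i] holds the images (e.g. from fr.loadImage) for classNames[i].
// numJitters is optional; when omitted, addFaces is called without it.
function trainRecognizer(fr, classNames, facesByClass, numJitters) {
  const recognizer = fr.FaceRecognizer()
  classNames.forEach((name, i) => {
    if (numJitters === undefined) {
      recognizer.addFaces(facesByClass[i], name)
    } else {
      recognizer.addFaces(facesByClass[i], name, numJitters)
    }
  })
  return recognizer
}
```

Calling trainRecognizer(fr, classNames, trainFaces, 15) would create 15 jittered versions of each face, at the cost of a noticeably longer training run.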

Furthermore, we can store the recognizer’s state, so that we do not have to train it again next time and can simply load it from a file:

Save:

Load:
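Both steps can be sketched in a few lines, assuming the recognizer exposes serialize() and load(modelState) as in the package README and that the state survives a JSON round trip. The helpers saveRecognizer and loadRecognizer and the file path are hypothetical.

```javascript
const fs = require('fs')

// Writes the recognizer's state (the stored descriptors per class) to a JSON file.
function saveRecognizer(recognizer, filePath) {
  fs.writeFileSync(filePath, JSON.stringify(recognizer.serialize()))
}

// Creates a fresh recognizer and restores a previously saved state into it.
function loadRecognizer(fr, filePath) {
  const recognizer = fr.FaceRecognizer()
  recognizer.load(JSON.parse(fs.readFileSync(filePath, 'utf8')))
  return recognizer
}
```

For example, saveRecognizer(recognizer, 'model.json') after training, and loadRecognizer(fr, 'model.json') on the next run.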

Recognizing new faces

Now we can check the prediction accuracy for our remaining data and log the results:
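One way to sketch the evaluation: count, per class, how many test faces predictBest labels correctly. The helpers evaluate and formatResult are hypothetical. The percentage is truncated to two decimals, which is my guess at a formatting that matches the log lines shown under Results.

```javascript
// Counts how many test faces per class get the right label.
// `recognizer` only needs a predictBest(image) method returning { className, distance }.
function evaluate(recognizer, classNames, testFacesByClass) {
  return classNames.map((name, i) => {
    const faces = testFacesByClass[i]
    const numCorrect = faces.filter(face => recognizer.predictBest(face).className === name).length
    const accuracy = faces.length ? (numCorrect / faces.length) * 100 : 0
    return { name, numCorrect, total: faces.length, accuracy }
  })
}

// Formats one result line, truncating the percentage to two decimals.
function formatResult({ name, numCorrect, total, accuracy }) {
  const truncated = Math.floor(accuracy * 100) / 100
  return `${name} ( ${truncated}% ) : ${numCorrect} of ${total} faces have been recognized correctly`
}
```

Logging is then a one-liner: evaluate(recognizer, classNames, testFaces).forEach(r => console.log(formatResult(r))).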

Currently, prediction is done by computing the Euclidean distance from the input face’s descriptor vector to each descriptor of a class and then taking the mean of those distances. One could probably argue that k-means clustering or an SVM classifier would be better suited for this task, and I might implement these in the future as well. But for now, using the Euclidean distance seems to be fast and efficient enough.
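To make the idea concrete, here is a toy version of the mean distance computation in plain JavaScript. It is not the package's actual implementation. The 2-d descriptors are made up, and real descriptors are much longer vectors.

```javascript
// Euclidean distance between two descriptor vectors of equal length.
function euclideanDistance(a, b) {
  return Math.sqrt(a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0))
}

// Mean distance of the input descriptor to all stored descriptors of one class.
function meanDistance(inputDescriptor, classDescriptors) {
  const total = classDescriptors.reduce(
    (sum, d) => sum + euclideanDistance(inputDescriptor, d), 0
  )
  return total / classDescriptors.length
}

// Made-up toy descriptors per class.
const descriptorsByClass = {
  sheldon: [[0, 0], [0, 2]],
  raj: [[3, 4], [6, 8]]
}
const input = [0, 0]
const distances = Object.keys(descriptorsByClass).map(className => ({
  className,
  distance: meanDistance(input, descriptorsByClass[className])
}))
console.log(distances)
```

With these toy values the input is closest to the sheldon class (mean distance 1 versus 7.5 for raj).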

Calling predictBest will output the result with the lowest distance, i.e. the highest similarity. The output will look something like this:

{ className: 'sheldon', distance: 0.5 }

In case you want to obtain the distances of the face descriptors of all classes to an input face you can simply use recognizer.predict(image), which will output an array with the distance for each class:

[
  { className: 'sheldon', distance: 0.5 },
  { className: 'raj', distance: 0.8 },
  { className: 'howard', distance: 0.7 },
  { className: 'lennard', distance: 0.69 },
  { className: 'stuart', distance: 0.75 }
]
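If you work with the full array, picking the best match yourself is a simple reduction over the distances. The helper pickBest below is hypothetical and only mirrors what predictBest is described to return.

```javascript
const predictions = [
  { className: 'sheldon', distance: 0.5 },
  { className: 'raj', distance: 0.8 },
  { className: 'howard', distance: 0.7 },
  { className: 'lennard', distance: 0.69 },
  { className: 'stuart', distance: 0.75 }
]

// Returns the entry with the lowest distance, i.e. the most similar class.
function pickBest(predictions) {
  return predictions.reduce((best, p) => (p.distance < best.distance ? p : best))
}

console.log(pickBest(predictions))
```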

Results

Running the above example will give the following results.

Using 10 faces each for training:

sheldon ( 90.9% ) : 10 of 11 faces have been recognized correctly

lennard ( 100% ) : 12 of 12 faces have been recognized correctly

raj ( 100% ) : 12 of 12 faces have been recognized correctly

howard ( 100% ) : 12 of 12 faces have been recognized correctly

stuart ( 100% ) : 3 of 3 faces have been recognized correctly

Using only 5 faces each for training:

sheldon ( 100% ) : 16 of 16 faces have been recognized correctly

lennard ( 88.23% ) : 15 of 17 faces have been recognized correctly

raj ( 100% ) : 17 of 17 faces have been recognized correctly

howard ( 100% ) : 17 of 17 faces have been recognized correctly

stuart ( 87.5% ) : 7 of 8 faces have been recognized correctly

And here is what happens when we run face recognition on a video stream:

Conclusion

Looking at the results, we can see that even with a small set of training data we can already obtain pretty accurate results, even though some of the extracted faces are very blurry because of the small size of the images I scraped from the web.

If you liked this article, you are invited to give this npm package a try. I would also highly appreciate a star on my GitHub repository, as well as any kind of feedback. :)