Development projects that involve machine learning are no different. Our data scientists must be discerning when selecting tools and algorithms, and their choices should always give the client the best trade-off between time spent and algorithm performance.

In the case of Hop, a current client, we were tasked with developing a facial recognition engine to verify customer identities for automated draft beer distribution kiosks. The client had a strong preference for AWS Rekognition based on research he had done prior to our engagement.

Rekognition is an extremely powerful platform, promising scalability and functionality that could save countless data science and software engineering hours. It seemed like a good fit, so we built an experiment in our favorite experimentation environment, Jupyter Notebooks.

First Impressions

The purpose of this system is identity verification: when the system is queried, we are essentially asking, “Does this face belong to the person whose identification credentials have been presented?” For this task, we need Rekognition’s Compare Faces feature.
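For context, here is a minimal sketch of what such a query looks like against the Compare Faces API via boto3; the image file names are placeholders standing in for the photo on file and the photo captured at the kiosk:

```python
import boto3

rekognition = boto3.client("rekognition", region_name="us-east-1")

# Placeholder file names: the photo on file and the photo captured at the kiosk.
with open("id_photo.jpg", "rb") as f:
    source_bytes = f.read()
with open("kiosk_capture.jpg", "rb") as f:
    target_bytes = f.read()

response = rekognition.compare_faces(
    SourceImage={"Bytes": source_bytes},
    TargetImage={"Bytes": target_bytes},
)

# Each entry in FaceMatches carries a Similarity score, expressed as a percentage.
for match in response["FaceMatches"]:
    print(f"Similarity: {match['Similarity']:.1f}%")
```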

To get an initial feeling for the performance of the Compare Faces feature, we did a simple experiment with 105 pairs of photos of an extraordinarily un-photogenic person: me. The results are illustrated in the histogram below.

Underwhelmed: Percent match on image pairs of the same person
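For the curious, a rough sketch of how an experiment like this might be scripted follows; the photo file names and pair list are hypothetical stand-ins for the actual set of 105 pairs, and matplotlib draws the histogram:

```python
import boto3
import matplotlib.pyplot as plt

rekognition = boto3.client("rekognition", region_name="us-east-1")

def similarity(source_path, target_path):
    """Return Rekognition's similarity score (percent) for one pair of photos."""
    with open(source_path, "rb") as src, open(target_path, "rb") as tgt:
        response = rekognition.compare_faces(
            SourceImage={"Bytes": src.read()},
            TargetImage={"Bytes": tgt.read()},
            SimilarityThreshold=0,  # return every comparison, not just 80%+ matches
        )
    matches = response["FaceMatches"]
    return matches[0]["Similarity"] if matches else 0.0

# Hypothetical pair list for illustration; in practice this held all 105 pairs.
photo_pairs = [
    ("me_001.jpg", "me_002.jpg"),
    ("me_001.jpg", "me_003.jpg"),
    ("me_002.jpg", "me_003.jpg"),
]

scores = [similarity(src, tgt) for src, tgt in photo_pairs]

plt.hist(scores, bins=20)
plt.xlabel("Percent match")
plt.ylabel("Number of image pairs")
plt.show()
```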

Though this was only an initial exploratory analysis, we can see that the variance is high and that a large portion of the distribution lies below an 80% match. This is especially troubling because, by default, AWS only returns matches of 80% similarity or greater, which would lead us to believe that 80% is the minimum threshold for a “match.”
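Fortunately, the default can be overridden: passing SimilarityThreshold=0 surfaces every comparison, which is how the sub-80% scores above were captured in the first place. A quick illustration, reusing the client and image bytes from the earlier sketch:

```python
# At the default threshold (80), a genuine pair that scores below 80% comes
# back with an empty FaceMatches list; the face appears in UnmatchedFaces
# instead, with no similarity score attached.
response = rekognition.compare_faces(
    SourceImage={"Bytes": source_bytes},
    TargetImage={"Bytes": target_bytes},
)
if not response["FaceMatches"]:
    print(f"No match at default threshold; unmatched faces: {len(response['UnmatchedFaces'])}")

# With SimilarityThreshold=0, the same pair's true score becomes visible.
response = rekognition.compare_faces(
    SourceImage={"Bytes": source_bytes},
    TargetImage={"Bytes": target_bytes},
    SimilarityThreshold=0,
)
for match in response["FaceMatches"]:
    print(f"Similarity: {match['Similarity']:.1f}%")
```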

Next, we performed the inverse test: 100 pairs of photos, each pairing me with someone who is not me.

High Precision: Percent match on image pairs of different people

Final Observations