When I first saw Instagram's and Snapchat's filters, I thought they were magic.

Later I came to learn that they are powered by AI and 3D CGI. But that still doesn't explain much, right?

In order to build a filter you need to do 3 things:

○ Find the face
○ Put stuff on the face
○ Add color to the effect

So let's dig into it!

Find the face

What I mean by find the face: locate its position and rotation in three dimensions. If you look around, you will probably see this referred to as estimating the head pose with six degrees of freedom.

The approach I used is the one described in this blog post, and it goes like this:

○ Locate certain keypoints (nose tip position, left eye position, etc.) in the image.
○ Given an approximated 3D representation of the face, solve the Perspective-n-Point problem to get the face's rotation and translation in 3D.

Locate keypoints

For this task I'm using an AWESOME library called face-api.js. You give it an image or a video and it will return the locations of 68 keypoints on a human face.

Face-api.js works like a charm!

The way it works is best explained at the project’s page but in short:

○ Find where the face is in the image (the blue square on the right side of the gif). This is done by using TensorFlow to run the image through a neural network.
○ Now that you have just the cropped face, feed it to another neural network, which will output the positions of the keypoints.
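Since the 68 landmarks follow the standard dlib ordering, the handful of points usually fed to a Perspective-n-Point solver can be picked out by index. Here's a sketch of such a helper; the indices below are the usual convention for the 68-point model, an assumption rather than something taken from face-api.js's docs:

```typescript
interface Point2D { x: number; y: number; }

// Indices of the six points commonly used for head pose estimation,
// assuming the standard dlib 68-landmark ordering.
const PNP_INDICES = {
  noseTip: 30,
  chin: 8,
  leftEyeLeftCorner: 36,
  rightEyeRightCorner: 45,
  leftMouthCorner: 48,
  rightMouthCorner: 54,
};

// landmarks: the 68 positions face-api.js returns
// (e.g. result.landmarks.positions)
export function pickPnpPoints(landmarks: Point2D[]): Point2D[] {
  if (landmarks.length !== 68) throw new Error("expected 68 landmarks");
  return Object.values(PNP_INDICES).map((i) => landmarks[i]);
}
```

The six 2D points this returns are one half of the input the PnP step needs; the matching 3D points come next.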


Solve Perspective-n-Point

Given where the keypoints are, we can take an estimated 3D model of the human face and try to rotate and move it around so that its projection matches the one observed.

Strictly speaking, we only need a list of the 3D points that correspond to the 2D ones observed in the image; we don't actually need a 3D model at all.

But, of course, having a 3D model makes our life easier, because getting these 3D points is then just a matter of measuring it.

I moved a cube to each desired point and then copied the location that Blender (or any other 3D modelling software) reported for the object.
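If you don't want to measure a model yourself, head-pose tutorials often use a generic set of six 3D points for the same facial features. The values below are those commonly used approximate coordinates, not the numbers the author measured in Blender:

```typescript
type Point3D = [number, number, number];

// Approximate 3D positions (arbitrary units) of six facial points on a
// generic head model, with the nose tip as the origin. These are widely
// used reference values, not measurements from the author's model.
export const MODEL_POINTS: Point3D[] = [
  [0.0, 0.0, 0.0],          // nose tip (model origin)
  [0.0, -330.0, -65.0],     // chin
  [-225.0, 170.0, -135.0],  // left eye, left corner
  [225.0, 170.0, -135.0],   // right eye, right corner
  [-150.0, -150.0, -125.0], // left mouth corner
  [150.0, -150.0, -125.0],  // right mouth corner
];
```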

Getting the points with Blender

We also need to know some parameters of the camera (focal length, center of projection, etc.), but we can just approximate them and it works great.

Now feed your 3D points and 2D points to something like OpenCV's solvePnP and you're done. It will give you a rotation and a translation that, when applied to the object in 3D, produce the same projection.
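To see what solvePnP is actually solving, it helps to look at the forward model it inverts: a pinhole camera projecting 3D points to 2D. The sketch below uses the kind of approximated intrinsics the text mentions (focal length taken as the image width, center of projection at the image center); solvePnP searches for the rotation and translation that make these projections line up with the observed keypoints:

```typescript
type Vec3 = [number, number, number];

// Approximated camera intrinsics: focal length ~ image width,
// center of projection = image center. An assumption, but good enough.
export function makeIntrinsics(width: number, height: number) {
  return { fx: width, fy: width, cx: width / 2, cy: height / 2 };
}

// Pinhole projection of a 3D point already expressed in camera
// coordinates (i.e. after applying solvePnP's rotation and translation).
export function project(
  p: Vec3,
  k: { fx: number; fy: number; cx: number; cy: number },
): { x: number; y: number } {
  const [X, Y, Z] = p;
  return { x: k.fx * (X / Z) + k.cx, y: k.fy * (Y / Z) + k.cy };
}
```

A point straight ahead of the camera, e.g. `[0, 0, 100]`, lands exactly on the center of projection.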

The only problem I had with this approach: compiling OpenCV to WASM currently produces a binary blob of ~1 MB plus 300 KB of JS, and that's after spending a whole day trying to reduce the size (it started at around 4 MB).

I didn’t want to download and parse all of this just to run one function on my client’s mobile phone.

Put stuff on the face

Great! We now know the rotation and translation to apply to whatever we want to draw over the face.

So let’s do it! This couldn’t be easier.

We use three.js to create a scene, camera and an object.

Then we apply the rotation and translation given in the previous step to this object:

export const onResults = (
  q: THREE.Quaternion,
  x: number,
  y: number,
  z: number,
) => {
  threeObject.rotation.setFromQuaternion(q);
  threeObject.position.set(x, y, z);
};

We should set three.js's camera FOV to match that of the camera the picture was taken with.

But since we don't know it exactly, using an approximation is fine.

Using 45 degrees works fine if the video is square. Otherwise it will need to be corrected for the image's aspect ratio.
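One way to do that correction, assuming 45 degrees is taken as the horizontal FOV: keep it fixed and derive the vertical FOV (which is what three.js's PerspectiveCamera expects) from the aspect ratio. This is a sketch of the math, not the article's exact code:

```typescript
// three.js's PerspectiveCamera takes a *vertical* field of view in degrees.
// Given a fixed horizontal FOV, the vertical FOV for a non-square image
// follows from the aspect ratio (width / height) via the tangent relation.
export function verticalFov(horizontalFovDeg: number, aspect: number): number {
  const h = (horizontalFovDeg * Math.PI) / 180;
  const v = 2 * Math.atan(Math.tan(h / 2) / aspect);
  return (v * 180) / Math.PI;
}
```

For a square video (aspect 1) this returns the 45 degrees unchanged; for a wide 16:9 video it gives a smaller vertical FOV, as expected.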

Add colors to the effect

Once again, three.js comes to the rescue.
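The article doesn't show this step, but in three.js coloring the overlaid object is typically just a matter of giving its mesh a material. A minimal scene-setup sketch; the material choice, color, and geometry here are assumptions for illustration, not the author's actual effect:

```typescript
import * as THREE from "three";

// Hypothetical example: a simple colored, light-reactive material for the
// object drawn over the face. The color value is arbitrary.
const material = new THREE.MeshStandardMaterial({ color: 0xff6b6b });
const prop = new THREE.Mesh(new THREE.BoxGeometry(2, 0.5, 0.5), material);
// scene.add(prop); // add it to the same scene created earlier
```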

See it in action

Questions?