In this tutorial you will learn how to build this webcam face filter:

You can try it yourself by clicking here!

Introduction

Some native mobile applications such as Snapchat, Facebook Lenses or MSQRD have popularized webcam face filters. There is even creative studio software to easily create face filters without requiring special development knowledge. There is still time for you to take the blue pill proposed by Morpheus, launch Facebook AR Studio and stop this tutorial. Do you want to continue?

So take the red pill and persevere when we deal with WebGL! Your face filter will no longer be imprisoned in the Matrix: you will be able to embed it everywhere thanks to the miracle of JavaScript.

Red or blue pill?

You can test the Matrix face filter here: LIVE DEMO

If you do not have a webcam, a video screenshot is available here: YOUTUBE VIDEO

You can download the final project from this tutorial here.

Get started

To complete this tutorial, you must have a local HTTP server installed. To deploy the project on a different domain than localhost, you should host it on a secure HTTP server (HTTPS); otherwise the web browser will not allow webcam access. We start with an index.html file containing the following code:

<!DOCTYPE html>
<html>
<head>
  <script src='main.js'></script>
</head>
<body onload='main()' style='margin: 0px'>
  <canvas id='matrixCanvas' style='transform: rotateY(180deg);'></canvas>
</body>
</html>

The face filter will be rendered in the <canvas> element. The CSS transform property rotateY(180deg) flips the image horizontally to display it mirrored. We include the main.js script, which contains the entry point function main() . It resizes the <canvas> to fill the screen:

function main(){
  var cv = document.getElementById('matrixCanvas');
  cv.setAttribute('width', window.innerWidth);
  cv.setAttribute('height', window.innerHeight);
}

Webcam access and face detection

We use Jeeliz FaceFilter to access the webcam video feed and to detect the face and its orientation. This library uses a deep learning neural network to detect, from an image, whether it is a face, the face rotation, its translation and the opening of the mouth. It runs fully client side, on the GPU through WebGL. Although it is possible to input a <video> element, we advise accessing the user's webcam through FaceFilter: many polyfills and workarounds are applied to deal with the implementation quirks of WebRTC across the different web browsers and devices.

We include FaceFilter in the <head> section of index.html :

<script src='https://appstatic.jeeliz.com/faceFilter/jeelizFaceFilter.js'></script>

And we initialize FaceFilter in main.js , at the end of the main() function:

JEEFACEFILTERAPI.init({
  canvasId: 'matrixCanvas',
  // path of NNC.json, which is the neural network model:
  NNCpath: 'https://appstatic.jeeliz.com/faceFilter/',
  callbackReady: function(errCode, initState){
    if (errCode){
      console.log('AN ERROR HAPPENS BRO =', errCode);
      return;
    }
    console.log('JEEFACEFILTER WORKS YEAH !');
    init_scene(initState);
  }, // end callbackReady()
  callbackTrack: callbackTrack
});

function init_scene(initState){
  // empty for now
}

We define the callbackTrack() function after the main function. It is called each time the detection loop is executed, approximately 50 times per second. Its argument, detectState , is a dictionary storing the result of the face detection:

function callbackTrack(detectState){ console.log(detectState.detected); }

Let's test! Launch the code and open the web browser JavaScript console. Hide the camera with your hand. The value logged in the console, detectState.detected , should be around 0 . Then remove your hand and stand in front of the webcam. The logged value should climb to 1 because your face is detected.
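Because detectState.detected is a confidence value between 0 and 1 rather than a boolean, a filter usually applies a threshold with hysteresis to avoid flickering between "detected" and "lost" states. A minimal sketch of that idea (the threshold values and function names here are our own, not part of the FaceFilter API):

```javascript
// Hysteresis on the detection confidence: switch on above onThreshold,
// switch off below offThreshold, hold the previous state in between.
function createDetectionFilter(onThreshold, offThreshold) {
  var isDetected = false;
  return function (detectedValue) {
    if (!isDetected && detectedValue > onThreshold) isDetected = true;
    else if (isDetected && detectedValue < offThreshold) isDetected = false;
    return isDetected;
  };
}

var updateDetection = createDetectionFilter(0.7, 0.3);
// In callbackTrack we could call: updateDetection(detectState.detected)
```

This way a confidence hovering around 0.5 does not toggle the filter on and off at every frame.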

Dive into the third dimension

Now we will add the third dimension with WebGL. Do not panic, the 3D JavaScript engine THREE.JS will help us a lot and it won’t be so complicated!

Video display

The Matrix is a 2D flat world, so we need to add the third dimension. In index.html , we include in the <head> section the THREE.js 3D engine and the THREE.js specific FaceFilter helper:

<script src="https://cdnjs.cloudflare.com/ajax/libs/three.js/91/three.js"></script>
<script src="https://appstatic.jeeliz.com/faceFilter/JeelizThreejsHelper.js"></script>

The helper will create the scene using THREE.js, convert the 2D coordinates of the face detection window into 3D coordinates of the scene, create an object per detected face and manage the position of the pivot point for each face. In main.js we fill in the init_scene() function:

var THREECAMERA;

function init_scene(initState){
  var threeInstances = THREE.JeelizHelper.init(initState);
  // create a camera with a 20° FoV:
  var cv = initState.canvasElement;
  var aspectRatio = cv.width / cv.height;
  THREECAMERA = new THREE.PerspectiveCamera(20, aspectRatio, 0.1, 100);
}

In the callbackTrack function, we render the scene with the THREECAMERA camera on the <canvas> element by replacing the console.log statement with:

THREE.JeelizHelper.render(detectState, THREECAMERA);

Test your code. You should see the webcam video feed in full screen.

The raining code

In this section we replace the webcam video by the famous Matrix raining code video. The video from the webcam is currently an instance of THREE.Mesh created and added to the scene by THREE.JeelizHelper . In the init_scene function, after creating the camera we first initialize a video texture from the .MP4 file of falling lines of code:

var video = document.createElement('video');
video.src = 'matrixRain.mp4';
video.setAttribute('loop', 'true');
video.setAttribute('preload', 'true');
video.setAttribute('autoplay', 'true');

var videoTexture = new THREE.VideoTexture(video);
videoTexture.magFilter = THREE.LinearFilter;
videoTexture.minFilter = THREE.LinearFilter;

threeInstances.videoMesh.material.uniforms.samplerVideo.value = videoTexture;

The videoTexture.magFilter and videoTexture.minFilter parameters specify how to compute the texel color of the texture at the exact requested position. THREE.NearestFilter returns the color of the closest texel: it is faster but can cause pixelation artifacts. THREE.LinearFilter specifies a linear interpolation between the texels neighboring the requested position.
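To make the difference concrete, here is a CPU sketch of bilinear filtering (what THREE.LinearFilter computes) on a tiny grayscale texture. This is an illustration of the principle, not THREE.js code:

```javascript
// Bilinear lookup: interpolate between the 4 texels surrounding (x, y).
// `texture` is a 2D array of grayscale values, (x, y) in texel units.
function sampleLinear(texture, x, y) {
  var x0 = Math.floor(x), y0 = Math.floor(y);
  var x1 = Math.min(x0 + 1, texture[0].length - 1);
  var y1 = Math.min(y0 + 1, texture.length - 1);
  var fx = x - x0, fy = y - y0;
  // interpolate horizontally on the two rows, then vertically:
  var top = texture[y0][x0] * (1 - fx) + texture[y0][x1] * fx;
  var bottom = texture[y1][x0] * (1 - fx) + texture[y1][x1] * fx;
  return top * (1 - fy) + bottom * fy;
}
```

Sampling halfway between a black texel (0) and a white texel (1) returns 0.5, whereas a nearest-neighbor lookup would snap to one of the two, producing the blocky pixelation mentioned above.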

Run the code. The famous video of the falling lines of code appears in full screen.

Now we have access to the two videos: the webcam feed and the raining code. We need to combine them using a head-shaped 3D mesh and a specific shading. We will keep the raining code video as the background for the rest of the tutorial.

Importing the mask

We import a face mesh that will follow the head. We will later distort the raining code video using a specific rendering. We insert at the end of the init_scene function:

new THREE.BufferGeometryLoader().load('maskMesh.json', function(maskGeometry){
  maskGeometry.computeVertexNormals();
  var maskMaterial = new THREE.MeshNormalMaterial();
  var maskMesh = new THREE.Mesh(maskGeometry, maskMaterial);
  threeInstances.faceObject.add(maskMesh);
});

maskMesh.json contains a mesh created by exporting the maskMesh.blend 3D model provided in the tutorial ZIP archive using the THREE.js Blender exporter addon. This exporter is no longer included in the main THREE.js release, so we have included it in the FaceFilter GitHub repository here. More information about export options is available here.

Using the THREE.js export plugin from Blender yields lightweight meshes that are fast to parse in JavaScript thanks to JSON.

The mesh is loaded and a callback function is executed asynchronously with an instance of THREE.BufferGeometry as argument. First, its normals are computed in order to be able to compute coherent lighting. Then an instance of THREE.Mesh is created, maskMesh . A THREE.Mesh is a geometry rendered with a specific material at a given position in space. For debugging, we recommend applying a THREE.MeshNormalMaterial right after importing the mesh: it makes it easy to troubleshoot any import, topology, normal, positioning or scale issue before focusing on the rendering. Finally, we add the THREE.Mesh to the object that follows the detected face, returned by the helper, by running threeInstances.faceObject.add(maskMesh) .

Test your code. The face mesh should follow your face.

The mask follows the detected face and is hidden if the detection is lost. Nailed it! We only have a little graphics work left!

The face mask material

About the shaders

We replace the material of the face with a material displaying a combination of:

the texture of the displaced background (the raining lines of code distorted by the 3D mask),

the webcam video texture with altered colors.

Ready to dive into WebGL? Brace yourself, there will be GLSL!

We define this material by two programs executed by the graphics processing unit (GPU), called shaders:

the vertex shader is executed for each point of the mesh. It computes the coordinates of the point in the camera coordinate system, then projects it in 2D onto the rendering area (called the viewport);

the fragment shader is executed at least once per pixel of the final rendering. It computes the color of the pixel. If antialiasing is enabled (the default behaviour), it is called multiple times per rendered pixel near object borders. This refines the color of the edges by oversampling and therefore reduces aliasing.
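Conceptually, the fragment shader is just a function the GPU runs in parallel for every covered pixel. A CPU analogy in JavaScript makes this clear (purely illustrative; renderPass and its arguments are our own invention, not a WebGL API):

```javascript
// CPU analogy of a render pass: run a "fragment shader" function
// once per pixel of a width x height viewport, collecting the results.
function renderPass(width, height, fragmentShader) {
  var pixels = [];
  for (var y = 0; y < height; y++) {
    for (var x = 0; x < width; x++) {
      // gl_FragCoord-like input: coordinates of the pixel center
      pixels.push(fragmentShader(x + 0.5, y + 0.5));
    }
  }
  return pixels;
}

// Example: a "shader" returning the normalized horizontal coordinate,
// producing a left-to-right gradient over a 4x1 viewport.
var image = renderPass(4, 1, function (x, y) { return x / 4; });
```

On a real GPU the per-pixel calls run massively in parallel instead of in a loop, which is why shaders must be independent of each other.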

Both shaders are declared in JavaScript as GLSL code strings, a language whose syntax is close to C. With THREE.js, in most cases we do not need to deal with the shaders: as soon as we create a material, a pair of shaders is automatically created, compiled and used to render it. But in our case no predefined material is appropriate, so we must dive into the heart of the Matrix and declare our own shaders.

Our material will use two textures: the greenish texture of the raining code and the webcam video texture. It will be an instance of THREE.ShaderMaterial , as it will be applied to an object inserted into the 3D scene. THREE.js has two types of materials that require specifying the source code of the shaders:

THREE.RawShaderMaterial : this type of material has nothing predeclared, you will not find the usual 3D projection matrices ( projectionMatrix , modelViewMatrix …). This is useful for computing on GPU (GPGPU), postprocessing or specific applications;

THREE.ShaderMaterial : this type of material already handles the 3D projection matrices and the default attributes of the points (normals, UV coordinates …). If you need a material that changes the appearance of a 3D object in the scene, this is the better choice.

Each shader must have a void main(void) function. It is executed for each projected point in the vertex shader, and for each rendered pixel in the fragment shader. This function does not return any value, but must assign a prebuilt variable:

gl_Position for the vertex shader which is the position of the point in clipping coordinates in the viewport,

gl_FragColor for the fragment shader, which is the color of the pixel in normalized RGBA format (each component must be between 0 and 1 ).

Our first shader

We replace the line assigning maskMaterial with:

var maskMaterial = new THREE.ShaderMaterial({
  vertexShader: "\n\
    void main(void){\n\
      #include <beginnormal_vertex>\n\
      #include <defaultnormal_vertex>\n\
      #include <begin_vertex>\n\
      #include <project_vertex>\n\
    }",

  fragmentShader: "precision lowp float;\n\
    uniform vec2 resolution;\n\
    uniform sampler2D samplerWebcam, samplerVideo;\n\
    void main(void){\n\
      vec2 uv = gl_FragCoord.xy / resolution;\n\
      vec3 colorWebcam = texture2D(samplerWebcam, uv).rgb;\n\
      vec3 finalColor = colorWebcam;\n\
      gl_FragColor = vec4(finalColor, 1.); // 1 for the alpha channel\n\
    }",

  uniforms: {
    samplerWebcam: {value: THREE.JeelizHelper.get_threeVideoTexture()},
    samplerVideo: {value: videoTexture},
    resolution: {value: new THREE.Vector2(initState.canvasElement.width,
                                          initState.canvasElement.height)}
  }
});

The shaders vertexShader and fragmentShader are declared as GLSL code strings. Each GLSL line ends with a \n newline character followed by a backslash \ , which continues the JavaScript string literal on the next line. Most of the material customization will occur in the fragment shader. In the vertex shader we use shader chunks: for example, #include <project_vertex> will be replaced by the string stored in THREE.ShaderChunk.project_vertex . This trick allows THREE.js to reuse portions of code between several different kinds of materials.
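The chunk mechanism itself is a plain string substitution performed before the shader is compiled. A simplified re-implementation of the idea (the real logic lives inside THREE.js; the chunk content below is illustrative, not the actual THREE.ShaderChunk source):

```javascript
// Simplified version of THREE.js shader chunk resolution: replace every
// "#include <name>" directive with the chunk registered under that name.
function resolveIncludes(glslSource, shaderChunks) {
  return glslSource.replace(/#include <(\w+)>/g, function (match, name) {
    // leave unknown includes untouched so the GLSL compiler reports them
    return shaderChunks[name] !== undefined ? shaderChunks[name] : match;
  });
}

// Illustrative chunk registry standing in for THREE.ShaderChunk:
var chunks = { project_vertex: 'gl_Position = projectionMatrix * mvPosition;' };
var source = 'void main(void){\n#include <project_vertex>\n}';
var resolved = resolveIncludes(source, chunks);
```

After resolution, the vertex shader sent to the GPU contains the expanded chunk code instead of the #include directives.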

The fragment shader fetches the texture color of the webcam video feed and stores it in colorWebcam . Then it displays this color on the viewport by assigning it to gl_FragColor . The variable gl_FragCoord is a read-only built-in GLSL variable storing the coordinates of the rendered pixel. By dividing its first two coordinates by the resolution of the viewport, we obtain normalized UV texture coordinates between 0 and 1 .

Uniform variables are used to pass values from JavaScript to the shaders. We use them for the two textures, as well as for the resolution of the <canvas> in pixels. You can test the code again.

Our first custom shader!

Mask positioning

The maskMesh face mesh is incorrectly positioned relative to its parent object, which is why the face appears offset from the mask in the previous rendering. We move it by adding, after var maskMesh = new THREE.Mesh(maskGeometry, maskMaterial) :

maskMesh.position.set(0,0.3,-0.35);

These coordinates were found by making maskMesh global (by running window.maskMesh = maskMesh ) and testing multiple positions in the web browser's JavaScript console. Since the mesh is centered along the horizontal axis (right/left), the first coordinate is 0 . The second coordinate (Y) is the offset along the vertical axis: 0.3 moves the mask slightly upwards. This value was determined by looking at the webcam from the front. The last coordinate, Z, is the offset along the depth axis: a negative value moves the mask away from the camera. This value was adjusted by turning the head alternately right and left.

Luckily, the mesh scale is fine. Otherwise, we could have enlarged or shrunk the mesh by modifying maskMesh.scale in the same way as the position.

The result is better than the previous one. We are almost there!

Still some pipes to plug…

In fragmentShader , replace the line vec3 finalColor=colorWebcam; with:

vec3 colorLineCode = texture2D(samplerVideo, uv).rgb;
vec3 finalColor = colorWebcam + colorLineCode;

Let's test the result. GLSL development is very iterative and lends itself well to live coding. Debugging algorithmic errors is often hard: it is not possible to insert breakpoints or to step through the execution. So the best way to avoid long and painful debugging sessions is to test the rendering regularly. That's what we do.

The lines of code look like prison bars, but the textures are both here!

In order to calculate the final color in the fragment shader, we need some additional variables:

vNormalView : normal vector to the point in the camera coordinate system,

vPosition : position vector of the point in the mask coordinate system.

Since these values are defined per point, we assign them in the vertex shader and retrieve them, interpolated, for each pixel in the fragment shader. Each pixel belongs to a single triangular face. When the fragment shader runs for a pixel, the value of vNormalView , for example, is interpolated between the vNormalView values of the 3 vertices of the triangular face, weighted by the distance from the point to each vertex.
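The interpolation the GPU performs on varyings can be written explicitly using barycentric weights. A sketch for a scalar varying (illustrative; the GPU additionally applies perspective correction, which is omitted here):

```javascript
// Interpolate a per-vertex value at a point inside a triangle, given the
// barycentric weights (w0 + w1 + w2 = 1) of that point relative to the
// triangle's 3 vertices. Each varying component is blended this way.
function interpolateVarying(v0, v1, v2, w0, w1, w2) {
  return v0 * w0 + v1 * w1 + v2 * w2;
}
```

At a vertex, one weight is 1 and the others are 0, so the fragment shader receives exactly the value assigned in the vertex shader; at the triangle's centroid, all three weights are 1/3.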

We declare these values in the same way in both shaders, just before the void main(void) :

varying vec3 vNormalView, vPosition;

And in the vertex shader, we assign them at the end of the main function:

vNormalView = vec3(viewMatrix * vec4(normalize(transformedNormal), 0.));
vPosition = position;

transformedNormal , viewMatrix and position are either already declared by THREE.js because we use a THREE.ShaderMaterial instead of a THREE.RawShaderMaterial , or are computed in the included shader chunks.

Let’s add a bit of refraction

We want the mask to distort the lines of code. For this purpose, we act as if the mask applied Snell-Descartes refraction to the lines of code. The incident ray has the unit vector vec3(0.,0.,-1.) because the Z axis (3rd coordinate) is the depth axis and it points towards the rear of the camera. We work in the camera (view) coordinate system, where the normal at the point is vNormalView . We use the GLSL refract function to compute the direction vector of the refracted ray. Its last argument is the ratio of refractive indices: 0.3 would correspond to going from air (refractive index 1.0 ) into a material even more refractive than diamond, with a refractive index of 3.33 .
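The GLSL refract built-in implements Snell's law in vector form. Here is the same formula transcribed to JavaScript for 3-component vectors, to make the computation explicit (an illustration of the GLSL specification, not code to add to the filter):

```javascript
// Dot product of two 3-component vectors represented as arrays:
function dot3(a, b) { return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]; }

// JavaScript transcription of GLSL refract(I, N, eta):
// I: incident unit vector, N: unit normal, eta: ratio of refractive indices.
function refract(I, N, eta) {
  var nDotI = dot3(N, I);
  var k = 1 - eta * eta * (1 - nDotI * nDotI);
  if (k < 0) return [0, 0, 0]; // total internal reflection
  var s = eta * nDotI + Math.sqrt(k);
  return [eta*I[0] - s*N[0], eta*I[1] - s*N[1], eta*I[2] - s*N[2]];
}
```

For a head-on ray (incident vector opposite to the normal) the direction is unchanged, which is why only the tilted borders of the mask displace the background.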

In the fragment shader, replace vec3 colorLineCode=texture2D(samplerVideo,uv).rgb with:

vec3 refracted = refract(vec3(0., 0., -1.), vNormalView, 0.3);
vec2 uvRefracted = uv + 0.1 * refracted.xy;
vec3 colorLineCode = texture2D(samplerVideo, uvRefracted).rgb;

Let’s try it! There is now an interaction with the lines of code. They are dynamically deformed by the face.

The lines of code are deformed in front of the face.

Coloring the webcam video

We want to tint the video from the webcam green. Just after fetching the color of a texel of the webcam video with vec3 colorWebcam=texture2D(samplerWebcam,uv).rgb , we insert a line computing the value (i.e. brightness) of the color:

float colorWebcamVal=dot(colorWebcam, vec3(0.299,0.587,0.114));

The vector vec3(0.299,0.587,0.114) holds the luma coefficients: they weight the RGB components in a way similar to the human eye's sensitivity (see the Wikipedia article about grayscale conversion).
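The dot product above is just a weighted sum; written out in JavaScript with the same Rec. 601 luma coefficients:

```javascript
// Perceived brightness (luma) of an RGB color, components in [0, 1]:
function luma(r, g, b) {
  return 0.299 * r + 0.587 * g + 0.114 * b;
}
```

Note that pure green reads as much brighter (0.587) than pure blue (0.114), matching how the eye perceives those colors.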

Then we reassign colorWebcam to colorWebcamVal * <green color> * <light intensity> :

colorWebcam=colorWebcamVal*vec3(0.0,1.5,0.0);

Test the code. The effect is not very aesthetic: the color is too green. The red and blue components of the colorWebcam vector are always zero because of the applied formula, so the brightest color is green instead of white.

We have built a Martian simulator!

We therefore add white lighting when the value exceeds the threshold of 0.3 , saturating above 0.6 :

colorWebcam+=vec3(1.,1.,1.)*smoothstep(0.3,0.6,colorWebcamVal);
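smoothstep is a GLSL built-in; transcribing it to JavaScript makes the thresholding behavior easy to inspect (illustrative transcription, not code to add to the filter):

```javascript
// JavaScript transcription of GLSL smoothstep(edge0, edge1, x):
// returns 0 below edge0, 1 above edge1, with a smooth Hermite
// ramp t*t*(3 - 2*t) in between.
function smoothstep(edge0, edge1, x) {
  var t = Math.min(Math.max((x - edge0) / (edge1 - edge0), 0), 1);
  return t * t * (3 - 2 * t);
}
```

With the values used above, a dark pixel (value below 0.3) gets no white added, a bright pixel (above 0.6) gets full white, and the transition in between is smooth rather than a hard cut.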

Colors are good. But it is not finished yet…

Fixing border effects

Border effects ruin the rendering: there is no transition between the mask and the background. Removing them is often the most difficult part of designing this kind of face filter. It is crucial because these artifacts break the coherence of the scene and produce a cut-and-paste collage effect worthy of a kindergarten pupil.

The first step in reducing border effects is to compute coefficients which locate the borders smoothly. Rather than performing a complex calculation to determine a single coefficient, it is better, for debugging and code simplicity, to compute several of them, each applying to a different border. So we compute, at the beginning of the fragment shader, just after void main(void) :

float isNeck = 1. - smoothstep(-1.2, -0.85, vPosition.y);
float isTangeant = pow(length(vNormalView.xy), 2.);
float isInsideFace = (1. - isTangeant) * (1. - isNeck);

isNeck equals 1 on the neck and 0 elsewhere. The neck is characterized by a position along the vertical axis (Y) below a threshold. Like the other border coefficients, it is preferable that it varies gradually to avoid introducing new border effects when we use it,

isTangeant is 1 when the face is tangent to the view and 0 when it is facing the view. We amplify the effect by applying a simple easing with the pow function,

isInsideFace is 1 if the rendered pixel is inside the face, and tends to 0 at the border of the face.

For debugging only, at the end of the main function of the fragment shader, we check the relevance of our coefficients by adding:

gl_FragColor=vec4(isNeck, isTangeant, 0.,1.);

And we get this rendering:

The neck, materialized by isNeck , is red, and the rest of the edges of the face, materialized by isTangeant , are green.

We comment out the debug rendering statement; it has allowed us to check our border coefficients. Now we use them to implement a smooth transition between the background and the mask.

In order to remove the break in the lines of code between the mask and the background, we insert just before vec3 colorLineCode=texture2D(samplerVideo,uvRefracted).rgb :

uvRefracted=mix(uv, uvRefracted, smoothstep(0.,1.,isInsideFace));

We use smoothstep(0.,1.,isInsideFace) instead of isInsideFace directly to avoid tangential discontinuities in the lines of code: it adds a double easing.
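mix is GLSL's linear interpolation built-in; like smoothstep, it is easy to transcribe to JavaScript to see what the blending line does (illustrative transcription):

```javascript
// JavaScript transcription of GLSL mix(a, b, t): linear interpolation,
// returning a when t = 0, b when t = 1, and a blend in between.
function mix(a, b, t) {
  return a * (1 - t) + b * t;
}
```

Applied component-wise to the UV coordinates, it returns the undistorted uv outside the face (blend factor 0), the fully refracted uvRefracted inside it (blend factor 1), and a gradual blend at the border.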

Then, to remove the particularly unsightly border effects caused by brightness differences at the periphery of the mask, we replace vec3 finalColor=colorWebcam+colorLineCode with:

vec3 finalColor=colorWebcam*isInsideFace+colorLineCode;

That's it, here we are! So, was it worth following the white rabbit?

Conclusion

I hope you have enjoyed this tutorial and that it gave you the desire to make your own face filters. THREE.js and GLSL programming are each the subject of whole books, and we only had time to skim over them. Fortunately, there are plenty of resources and tutorials online. Here are some links if you want to dive deeper into the Matrix: