Behind the facial motion capture software that helped make Thanos

Recently, Digital Domain crafted a photoreal 3D version of Martin Luther King Jr. for TIME's VR experience called 'The March'. The 3D re-creation also appeared on the cover of TIME.

The VFX studio has, of course, a rich history in the field of digital humans and human-like characters, from films such as Benjamin Button through to Avengers: Infinity War and Endgame, as well as more recent immersive and interactive efforts such as 'DigiDoug' and, now, 'The March'.

With so many great CG characters coming out of Digital Domain, I wanted to look into what was at the center of their creation process. This turned out to be, as well as some insane artistry from the crew at the studio, DD's proprietary facial capture software called Masquerade. It's what the studio uses to take facial motion capture data from head-mounted cameras and transform it into high-resolution 3D data, which then, ultimately, informs the animation for a digital human or other CG character.

By looking back at some recent Digital Domain projects, and by talking to the studio's in-house Digital Human Group, I thought I would present a quick history of Masquerade and an overview of its particular attributes, including its machine learning / artificial intelligence (AI) features.

Masquerade, in brief

Briefly, Masquerade works like this: a series of static scans and high-resolution tracked facial performances of an actor are captured. The data from these systems reveals all the ways the actor's face can move. With this actor database, Masquerade can then use just a few images of the actor's facial performance to produce an accurate, high-resolution 3D version of the actor's face.
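To make that idea concrete, here is a toy sketch of the general principle, not Digital Domain's actual method: the linear shape basis, function names, and parameters below are all my own illustrative assumptions. If a database of scans yields a set of ways a face can deform, then a handful of observed marker positions are enough to solve for how much of each deformation is present, and those same weights rebuild the full-resolution mesh.

```python
import numpy as np

def fit_dense_face(basis_dense, marker_idx, markers, mean_dense):
    """Estimate a dense face mesh from sparse 3D markers.

    basis_dense: (K, V, 3) shape basis learned from the actor database
    marker_idx:  (M,) vertex indices where markers sit on the mesh
    markers:     (M, 3) observed marker positions for one frame
    mean_dense:  (V, 3) neutral (mean) face from the static scans
    """
    # Restrict the basis and the mean face to the marker vertices.
    A = basis_dense[:, marker_idx, :].reshape(len(basis_dense), -1).T  # (3M, K)
    b = (markers - mean_dense[marker_idx]).ravel()                     # (3M,)
    # Least-squares fit of basis weights to the sparse observations.
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Apply the same weights to the full-resolution basis.
    return mean_dense + np.tensordot(w, basis_dense, axes=1)
```

In practice the studio describes a machine-learning approach over a much richer database; this linear fit just shows why a strong prior over one actor's face shapes lets a few points drive a dense mesh.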

This differs somewhat from having the actor only do capture via a fixed camera rig (where they are often seated), since it means the actor can move around freely on set without losing the ability to capture a high-resolution dataset of their face.

Masquerade is a product created by DD's Digital Human Group, with major contributions from, and under the guidance of, software lead Lucio Moser.

A Masquerade timeline

Early development: Digital Domain presented Masquerade at SIGGRAPH 2017, where the concept of up-rezzing marker data was discussed. At this stage, Masquerade version 1 semi-manually tracked 3D marker points, which were then transformed into a high-resolution version of the actor's face.

Avengers: Infinity War: Digital Domain relied on Josh Brolin's HMC data to help create its version of the CG Thanos character in 2018's Avengers: Infinity War. Artist-driven correctives gave artists the ability to adjust the results of the system by adding more detail or changing the way the performance looks on the character.

Avengers: Endgame: By the time of Endgame in 2019, the studio had incorporated almost fully automated marker tracking. Also added was greater robustness to extreme lighting conditions on the set, as well as to fast head movement during the performance, which would displace the helmet. These two combined improvements gave more freedom to the actor and the design of the recording set.

Masquerade Live: Concurrently, Digital Domain was at work on a real-time version of the toolset called Masquerade Live, spearheaded by senior software engineer Mark Williams, which jumped on new deep learning innovations (see more on this below). This was utilized for DigiDoug and for a real-time Pikachu that was driven by Ryan Reynolds as part of the Detective Pikachu publicity run.

‘The March’: This 2020 release of the Martin Luther King Jr. TIME project saw Masquerade embrace fully automated marker tracking and automated eye gaze tracking.

Why Masquerade was created in the first place

Typically, when an actor is driving a digital character, their facial motion is captured using an HMC system, with markers painted on the actor's face. Previously, these markers were tracked by hand and would drive a 3D face rig of the character. From Digital Domain's point of view, this system worked, but it did not capture all the subtlety of facial motion that goes into an actor's performance. Data from this system would still have to be massively modified and adjusted before it matched the intent of the performance.

So, the studio's aim with Masquerade was to create machine-learning software that would use the same images from the portable helmet system, but automatically produce high-resolution, accurate geometry of the actor's face down to each wrinkle, instead of just a few moving points. This let all the detail in the actor's facial performance be captured, with the resulting 3D data then used to translate the performance to the actor's digital self or character, preserving as much of the performance as possible.

The AI side of Masquerade

DD runs down, here, how Masquerade uses AI:

1. To automatically track the markers and generate their 3D positions from the HMC images – previously a very laborious task

2. To automatically figure out how the actor's head is moving in relation to the helmet, a process also known as head stabilization. This removes from the performance any motion that is not part of the actor's facial performance, such as the camera bouncing from rapid head movements

3. To automatically figure out which markers are missing/occluded and fill them in with the correct motion they should be making

4. To up-rez marker geometry to final high-resolution geometry. For Masquerade, the actor is captured in a seated capture system that tracks thousands of points on their face (this is separate from the HMC capture). This system gives us a large database of how the actor's face moves, as well as all the shapes that the actor can make with their face. Masquerade then uses machine learning to take the low-res 3D markers from the actor's face and that large database to produce high-resolution data for every frame of their HMC performance

5. To enable artist-driven correctives – an actor's face is one of the main drivers for a 3D character. However, sometimes there's a desire to change the performance a bit to insert artistic changes. Maybe the actor smiled in one way, but the character needs to smile in a different, more menacing way. Masquerade has a machine learning component that lets artists train the system to change an actor's expression into a specific character expression

6. To assist with eye gaze and eyelid tracking
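Of these steps, head stabilization (step 2) is perhaps the easiest to sketch in code. A standard generic approach, shown here as my own illustration rather than DD's implementation, is the Kabsch algorithm: find the rigid rotation and translation that best align a frame's stable markers (say, on the forehead or nose bridge) to a neutral reference, then apply that transform to every marker so only non-rigid facial motion remains.

```python
import numpy as np

def stabilize(frame, reference, stable_idx):
    """Remove rigid head/helmet motion from one frame of 3D markers.

    frame:      (N, 3) marker positions for the current frame
    reference:  (N, 3) marker positions in a neutral, stabilized pose
    stable_idx: indices of markers assumed to move only rigidly
    """
    P = frame[stable_idx]
    Q = reference[stable_idx]
    # Center both point sets on their centroids.
    Pc, Qc = P - P.mean(0), Q - Q.mean(0)
    # Kabsch: optimal rotation via SVD of the cross-covariance matrix.
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    d = np.sign(np.linalg.det(U @ Vt))   # guard against reflections
    R = U @ np.diag([1.0, 1.0, d]) @ Vt
    # Move the whole frame into the reference's coordinate system.
    return (frame - P.mean(0)) @ R + Q.mean(0)
```

After this, any remaining per-marker motion relative to the reference is facial deformation rather than the camera or helmet bouncing around.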

The difference between offline and real-time

A character like Thanos was created with the ‘offline’ version of Masquerade, while DigiDoug represents the real-time Masquerade (known as Masquerade Live). So, how does Masquerade change from offline to real-time?

Masquerade Live is based on deep learning and requires data for its training. That data is provided by Masquerade itself. Digital Domain starts the training process by capturing an actor running through a range of performances, which gives them accurate, super high-fidelity facial data. The studio uses this data to train deep learning networks, from which the system learns how to take a single image of the actor's face and create a high-resolution moving version in real-time, at 60 fps. Masquerade Live can also be trained to recover secondary geometry, such as jaw bone, eyebrow, and eyelash data, as well as predict blood flow for specific poses.
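As a toy stand-in for that training process – a tiny network on synthetic data, nowhere near the scale or architecture of the real system, with every name and parameter below my own assumption – the image-to-geometry mapping can be sketched as a small network fit by gradient descent:

```python
import numpy as np

def train_image_to_mesh(images, meshes, hidden=64, lr=0.05, steps=300, seed=0):
    """Fit a tiny one-hidden-layer MLP mapping a flattened HMC image
    to flattened mesh vertex positions.

    images: (N, P) flattened grayscale frames
    meshes: (N, D) corresponding flattened vertex positions
    Returns (predict_fn, losses).
    """
    rng = np.random.default_rng(seed)
    P, D = images.shape[1], meshes.shape[1]
    W1 = rng.normal(0.0, 1.0 / np.sqrt(P), (P, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, D)); b2 = np.zeros(D)
    losses = []
    for _ in range(steps):
        h = np.tanh(images @ W1 + b1)          # forward pass
        pred = h @ W2 + b2
        err = pred - meshes
        losses.append(float((err ** 2).mean()))
        # Backpropagate the mean-squared error.
        g_pred = 2.0 * err / err.size
        gW2, gb2 = h.T @ g_pred, g_pred.sum(0)
        g_h = (g_pred @ W2.T) * (1.0 - h ** 2)
        gW1, gb1 = images.T @ g_h, g_h.sum(0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    predict = lambda img: np.tanh(img @ W1 + b1) @ W2 + b2
    return predict, losses
```

Once trained, each new frame is a single forward pass – which is what makes the real-time, 60 fps inference side of a system like this feasible.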

The future of Masquerade

Digital Domain can't reveal too much about which upcoming projects have incorporated Masquerade. They did mention an undisclosed Triple-A title and two feature films. New developments with Masquerade have given it the ability to understand how the markers may be painted onto an actor slightly differently every day.

The studio has also been overhauling its pipeline to process, it says, 10 times more data in just a few weeks than was used on the Avengers films. Tracking subtle facial motions and skin dynamics has also been part of the new developments.