Yes, my AI face-swap attempts might show how hard it is to make a deepfake – but it’s getting easier every day

MPs from the House of Commons inquiry into fake news were warned last week of a new AI technology that is about to change the world, and not for the better.

“We’re rapidly moving into an era where the Russians, or any other adversary, can create our public figures saying or doing things that are disgraceful or highly corrosive to public trust,” Edward Lucas, the senior vice president of the Centre for European Policy Analysis, told MPs. “And we’re not remotely ready for this.”

Lucas was talking about so-called deepfakes, which he described as “audio and video that look and sound like the real person, saying something that that person never has”.

Less than three months ago, producing such videos was a laborious process requiring a video editor, vast amounts of reference footage and years of experience. But in the first few months of this year, the technology has exploded into public availability.

I wanted to test the warnings of Lucas, and those before him, to find out how near we are to this brave new world. I found that while it’s true it is now easier than ever to generate a convincing fake video featuring two faces swapped, it’s still not exactly easy.

Face-swapped porn

Show a neural network enough examples of faces from two celebrities and it’ll develop its own mental model of what they look like, capable of generating new faces with specific expressions.

Ask it to generate a set of expressions on one face that are mapped onto a second face, and you have the beginnings of a convincing, automatically generated, utterly fake video. And so, naturally, the internet created a lot of porn.

In December 2017, just one person was doing this, by hand, posting his results to Reddit. By the start of January an app had been created, offering an easy-to-use piece of software to automate the process. By February, more than 90,000 people were members of a community dedicated to creating and sharing the explicit clips.



But “easy to use” can mean a lot of different things in tech communities, from “this command line tool won’t erase your hard-drive if you make a typo” to “shout ‘play Kesha’ at our ominous black cylinder and it will do as you say”. In January, FakeApp was closer to the first of those than the second; its cheery name suggested iPhone levels of ease, but it was a horrible piece of software.

Borrowing a workstation computer from Dell was the first step to getting it to run. Like a lot of AI tech, FakeApp is designed to repurpose gaming PCs with beefy graphics processors, which the Guardian sorely lacks. With the hardware sorted, the fight with the software could begin.

Trump on Johnson v Thatcher on May

An abortive Faceswap of Boris Johnson and Donald Trump.

As one might expect from an anonymous developer’s app built to enable would-be pornographers to violate bodily autonomy on an industrial scale, it’s not particularly easy to use.

Unlabelled options are everywhere, the only tutorial is a 20-minute YouTube video and it’ll chug for five minutes before simply failing with no explanation if you’ve done something wrong. And that’s if you’re lucky. If you’re not, you won’t find out for another eight to 24 hours of training, after which the system produces something that looks like this:

My Donald Trump on Boris Johnson test had failed in a number of ways. For one, it ran out of disk space halfway through. While the output of FakeApp is startlingly efficient, as with many neural networks, the process of training the network uses a huge amount of storage.

But it also turns out Johnson and Trump aren’t as compatible as newspaper cartoonists might have you think. Trump’s bright orange skin tone means it was always going to be odd to plaster his face on Johnson’s pasty-white body. What’s more, Trump’s inability to stand still or talk to camera means that the system is unable to spot his face for about half the frames, because it’s hidden by his hand or facing away from the camera.

Eventually, I had more success. A clip of Margaret Thatcher, taken from a 15-minute televised interview from the 1980s, worked better when pasted on to a short clip of Theresa May shot in February. The result looks, bafflingly, a lot like Victoria Wood:

A better faceswap of Theresa May and Margaret Thatcher.

While better, this swap still has issues. In Thatcher’s interview, she talks to an interviewer sitting off-screen rather than directly to the camera. That means the neural network never really learns what she looks like face on, resulting in some extremely odd frames in the final video.

For Edward Lucas those oddities are moot. He told MPs: “It’s all a little bit clunky. It works better in black and white than in colour. But in a way, slightly grainy clunky black and white video is more convincing, because it looks like something people might have shot on their phone.”

Morally grey and hard to reconcile

The limitations of current technology aren’t going to last long. Neural network researchers are building systems that can be trained on less and less information, and systems that can swap more than just faces but also heads, whole bodies and even voices. Among professionals, such swaps have become almost a party trick, done to show off how quickly their system works, or how much more believable it is than the competition.

Jack Clark (@jackclarkSF): “Here is the founder of SenseTime, Xiao'ou Tang, showing off a Trump<>Obama generative AI faceswap at launch for an MIT AI initiative that SenseTime is helping to fund. 2018 is delivering in terms of AI-geopolitics weirdness!” https://t.co/emK9laObfu pic.twitter.com/iTHy5lVmwJ

At the other end of the spectrum, FakeApp and its ilk are getting better and easier to use by the day. During our month-long testing of FakeApp, it evolved from a jury-rigged bundle of command-line utilities to a fairly slick one-button application. Its community has survived being thrown off almost every social network going, usually for the creation of “involuntary pornography”, and regrouped on a forum run by the app’s (anonymous) developers.

They see themselves as harbingers of the fake news future, doing the world a service by spreading awareness of how easy it is to create convincing forgeries. “What we do here isn’t wholesome or honorable, it’s derogatory, vulgar, and blindsiding to the women that deepfakes works on,” one user wrote, in a widely shared post now deleted along with the subreddit it was shared on.

“That said, the work that we create here in this community is not with malicious intent. Quite the opposite. We are painting with revolutionary, experimental technology, one that could quite possibly shape the future of media and creative design.

“No matter what happens, this technology was going to become a reality. Nothing could stop that. And ironically enough, the safest hands for it to be in might just be the general public with the power to desensitise it, rather than an exclusive few, with the power to exploit it.”

By this reading, the deepfakes crew are information accelerationists, hastening the demise of our shared reality to ensure that the painful transitory period, when no one can be sure what is real and what isn’t, is over as quickly as possible. If everyone and their dog knows about deepfakes, then maybe scepticism will reign and Lucas’s fears won’t be realised.

That morally grey view is hard to reconcile with the way the community talks among itself when it thinks no one is listening. Their focus tends to be directly on the pornography – with the celebrity face-swaps being only the tip of the iceberg. Frequent requests for help are made for producing face-swaps of ex-girlfriends, classmates and teachers, forming part of the reason the community keeps getting thrown off Reddit.

New ethical concerns have reared their heads, too: one moderator stepped in hard when it became clear that collections of photos of Elle Fanning, who is 19, and Emma Watson, who is 27, probably contained images of them taken when they were younger than 18. It’s not at all clear what the legal and moral ramifications are of using a dataset containing images of a woman as an adult and a child to teach an AI what she looks like in order to paste her face over a pornographic video of another woman – but no one wanted to take their chances.

And FakeApp is not the only consumer technology pushing the envelope. Adobe has software that can alter a recording of a real person as easily as editing a transcript of their remarks. Baidu, known in the West as China’s Google, can go one further, cloning a voice while altering its accent and gender.

It’s grim. But it’s not going to go away. The technology is publicly available, extensively documented, and the subject of research around the globe. This is our world now. As Lucas warned MPs: “Please don’t spend too much time looking in the mirror at what Russia did to us; look through the windscreen at what’s coming down the road. That’s much more dangerous.”