Speaking machines have been scaring us for centuries. Pictured: a brazen head from the early modern period.

The star of Google’s 2018 I/O developer conference was a new technology called Duplex. In the demo, a computer can be heard calling a hair salon to make an appointment and calling a restaurant to make a reservation. While technologically very impressive, the feeling we’re left with after listening to the calls isn’t awe at the technical marvel but creepiness and fear. It’s strange to have such a visceral response to a computer making a phone call, so it’s worth examining why.

Speech synthesis, or text-to-speech, is nothing new; people have been working on it since at least the 18th century. Similarly, natural language processing (NLP) has been a mainstay of AI research since the birth of the field in the 1950s. Both fields have flourished in recent years, taking advantage of vast hoards of data and ever-improving computing power to train artificial neural network models of increasing complexity. These advances are perhaps most apparent in the latest batch of home automation and voice assistant devices. With sometimes astonishing accuracy, these devices can answer queries and perform (still simple) tasks for us.

But Google’s Duplex demo was different. It wasn’t that it successfully navigated unscripted phone calls with unsuspecting people; that alone would have been impressive enough. After all, passing the Turing test has been the grand ambition of AI research since its inception! It wasn’t even the unsuspecting part, although that does raise important ethical considerations we will return to shortly. (Amid the media attention after the demo, Google acknowledged this and promised to always identify the technology at the beginning of a call.) The creepy part is the uptalk, the “Mm-hmm”, the “uh” in “Do you have anything between 10 am and — uh — 12 pm?”, the “um”s and the “gotcha”.

It’s these little “human” flourishes that are so deeply unnerving. That we now have to put the word human in quotes points to the heart of the issue. This little demo challenges our identity in a new and troubling way. We think we know what it means to be human. When someone uptalks or says a pregnant “ummmm”, we think we know what that means. A nervous laugh means something different than a boisterous one. We reveal (and conceal) ourselves in all these micro-interactions and we intuitively read them and respond accordingly.

That a computer can now convincingly portray any one of these emotional quirks breeds a new kind of distrust. Just as face swapping challenges our ability to believe our eyes, this challenges our ears. The recent massive campaigns to interfere with elections worldwide show us where this technology is going. We are building agents of propaganda. Propaganda has never been difficult: promise utopia, blame an out-group, tap into people’s fear and rage. What’s new is algorithmic propaganda, and we won’t even know whether we are being (mis)led by humans or machines.

An unfeeling machine simulating human emotion has a well-understood counterpart in human society: psychopathy. Humans incapable of empathy, without any moral compass, may nevertheless convincingly fake these emotions in order to manipulate others and get what they want. Psychopathic machines are also nothing new. Killer AI is a sci-fi trope; from the Terminator to HAL 9000, it’s our most predictable technological fear. Interestingly, those killing machines were usually devoid of any friendly personality. From Arnold’s Teutonic roboticism to HAL’s flat monotone, their speech betrayed their emotional indifference. Imagine now that it didn’t. That is what Google demoed.

Predictably, over the years Google has dropped their founding motto of “Don’t be evil”. That doesn’t mean they’re evil in any way, but it does indicate a significant shift in management perspective, perhaps an acknowledgement of the realities of running an advertising-driven business. Over the years they have gone from an indexer of web pages to an indexer of us. They run on algorithms that thrive on data and are voraciously finding ways to study us more closely. Companies are fiercely fighting for access to the minutiae of our lives. This data powers statistical models that then get fed back to us to influence our behavior. Whether it’s to make us buy something or to tilt an election, as a species we have shown an interest in using this technology for selfish purposes. Its being used for real evil isn’t inevitable, but we don’t have an unblemished track record of being good.

Around a dozen employees have quit Google and nearly 4,000 signed a petition demanding the company end its participation in a project with the military. Project Maven is intended to help the military analyze drone footage faster. The backlash ended up being so great that Google announced they would discontinue their work on the project after the current contract expires. But whether it’s Google or someone else, the work will continue. As Lt. Gen. John N.T. “Jack” Shanahan, who helped spearhead Project Maven, said: “nothing in DoD should ever be fielded going forward without a built-in AI capability.” This is classic Terminator fear: hooking up AI directly to the weapons. The backlash shows we at least know to be afraid of that.

Recently, a video made by X (formerly Google X) leaked. X is the moonshot subsidiary of Alphabet tasked with inventing radical, transformational technologies. The video, made in 2016, is a thought experiment pondering the nature of all this data and considering its uses. It uses the analogy of “Lamarckian epigenetics” to compare our user data to our genetic code, and the notion of the selfish gene to reduce us to mere carriers of our data, expressing its will. It describes a ledger of our user data that we feed and that, in return, is used to influence our behavior. In the video, the data is first used to help us achieve our self-selected goals, then the ideals of Google, and is ultimately scaled up to the level of sculpting whole societies or even humanity. Is that evil? No one is saying this is what Google is doing, but it shows that when talented people stare deeply into this creation, they see a machine with the ability to eat us whole.

We also feel a sense of inevitability about this progress. Of course the machines will get smarter and sound more human. Of course corporate, national and other interests will try to control their use. This is just the nature of technological progress. We will adapt to election meddling and adapt to thought control. Maybe the corporations controlling our thoughts will decide dealing with climate change is in their best interests, so we finally will too. Maybe this is the level of coordination needed to solve the immense and seemingly intractable problems we face. Certainly there have been and continue to be governments that feel this way.

Maybe that’s what it’s come to: a fight for our independence. Our ability to think and behave individually, to determine for ourselves what our wishes are and direct ourselves in those pursuits. The machines aren’t going away, and neither is their ability to coax and coerce us into behaving as they or their masters wish. So we’re going to have to figure out how to turn the tables on them: to take all this awesome power and distribute it fairly, to use it as a tool to advance ourselves and our society while avoiding its most pernicious abuses. In other words, it is a double-edged sword, just like all our other technologies.

The Google Duplex demo gave us a glimpse of a new kind of dystopia: one where the machines are warm and friendly and seem to have our best interests at heart, all the while strategizing how to deploy psychological tricks to control us. So we’ve got some learning to do. Presumably, over time we will come to trust our eyes and ears a little less. We will marvel at all the wonder our devices present while also noticing a slight inability to invest in them. None of this is new; all of it has been going on for quite some time. I guess it just takes the creep factor of hearing a computer fool an innocent person with an “Mm-hmm” and an “uh” to make us take stock of where we are and where all this could be going.