On April 1, 2009, Google unveiled Gmail Autopilot, a plug-in that promised to read and generate contextually relevant replies to the messages piling up in users’ inboxes. “As more and more everyday communication takes place over email, lots of people have complained about how hard it is to read and respond to every message,” the product page explained. “This is because they actually read and respond to all their messages.” For those who hadn’t registered the date, the terms-and-conditions page spelled out the joke: “No, we don’t plan to scan every one of your incoming messages and automatically send the perfect reply.”

On November 5, 2015, Google unveiled Smart Reply, a plug-in that reads and suggests responses to e-mails. This time the innovation actually exists, as part of the company’s Inbox app for Android and iOS. If Smart Reply thinks that it understands a message that requires an answer, it will suggest three options, alongside a cheerful invitation to “start composing your reply with one tap.” When I wrote to Matt Jones, the design director at Google Research, to say that it had been great to see him the other night and to ask whether he could put me in touch with the team behind Smart Reply, he sent back a screenshot of the options that the app had given him: “It was great seeing you too!” “It was fun!” “Will do!” He connected me with Alex Gawley, the product manager for Gmail, who said that I shouldn’t be offended. (“I mean, I’m sure it was great to see you.”)

The April Fool’s joke of 2009 began to be taken seriously in early 2015, Gawley told me, thanks to two developments. The Google Research team, which had recently acquired DeepMind, the company behind an arcade-game-winning form of artificial intelligence, was making rapid progress in language-related areas of machine learning, including translation and speech analysis. At the same time, Americans were reading more and more of their e-mails on mobile devices—fifty-three per cent of them, in fact, up from eight per cent in 2011. “It’s a little screen and a little keyboard, which means e-mail is easy to read and really hard to reply to,” Gawley said. He added that the combination of fat thumbs and autocorrect is “a real pain point for our users.”

Smart Reply uses what is known as an artificial neural network—an intimidating term for a particular kind of mathematical model—to tease out the patterns and probabilities that underlie e-mail communications. For privacy reasons, humans are not allowed to read Google’s vast corpus of e-mail messages. Machines, however, are, and by drawing on that data they can gradually sort sentences into “thought vectors,” or coördinates in linguistic space. In other words, by plotting similarities in context, word frequency, and sentence structure, the neural network can teach itself to recognize and group together the endless variety of ways that humans have developed to say much the same thing: “How does this afternoon look for a call?” “Can we talk later today?” Or “Does this P.M. work for a quick chat?” By trawling through the data again, the machine can then find and suggest the most typical responses to this particular thought vector: “Sure, what time were you thinking?” “Sure, anytime.” Or “Sure, what’s up?”

Earlier this year, Google researchers used the network to develop an intelligent chatbot with which they could discuss the purpose of life. (“To serve the greater good,” according to the machine.) Still, when the company’s engineers applied their neural net to the problem of e-mail, it did not work perfectly right away. “Most machine-learning work is actually about tuning,” Gawley said. The A.I. has a tendency to suggest replies that say the same thing in slightly different ways, which is less useful than offering users responses that represent a range of different likely replies. (For instance, “No, sorry, I’m busy” would have been a more useful alternative than “Sure, anytime” in the example above.) The team has corrected for this, to some extent, by adding a parameter that encourages the machine to choose disparate responses—ones that have sufficient distance between them when plotted as vectors in semantic space.

The early iterations of Smart Reply were overly affectionate. “I love you” was the machine’s most common suggested response. This was a touch awkward: because the model has no knowledge of the relationship between an e-mail’s sender and its receiver, it provides the same suggested responses whether you are corresponding with your boss or a long-lost sibling. “The team was really puzzled about this,” Gawley said. “It turns out that our internal testers are very affectionate and that ‘I love you’ is a very common thing for people at Google to say.” When the engineers inspected their model, they discovered that whenever an e-mail did not give a particularly strong signal as to the appropriate response, the machine hedged its bets with a declaration of affection. This fact may yet become the subject of philosophical inquiry—a scan of every Gmail message ever sent taught the machine that the proper response to ambiguity is an outpouring of love?—but, for Google, it was a bug rather than a feature. The solution: instruct the model to calculate the likelihood that, given an empty e-mail message, it would also suggest “I love you” as a response, and use that as a filter.

There was a user-experience facet to the issue, too. During testing, Gawley’s team found that if humans did actually want to write “I love you,” they preferred to type it themselves. This reluctance to allow a machine to speak on emotional or otherwise intimate matters also informed the length of Smart Reply’s suggestions. Although the model is capable of generating reasonably well-formed sentences of twelve to thirteen words, it is programmed to offer responses that are only five or six words long. “The longer the answer got, the more the end user felt that it was important that it was in their tone,” Gawley said. It’s fine to sound generic for six words, but by ten it gets uncomfortable.

This boundary between creepiness and helpfulness is one that Gawley and his colleagues will continue to probe. For now, they are working to make Smart Reply available in other languages, but he expects that they will begin adding another level of personalization before long. “I call it the exclamation-point model,” Gawley said. Some users sprinkle them throughout their communications—“My mum will put four in a row,” he said—while others deploy them more sparingly. The result is a simple, yet characteristic, pattern that a neural network could easily learn to mimic.

Because Smart Reply can suggest only short responses, it is selective in offering its assistance. Of the twenty e-mails I received in the first hour after the feature was activated in my inbox, twelve were deemed suitable for the Smart Reply treatment. Some suggestions were amusing but unhelpful: in response to an e-mail asking how much of a particular chemical I wished to order, Smart Reply thought I might want to say “I have no idea,” “I am not sure,” or “I was thinking the same thing.” Most of the choices, though, were entirely usable—and yet I struggled to use them. The experience triggered an uncomfortable tension in me. On the one hand, I was unwilling to accept how much of my correspondence could be managed with the same inane responses (“Sounds great!” was a recurring suggestion), and I wished to resist it. On the other hand, I suddenly realized how easy it would be to achieve the elusive, long-dreamed-of goal of inbox zero. If only I could bring myself to cut out a few pleasantries and personal touches!!!!