AI robots and self-driving cars might steal the headlines, but the next big leap in technology will be advances in voice services, according to Google’s head of search, Ben Gomes, who says that a better understanding of common language is crucial to the future of the internet.

“Speech recognition and the understanding of language is core to the future of search and information,” said Gomes. “But there are lots of hard problems such as understanding how a reference works, understanding what ‘he’, ‘she’ or ‘it’ refers to in a sentence. It’s not at all a trivial problem to solve in language and that’s just one of the millions of problems to solve in language.”

Gomes was speaking to the Guardian ahead of Google’s 20th anniversary on 24 September, more than seven years after Google launched its first voice service as simple speech-to-text for search.

Now built into Google’s search and its AI voice assistant, which is embedded in billions of smartphones around the globe, voice recognition has become essential in developing countries with low literacy rates.

“It was not obvious to us that what seems like an advanced technology in the west, seems like a basic thing you need in countries like India. So it sort of flips the conversation around,” said Gomes, who was born in Tanzania and raised in Bangalore.

“Many languages in developing nations have never really had common keyboards – I studied Hindi for 10 years, but I wouldn’t know how to type it – so voice is much easier to use than typing.”

Google’s attempts to understand language aren’t new. Having cracked the basics, Google started with spelling correction in 2000, which was slightly more complex than just adding a dictionary, and then moved on to what Gomes refers to as “the softening of words”.

“You can think of initial search engines as finding words with hard boundaries – here’s the exact word you typed, I’ll try and find it in the title of the document.”

“But there’s a language that people use when they know about an area and another when they don’t. In English we generally know this as synonyms, but it’s a particular kind of synonymisation.”

For example, people may search for “how do I change the brightness of a monitor”, using a general word like “change” because they don’t know a more specific word. But those with more knowledge of the area would use “adjust” in both queries and documents. To find the right document for the user, you need to inject the jargon of the specialist area into their query, something that took Google over five years to develop.
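
The idea of injecting specialist vocabulary into a query can be illustrated with a toy sketch. This is not how Google does it: the word mapping and the function names `expand_query` and `matches` are hypothetical, invented here only to show the contrast between matching exact words and matching "softened" words.

```python
# Hypothetical mapping from everyday words to specialist terms.
# A real system would have to learn such pairs; this one is hand-written.
SPECIALIST_TERMS = {
    "change": ["adjust"],
}


def expand_query(query):
    """Return, for each query word, the set of words accepted in its place."""
    expanded = []
    for word in query.lower().split():
        alternatives = {word}
        alternatives.update(SPECIALIST_TERMS.get(word, []))
        expanded.append(alternatives)
    return expanded


def matches(query, document):
    """True if every query word, or one of its alternatives, is in the document."""
    doc_words = set(document.lower().split())
    return all(alternatives & doc_words for alternatives in expand_query(query))


# The document says "adjust", the searcher typed "change".
print(matches("change brightness", "how to adjust the brightness of a monitor"))
```

With hard word boundaries the query word “change” would never find a document that only says “adjust”; with the expansion step, either word satisfies that part of the query.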

Many hard problems stand in the way of computers truly understanding language on a human level, but for Gomes the future lies in “the notion that language will become easier to use for finding information”.

“You’ll be able to ask much more sophisticated queries and in more sophisticated ways. You’ll actually be able to carry on a conversation with Google.”