This is disturbing: Google’s contractors who transcribe audio clips collected by its AI-based Google Assistant can listen in on sensitive information about users, including names, addresses, and details about their personal lives, Belgian outlet VRT News reports. Following the report, Google conceded that it “partner[s] with language experts around the world” and that 0.2% of all audio snippets collected are reviewed. It also said that it reviews the collected audio “whether you’re speaking English or Hindi”. This indicates that Indian users of Google Assistant might also have their audio reviewed by Google’s transcribers. Google has been promoting its assistant in India via hoardings and even television advertisements.

MediaNama has reached out to Google with the following questions:

How many clips are generated every day (in India)? What is the percentage of those clips that are heard by language experts? Is this program going to stop or will Google continue to let transcribers listen to audio clips? Why doesn’t Google Home’s privacy policy page state anywhere that the collected audio might be heard by someone? Do you have a team of such “language experts” who transcribe the audio of Indian users?

0.2% of all audio clips is not a small number

Google said that its transcribers review around 0.2 percent of all audio snippets collected by Google Assistant. Here’s the problem with that: In India, Google Home’s market share is around 39% as of now. Android, which carries Google Assistant on phones, has the largest share of the phone market in India. Even if, for example, 1 million conversations are recorded each day, which is a conservative assumption, it would mean that transcribers have access to 2,000 audio snippets each day, and these are the numbers for India alone. Imagine the numbers for markets like the US and Europe where more people use Google Assistant each day. The VRT News report claimed that Google has thousands of transcribers spread across the world.
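The estimate above can be checked with a few lines of Python. This is only a rough sketch: the 0.2% review rate is Google's stated figure, while the daily volumes are purely hypothetical inputs (Google has not disclosed how many snippets are recorded in India), and the function name `reviewed_per_day` is made up for illustration.

```python
from fractions import Fraction

# Google's stated review rate: 0.2% of all audio snippets.
REVIEW_RATE = Fraction(2, 1000)


def reviewed_per_day(daily_snippets: int, review_rate: Fraction = REVIEW_RATE) -> int:
    """Return how many snippets would be reviewed at the given rate."""
    return int(daily_snippets * review_rate)


# Hypothetical daily volumes; none of these are reported figures.
for daily in (1_000_000, 10_000_000, 100_000_000):
    print(f"{daily:,} recorded per day: {reviewed_per_day(daily):,} reviewed per day")
```

Exact fractions are used here instead of floating-point numbers so that the percentage does not pick up rounding error at large volumes.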

Google Home’s privacy policy page doesn’t mention human review of audio

Google Home’s privacy policy page nowhere explicitly states that the company uses human reviewers to listen to audio clips. This is a clear case of lack of disclosure to users, and it falls outside the realm of what users have consented to: this is a violation of user privacy. The report said that of the thousands of clips that were accessed, 153 should never have been recorded since the command ‘Okay Google’ was not given “clearly”. This means that a lot of these clips, which included bedroom conversations, conversations between parents and their children, and professional phone calls containing lots of private information, were recorded without the user’s consent. Google Home’s privacy policy page doesn’t state anywhere that the device might record audio even when ‘Okay Google’ hasn’t been said clearly.

VRT also reported that Google’s claims of anonymising audio clips before review aren’t entirely true. A transcriber told the outlet that it often isn’t clear to them what is being said, and transcribers then have to look up every word, address, personal name or company name on Google or on Facebook. This way, the identity of the person in the audio clip is discoverable without much effort.

Spooky AI assistants

This isn’t the first story that shows how our interactions with virtual assistants may not be as private and secure as we might believe. In April this year, we reported that thousands of Amazon employees around the world listen to users’ voice recordings captured on Alexa-powered Echo speakers. Amazon workers listen to the audio clips, which they then transcribe, annotate and feed back into the software to improve Alexa’s voice recognition ability and to help it understand commands better. The teams are based out of Amazon offices in Boston, Costa Rica, India and Romania, with each reviewer working 9 hours a day and handling 1,000 audio clips per shift.