Google has revealed a new feature it's bringing to the Google Translate app, which lets it transcribe audio from one language into another in near real time.

It's not quite live translation-transcription functionality but, according to The Verge, it will allow users to record speech on a smartphone in one language and have it transcribed into another language almost immediately.

Google showed off the translation feat at an event that ZDNet sister site CNET attended yesterday, where Google demonstrated a range of artificial-intelligence projects.

Bryan Lin, an engineer on the Translate team, said the audio transcription feature will be available in the coming months.

The upcoming Translate transcription feature has been compared to the Google Recorder app on the Pixel 4 and later. Google's demo showed English audio being translated into Spanish text in "close to real time", according to 9to5Google.

The company also has its Android-only Live Transcribe, which is aimed at hearing-impaired people and lets them see transcriptions of nearby speech on a smartphone screen, albeit in one language, with written translations produced after the original language has been transcribed.

But like Live Transcribe, transcription in the Google Translate app won't happen on the device, so users will need an internet connection and Google's cloud to process the audio and deliver the transcribed text. That differs from Google Recorder, which can transcribe offline.

Presumably, at some point Google will use its AI to perform live transcriptions from one language to another on the device.

But live transcription of audio from one language to another sounds like a challenging task, even for today's powerful smartphones. Per The Verge, Google says Translate's transcription constantly evaluates whole sentences as the audio proceeds. Then it tries to add punctuation, adjust translations based on context, and adapt to accents and dialects.
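To illustrate the idea of re-evaluating a whole sentence as audio arrives, here is a minimal Python sketch. It is not Google's implementation: the tiny phrase and word tables stand in for a cloud translation model, and every name in it (`toy_translate`, `stream_translate`, `finalize`, `transcribe`) is hypothetical. It only shows why an earlier draft can change once more context is heard, and how punctuation might be added at the end.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical toy tables standing in for a real translation model.
PHRASES = {("good", "morning"): "buenos días"}
WORDS = {"good": "bueno", "morning": "mañana", "world": "mundo"}


def toy_translate(sentence: str) -> str:
    """Translate a partial sentence, preferring two-word phrases over single words."""
    words = sentence.lower().split()
    out: List[str] = []
    i = 0
    while i < len(words):
        pair = tuple(words[i:i + 2])
        if pair in PHRASES:
            # Context changes the result: the pair is translated as a unit.
            out.append(PHRASES[pair])
            i += 2
        else:
            # Unknown words pass through unchanged.
            out.append(WORDS.get(words[i], words[i]))
            i += 1
    return " ".join(out)


def stream_translate(words: Iterable[str]) -> Iterator[str]:
    """Yield a fresh draft translation each time a new word is heard."""
    heard: List[str] = []
    for word in words:
        heard.append(word)
        # Re-translate everything heard so far, not just the newest word.
        yield toy_translate(" ".join(heard))


def finalize(text: str) -> str:
    """Add simple punctuation: capitalize the first letter and end with a period."""
    return text[:1].upper() + text[1:] + "."


def transcribe(words: Iterable[str]) -> Tuple[List[str], str]:
    """Return every intermediate draft plus the final punctuated sentence."""
    drafts = list(stream_translate(words))
    final = finalize(drafts[-1]) if drafts else ""
    return drafts, final


if __name__ == "__main__":
    drafts, final = transcribe(["good", "morning", "world"])
    for draft in drafts:
        print("draft:", draft)
    print("final:", final)
```

In this sketch the first draft for "good" is "bueno", but once "morning" arrives the whole sentence is re-evaluated and the draft becomes "buenos días", which is the kind of context-based adjustment the article describes.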

Google expects its underlying AI models to improve significantly over time.