
Neuroscientists are on the verge of being able to hear silent speech by monitoring brain activity (Image: Ojo Images/Getty)



When you read this sentence to yourself, it’s likely that you hear the words in your head. Now, in what amounts to technological telepathy, others are on the verge of being able to hear your inner dialogue too. By peering inside the brain, it is possible to reconstruct speech from the activity that takes place when we hear someone talking.

Because this brain activity is thought to be similar whether we hear a sentence or think the same sentence, the discovery brings us a step closer to broadcasting our inner thoughts to the world without speaking. The implications are enormous – people made mute through paralysis or locked-in syndrome could regain their voice. It might even be possible to read someone’s mind.

Imagine a musician watching a piano being played with no sound, says Brian Pasley at the University of California, Berkeley. “If a pianist were watching a piano being played on TV with the sound off, they would still be able to work out what the music sounded like because they know what key plays what note,” Pasley says. His team has done something analogous with brain waves, matching neural areas to their corresponding noises.

How the brain converts speech into meaningful information is a bit of a puzzle. The basic idea is that sound activates sensory neurons, which then pass this information to different areas of the brain where various aspects of the sound are extracted and eventually perceived as language. Pasley and colleagues wondered whether they could identify where some of the most vital aspects of speech are extracted by the brain.

The team presented spoken words and sentences to 15 people having surgery for epilepsy or a brain tumour. Electrodes recorded neural activity from the surface of the superior and middle temporal gyri – an area of the brain near the ear that is involved in processing sound. From these recordings, Pasley’s team set about decoding which aspects of speech were related to what kind of brain activity.

Sound is made up of different frequencies which are separated in the brain and processed in different areas. “Simply put, one spot [of neurons] might only care about a frequency range of 1000 hertz and doesn’t care about anything else. Another spot might care about a frequency of 5000 hertz,” says Pasley. “We can look at their activity and identify what frequency they care about. From that we can assume that when that spot’s activity is increasing there was a sound that had that frequency in it.”
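The tuning Pasley describes can be caricatured in a few lines of code. In this toy sketch, everything is an illustrative assumption rather than the study's actual model: two simulated neural "spots" are each assigned a preferred frequency band, and their "activity" is simply the sound energy falling inside that band.

```python
import numpy as np

# Hypothetical sketch of frequency-tuned neural "spots": one cares about
# ~1000 Hz, another about ~5000 Hz. Activity = energy in the preferred band.

fs = 16000                             # sample rate (Hz) - illustrative choice
t = np.arange(0, 0.1, 1 / fs)          # 100 ms of signal
sound = np.sin(2 * np.pi * 1000 * t)   # a pure 1000 Hz tone

spectrum = np.abs(np.fft.rfft(sound))
freqs = np.fft.rfftfreq(len(sound), 1 / fs)

def band_energy(lo, hi):
    """Total spectral energy between lo and hi Hz - a crude stand-in
    for the firing of a neuron tuned to that frequency range."""
    return spectrum[(freqs >= lo) & (freqs < hi)].sum()

spot_1k = band_energy(800, 1200)   # the "1000 hertz" spot
spot_5k = band_energy(4800, 5200)  # the "5000 hertz" spot
print(spot_1k > spot_5k)           # the 1000 Hz spot responds far more strongly
```

Run in reverse, the same logic gives the inference Pasley describes: if the 1000-hertz spot's activity jumps, the sound it heard probably contained energy near 1000 hertz.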

Frequency isn’t the only information you can extract. Other aspects of speech, such as the rhythm of syllables and fluctuations in frequency, are also important for understanding language, says Pasley.

“The area of the brain that they are recording from is a pathway somewhere between the area that processes sound and the area that allows you to interpret it and formulate a response,” says Jennifer Bizley, an auditory researcher at the University of Oxford. “The features they can get out of this area are the ones that are really important to understanding speech.”

Pasley’s team were able to correlate many of these aspects of speech to the neural activity happening at the same time. They then trained an algorithm to interpret the neural activity and create a spectrogram from it (see diagram). This is a graphical representation of sound that plots how much of what frequency is occurring over a period of time. They tested the algorithm by comparing spectrograms reconstructed solely from neural activity with a spectrogram created from the original sound.
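One way to picture this decoding step is as a regression from electrode activity to spectrogram frames. The sketch below is a hypothetical stand-in, not the team's algorithm: it uses synthetic data, a made-up linear "encoding" from spectrogram to electrodes, and plain least squares to learn the inverse map, then compares reconstructed and original spectrograms the way the team's test did.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (illustrative, not from the study): 64 recording sites,
# 32 spectrogram frequency bins, 500 time frames.
n_sites, n_bins, n_t = 64, 32, 500

# Pretend each site linearly mixes the spectrogram bins, plus a little noise.
true_spec = rng.random((n_t, n_bins))          # spectrogram of the heard sound
mixing = rng.normal(size=(n_bins, n_sites))    # unknown neural encoding
neural = true_spec @ mixing + 0.01 * rng.normal(size=(n_t, n_sites))

# Train a decoder on the first 400 frames, then test it on unseen frames.
W, *_ = np.linalg.lstsq(neural[:400], true_spec[:400], rcond=None)
recon = neural[400:] @ W                       # spectrogram from brain activity alone

# Score the reconstruction against the spectrogram of the original sound.
corr = np.corrcoef(recon.ravel(), true_spec[400:].ravel())[0, 1]
print(corr > 0.9)
```

The real problem is far harder: neural responses are noisy and nonlinear, which is why the team's decoder had to be trained per person and why the reconstructions are recognisable rather than perfect.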

They also used a second program to convert the reconstructed spectrogram into audible speech. “People listening to the audio replays may be able to pull out coarse similarities between the real word and the constructed words,” says Pasley. When New Scientist listened to the words, we could just about make out “Waldo” and “structure”. However, the fidelity was sufficient for the team to identify individual words using computer analysis.
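The spectrogram-to-sound step can be sketched crudely by summing sinusoids at each frame's frequencies, weighted by the frame's magnitudes. This is an assumed, minimal method, not the team's inversion program (a magnitude spectrogram discards phase, and recovering natural-sounding speech takes more care than this):

```python
import numpy as np

fs = 8000          # sample rate (Hz) - illustrative choice
frame_len = 256    # samples per spectrogram frame
freqs = np.fft.rfftfreq(frame_len, 1 / fs)

def resynthesize(spec):
    """spec: (n_frames, n_bins) magnitude spectrogram -> 1-D audio signal.
    Each frame becomes a sum of sinusoids weighted by its bin magnitudes."""
    t = np.arange(frame_len) / fs
    audio = []
    for frame in spec:
        frame_audio = np.zeros(frame_len)
        for mag, f in zip(frame, freqs):
            if mag > 0:
                frame_audio += mag * np.sin(2 * np.pi * f * t)
        audio.append(frame_audio)
    return np.concatenate(audio)

# A toy "reconstructed" spectrogram: 10 frames of energy at the 1000 Hz bin.
spec = np.zeros((10, len(freqs)))
spec[:, np.argmin(np.abs(freqs - 1000))] = 1.0
audio = resynthesize(spec)
print(audio.shape)   # (2560,)
```

Even this crude resynthesis conveys the point in the article: the better the reconstructed spectrogram, the more recognisable the replayed word.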

Crucial to future applications of this research is evidence that thinking of words promotes activity in the brain that resembles hearing those words spoken aloud.

“We know that for much of our sensory processing, mental imagery activates very similar networks,” says Steven Laureys at the University of Liège, Belgium. We need to be able to show that just thinking about the words is enough, which would be useful in a medical setting, especially for locked-in patients, he says.

“It’s something we’d like to pursue,” says Pasley. His is not the only team hoping to produce sound from thoughts. Frank Guenther at Boston University, Massachusetts, has interpreted brain signals that control the shape of the mouth, lips and larynx during speech, working out what shape a person is attempting to form with their vocal tract and hence what speech they are trying to produce. His team has tried out its software on Erik Ramsey, who is paralysed and has had an electrode implanted in his speech-motor cortex. At present the software is good enough to produce a few vowel sounds but not more complex speech.

Laureys has also been working on ways to distinguish brain activity corresponding with “yes” and “no” answers in people who cannot speak. Other neuroscientists have been devising similar ways to communicate with patients in vegetative states, by monitoring brain activity using fMRI.

“Of course, it would be much better if one could decode their answers, their words and thoughts,” he says.

Auditory information is processed in a similar way in all of us, so in that sense the new model can be applied to everyone, says Pasley, but the settings need to be tuned to the individual because of anatomical differences between brains. He says the training process is short and simply involves listening to sounds. Once trained, the model can be used to predict speech – even words it has not heard before.

Pasley says the technology is available to turn this idea into a reality. “The implants transmit the recorded signals to a decoder that converts the signals into movement commands, or in our case, speech.” He wants to develop safe, wireless, implantable interfaces for long-term use.

No one will be reading our thoughts any time soon, says Jan Schnupp, also at Oxford, as only a small number of people having essential brain surgery will have these devices implanted.

But for those in need of a voice the work is a positive step. “It adds to the fascinating literature of decoding thoughts, which is getting more and more precise,” says Laureys. “We have reason to be optimistic.”

We don’t know if speech from thoughts is possible yet, says Pasley – but he’s upbeat. “It’s certainly the hope.”

Journal reference: PLoS Biology, DOI: 10.1371/journal.pbio.1001251