Facebook has pledged to support research into AI mind-reading technology in hopes that it could give a voice to those who have lost the ability to speak.

Scientists from UC San Francisco recently showed that brain activity recorded while research participants spoke could be used to create remarkably realistic synthetic versions of that speech. However, it took the researchers weeks or months to translate brain activity into speech – a far cry from the instant results that would be needed for such a technology to be clinically useful.

Now, in a complementary new study, again working with volunteer study subjects, the scientists have for the first time decoded spoken words and phrases in real time from the brain signals that control speech, aided by a novel approach that involves identifying the context in which participants were speaking.

Speech neuroscientist Eddie Chang said: “For years, my lab was mainly interested in fundamental questions about how brain circuits interpret and produce speech.

“With the advances we’ve seen in the field over the past decade it became clear that we might be able to leverage these discoveries to help patients with speech loss, which is one of the most devastating consequences of neurological damage.”

The new study was funded through a multi-institution sponsored academic research agreement with Facebook Reality Labs (FRL), a research division within Facebook focused on developing augmented- and virtual-reality technologies. As FRL has described, the goal of its collaboration with the Chang lab, called Project Steno, is to assess the feasibility of developing a non-invasive, wearable BCI device that could allow people to type by imagining themselves talking.

Patients who experience facial paralysis due to brainstem stroke, spinal cord injury, neurodegenerative disease, or other conditions may partially or completely lose their ability to speak. But the brain regions that usually control the muscles of the jaw, lips, tongue, and larynx to produce speech are often intact and remain active in these patients, suggesting it could be possible to use these intentional speech signals to decode what patients are trying to say.

Chang explained: “Currently, patients with speech loss due to paralysis are limited to spelling words out very slowly using residual eye movements or muscle twitches to control a computer interface.

“But in many cases, information needed to produce fluent speech is still there in their brains. We just need the technology to allow them to express it.”

As a stepping stone towards such a technology, Chang’s lab has spent years studying the brain activity that controls speech with the help of volunteer research participants at the UCSF Epilepsy Center.

These patients – all of whom have normal speech – have had a small patch of tiny recording electrodes temporarily placed on the surface of their brains for a week or more to map the origins of their seizures in preparation for neurosurgery. This involves a technique called electrocorticography (ECoG), which provides much richer and more detailed data about brain activity than is possible with non-invasive technologies. While they are in the hospital, some of these patients agree to let Chang’s group use the already-implanted ECoG electrodes as part of research experiments not directly related to their illness.

Researchers from the Chang lab, led by postdoctoral researcher David Moses, worked with three such research volunteers to develop a way to instantly identify the volunteers’ spoken responses to a set of standard questions based solely on their brain activity, representing a first for the field.

To achieve this result, Moses and colleagues developed a set of machine learning algorithms equipped with refined phonological speech models, which were capable of learning to decode specific speech sounds from participants’ brain activity. Brain data was recorded while volunteers listened to a set of nine simple questions (e.g. “How is your room currently?”, “From 0 to 10, how comfortable are you?”, or “When do you want me to check back on you?”) and responded out loud with one of 24 answer choices.

After some training, the machine learning algorithms learned to detect when participants were hearing a new question or beginning to respond and to identify which of the two dozen standard responses the participant was giving with up to 61% accuracy as soon as they had finished speaking.
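To make the shape of such a real-time pipeline more concrete, here is a minimal sketch of how a system like this might be structured. The electrode count, the nearest-template classifier, and all function names are illustrative assumptions; the study itself relied on refined phonological speech models rather than this simplified stand-in.

```python
import numpy as np

# Hypothetical illustration: decide whether a window of neural features
# contains speech, then classify which of the 24 canned answers was spoken.
# A nearest-template classifier stands in for the study's phonological models.

N_ELECTRODES = 128      # assumed electrode count, for illustration only
N_ANSWERS = 24          # fixed answer vocabulary described in the study

def detect_speech(window: np.ndarray, threshold: float) -> bool:
    """Crude speech-event detector: flag windows whose mean feature
    value exceeds a threshold (stand-in for a dedicated detection model)."""
    return float(window.mean()) > threshold

def classify_answer(window: np.ndarray, templates: np.ndarray) -> np.ndarray:
    """Return a probability distribution over the 24 answers by comparing
    the averaged feature window against per-answer templates learned from
    training data (softmax over negative distances)."""
    dists = np.linalg.norm(templates - window.mean(axis=0), axis=1)
    scores = -dists
    scores -= scores.max()
    probs = np.exp(scores)
    return probs / probs.sum()

# Example usage with random data standing in for recorded neural features.
rng = np.random.default_rng(0)
templates = rng.normal(size=(N_ANSWERS, N_ELECTRODES))  # fit during training
window = rng.normal(size=(50, N_ELECTRODES))             # 50 time samples
if detect_speech(window, threshold=-1.0):
    answer_probs = classify_answer(window, templates)
    print("most likely answer index:", int(answer_probs.argmax()))
```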

Moses said: “Real-time processing of brain activity has been used to decode simple speech sounds, but this is the first time this approach has been used to identify spoken words and phrases.

“It’s important to keep in mind that we achieved this using a very limited vocabulary, but in future studies, we hope to increase the flexibility as well as the accuracy of what we can translate from brain activity.”

One of the study’s key findings is that incorporating the context in which participants were speaking dramatically improved the algorithm’s speed and accuracy. Using volunteers’ brain activity to first identify which of the predefined questions they had heard – which the algorithm did with up to 75% accuracy – made it possible to significantly narrow down the range of likely answers since each answer was only an appropriate response to certain questions.
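Conceptually, this context step can be thought of as a re-weighting: the decoded question supplies a prior over which answers are plausible, and that prior is combined with the answer probabilities coming from the neural decoder. The sketch below illustrates the idea; the array shapes and the simple probabilistic product are assumptions made for illustration, not the study’s exact formulation.

```python
import numpy as np

def apply_context(answer_likelihoods: np.ndarray,
                  question_probs: np.ndarray,
                  answer_given_question: np.ndarray) -> np.ndarray:
    """Re-weight answer scores using the decoded question as context.

    answer_likelihoods:    (24,)   answer probabilities from the neural decoder
    question_probs:        (9,)    decoded probability of each question
    answer_given_question: (9, 24) which answers fit which question
    """
    # Marginalise over the decoded question to get a prior over answers,
    # then fold that prior into the decoder's answer probabilities.
    context_prior = question_probs @ answer_given_question   # shape (24,)
    posterior = answer_likelihoods * context_prior
    return posterior / posterior.sum()

# Toy example: answers 0-2 only fit question 0, and the question decoder
# is fairly confident question 0 was asked, so those answers are boosted.
likelihoods = np.full(24, 1 / 24)
q_probs = np.zeros(9)
q_probs[0] = 0.75
q_probs[1:] = 0.25 / 8
compat = np.zeros((9, 24))
compat[0, :3] = 1 / 3
compat[1:, 3:] = 1 / 21
print("answer chosen with context:", int(apply_context(likelihoods, q_probs, compat).argmax()))
```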

Chang said: “Most previous approaches have focused on decoding speech alone, but here we show the value of decoding both sides of a conversation – both the questions someone hears and what they say in response.

“This reinforces our intuition that speech is not something that occurs in a vacuum and that any attempt to decode what patients with speech impairments are trying to say will be improved by taking into account the full context in which they are trying to communicate.”

The researchers emphasise that it is still unclear whether the approaches the lab is currently using to decode the brain activity of research subjects with intact speech – by training a computer on examples of their voices – can also work for people who are unable to speak. Instead, study participants may have to learn to use an implanted speech prosthesis over time based on ongoing feedback from their initial performance – a process that will make the kind of real-time speech decoding demonstrated by Moses’ new study even more critical.
