Scientists from UC San Francisco and UC Berkeley have unveiled a brain-computer interface (BCI) that has enabled a woman with severe paralysis from a brainstem stroke to communicate through a digital avatar. The technology is the first to synthesize speech and facial expressions directly from brain signals. The system can also translate these neural signals into text at nearly 80 words per minute, a substantial improvement over existing commercial alternatives.
Dr. Edward Chang, chair of neurological surgery at UCSF, has worked on brain-computer interfaces for more than a decade. He is optimistic that this latest breakthrough, published on August 23, 2023, in the journal Nature, will pave the way for an FDA-approved communication system driven by brain signals in the foreseeable future.
“Our mission is to restore a comprehensive and embodied means of communication, which mirrors the most natural way humans interact with one another,” remarked Dr. Chang, a member of the UCSF Weill Institute for Neurosciences and the Jeanne Robertson Distinguished Professor in Psychiatry. He added, “These advancements bring us significantly closer to materializing this solution for patients in need.”
Dr. Chang’s team had previously demonstrated that brain signals could be translated into text in a man who had also suffered a brainstem stroke years earlier. The current study is more ambitious: decoding brain signals into the richness of spoken speech, along with the nuanced facial movements that accompany conversation.
The approach involved implanting a thin, rectangular array of 253 electrodes onto the surface of the woman’s brain. These electrodes intercepted the signals that, had the stroke not intervened, would have controlled the muscles of her tongue, jaw, larynx, and face. A cable, plugged into a port affixed to her head, connected the electrodes to a computer setup.
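To make the signal-acquisition step concrete, here is a minimal Python sketch of how a multichannel cortical recording might be windowed into feature frames for a downstream decoder. The channel count (253) comes from the article; the sampling rate, window length, and function names are illustrative assumptions, not the study’s actual pipeline.

```python
# Minimal sketch (not the study's actual pipeline): windowing a
# 253-channel cortical recording into feature frames for a decoder.
import numpy as np

N_CHANNELS = 253      # electrodes in the implanted array (from the article)
SAMPLE_RATE = 1000    # Hz; assumed for illustration
WINDOW_MS = 100       # assumed analysis window

def frame_features(signal: np.ndarray) -> np.ndarray:
    """Split a (channels, samples) recording into non-overlapping windows
    and return per-window, per-channel mean power as a toy feature."""
    win = SAMPLE_RATE * WINDOW_MS // 1000
    n_frames = signal.shape[1] // win
    frames = signal[:, : n_frames * win].reshape(N_CHANNELS, n_frames, win)
    return (frames ** 2).mean(axis=2).T   # shape: (n_frames, n_channels)

# Example: two seconds of synthetic data standing in for real recordings.
rng = np.random.default_rng(0)
ecog = rng.standard_normal((N_CHANNELS, 2 * SAMPLE_RATE))
features = frame_features(ecog)
print(features.shape)   # (20, 253)
```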
Over several weeks, the participant worked with the research team to train the system’s artificial intelligence algorithms. This training entailed repeating phrases from a 1,024-word vocabulary until the system learned the distinctive brain-signal patterns associated with each sound.
Rather than training the AI to recognize complete words, the researchers built a system that decodes words from phonemes, the elemental units of speech that play the same role letters do in written language. With this approach, the AI needed to learn only 39 phonemes to decode any English word, which improved accuracy and significantly boosted processing speed.
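The following sketch illustrates the phoneme idea in simplified form: a decoder emits a phoneme sequence, which is then mapped to words through a pronunciation lexicon. The lexicon entries and the greedy lookup are toy stand-ins, not the study’s actual decoder.

```python
# Illustrative sketch: mapping a decoded phoneme sequence to words via a
# tiny pronunciation lexicon. ARPAbet-style symbols are assumed here.
LEXICON = {
    ("HH", "AW"): "how",
    ("AA", "R"): "are",
    ("Y", "UW"): "you",
}

def phonemes_to_words(phoneme_seq):
    """Greedy longest-match lookup from phonemes to words."""
    words, i = [], 0
    while i < len(phoneme_seq):
        for j in range(len(phoneme_seq), i, -1):
            chunk = tuple(phoneme_seq[i:j])
            if chunk in LEXICON:
                words.append(LEXICON[chunk])
                i = j
                break
        else:
            i += 1  # skip an unrecognized phoneme
    return " ".join(words)

print(phonemes_to_words(["HH", "AW", "AA", "R", "Y", "UW"]))
# -> "how are you"
```

The design payoff described in the article is visible even in this toy version: the decoder’s output alphabet stays small and fixed (39 phonemes) no matter how large the vocabulary grows, since new words only add lexicon entries, not new classes to recognize.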
Sean Metzger and Alex Silva, graduate students in the joint UC Berkeley–UCSF Bioengineering Program, developed the text decoder. Metzger commented, “Accuracy, speed, and vocabulary are paramount. These factors grant users the potential to communicate nearly as swiftly and naturally as typical conversations.”
To recreate the user’s voice, the team devised a speech-synthesis algorithm and personalized it to sound like her voice before the injury, using a recording of her speaking at her wedding.
To animate the digital avatar, the researchers employed software from Speech Graphics, an AI-driven facial animation company, to simulate facial muscle movements. Custom machine-learning techniques meshed the software with the signals from the woman’s brain as she attempted to speak, translating her brain’s instructions into movements on the avatar’s face: jaw movements, lip gestures, tongue motions, and expressions such as happiness, sadness, and surprise.
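As a hedged sketch of that mapping step, the snippet below projects a decoded feature vector onto a handful of facial-animation controls. The linear map, feature size, and control names are illustrative assumptions; they are not Speech Graphics’ actual interface or the team’s trained model.

```python
# Hedged sketch: translating decoded speech-related features into
# facial-animation control values. All names and sizes are assumptions.
import numpy as np

CONTROLS = ["jaw_open", "lip_round", "tongue_up", "smile"]
N_FEATURES = 16  # assumed size of the decoded feature vector

rng = np.random.default_rng(1)
# Random weights stand in for a mapping that would be learned from data.
W = rng.standard_normal((len(CONTROLS), N_FEATURES)) * 0.1

def features_to_controls(feat: np.ndarray) -> dict:
    """Project decoded features onto animation controls, clipped to [0, 1]."""
    raw = W @ feat
    return dict(zip(CONTROLS, np.clip(raw, 0.0, 1.0)))

frame = rng.standard_normal(N_FEATURES)  # one decoded frame (synthetic)
print(features_to_controls(frame))
```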
Graduate student Kaylo Littlejohn, who works with Dr. Chang and Gopala Anumanchipalli, PhD, a professor of electrical engineering and computer sciences at UC Berkeley, described the significance of the work: “We’re bridging the severed connections between the brain and the vocal tract caused by the stroke. When the subject first synchronized speaking with the avatar’s facial movements, I recognized the profound impact this technology would have.”
Next, the team aims to create a wireless version of the system that eliminates the physical connection to the BCI, a vital step toward transforming the lives of individuals facing communication challenges.