Paralysis had robbed the two women of their ability to speak. For one, the cause was amyotrophic lateral sclerosis, or ALS, a disease that affects the motor neurons. The other had suffered a stroke in her brain stem. Though they can’t enunciate clearly, they remember how to formulate words.
Now, after volunteering to receive brain implants, both are able to communicate through a computer at a speed approaching the tempo of normal conversation. By parsing the neural activity associated with the facial movements involved in talking, the devices decode their intended speech at a rate of 62 and 78 words per minute, respectively—several times faster than the previous record. Their cases are detailed in two papers published Wednesday by separate teams in the journal Nature.
“It is now possible to imagine a future where we can restore fluid conversation to someone with paralysis, enabling them to freely say whatever they want to say with an accuracy high enough to be understood reliably,” said Frank Willett, a research scientist at Stanford University’s Neural Prosthetics Translational Laboratory, during a media briefing on Tuesday. Willett is an author on a paper produced by Stanford researchers; the other was published by a team at UC San Francisco.
While slower than the roughly 160-word-per-minute rate of natural conversation among English speakers, scientists say it’s an exciting step toward restoring real-time speech using a brain-computer interface, or BCI. “It is getting close to being used in everyday life,” says Marc Slutzky, a neurologist at Northwestern University who wasn’t involved in the new studies.
A BCI collects and analyzes brain signals, then translates them into commands to be carried out by an external device. Such systems have allowed paralyzed people to control robotic arms, play video games, and send emails with their minds. Previous research by the two groups showed it was possible to translate a paralyzed person’s intended speech into text on a screen, but with limited speed, accuracy, and vocabulary.
In the Stanford study, researchers developed a BCI that uses the Utah array, a tiny square sensor that looks like a hairbrush with 64 needle-like bristles. Each is tipped with an electrode, and together they collect the activity of individual neurons. Researchers then trained an artificial neural network to decode brain activity and translate it into words displayed on a screen.
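For a sense of what such a decoder actually receives, here is a minimal, hypothetical sketch (not the Stanford team's code) of turning spike times from a multi-electrode array into binned firing-rate features; the electrode count, bin size, and spike data are all illustrative assumptions.

```python
import numpy as np

# Hypothetical illustration: convert spike times from a multi-electrode array
# into binned firing-rate features that a decoder could consume.
# All numbers below are assumptions for illustration, not the published setup.

N_ELECTRODES = 256          # e.g., four 64-electrode arrays
BIN_MS = 20                 # decoders commonly use bins of a few tens of ms (assumption)
DURATION_MS = 4000          # a 4-second attempted sentence (made up)

rng = np.random.default_rng(0)

# Fake spike times (in ms) per electrode, standing in for recorded threshold crossings.
spike_times = [np.sort(rng.uniform(0, DURATION_MS, rng.integers(20, 200)))
               for _ in range(N_ELECTRODES)]

n_bins = DURATION_MS // BIN_MS
features = np.zeros((n_bins, N_ELECTRODES))
for ch, times in enumerate(spike_times):
    counts, _ = np.histogram(times, bins=n_bins, range=(0, DURATION_MS))
    features[:, ch] = counts / (BIN_MS / 1000.0)   # counts per bin -> spikes per second

print(features.shape)   # (200, 256): time bins x electrodes, ready to feed a neural network
```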
They tested the system on volunteer Pat Bennett, the ALS patient, who is now 68 years old. In March 2022, a surgeon inserted four of these tiny sensors into Bennett’s cerebral cortex—the outermost layer of the brain. Thin wires connect the arrays to pedestals atop her head, which can be hooked up to a computer via cables.
Over the course of four months, scientists trained the software by asking Bennett to try to say sentences out loud. (Bennett can still produce sounds, but her speech is unintelligible.) Eventually, the software taught itself to recognize the distinct neural signals associated with the lip, jaw, and tongue movements she was attempting in order to produce different sounds. From there, it mapped that activity onto the sounds that make up words, predicted likely sequences of words, and strung together sentences on a computer screen.
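Both papers describe neural networks trained for this kind of decoding; the sketch below is a hypothetical stand-in showing the general shape such a model could take, with binned neural features going in and per-time-step sound-unit (phoneme) probabilities coming out. The architecture, layer sizes, and phoneme count are assumptions for illustration, not the published models.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of a sequence decoder of the general kind described:
# binned neural features in, phoneme probabilities out. Sizes and architecture
# are illustrative assumptions, not the teams' published models.

N_FEATURES = 256      # neural feature channels per time bin (assumption)
N_PHONEMES = 41       # roughly 39 English phonemes plus silence/blank tokens (assumption)

class PhonemeDecoder(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.rnn = nn.GRU(N_FEATURES, hidden, num_layers=2, batch_first=True)
        self.out = nn.Linear(hidden, N_PHONEMES)

    def forward(self, x):            # x: (batch, time_bins, N_FEATURES)
        h, _ = self.rnn(x)
        return self.out(h)           # (batch, time_bins, N_PHONEMES) logits

model = PhonemeDecoder()
features = torch.randn(1, 200, N_FEATURES)   # one fake 4-second utterance of binned features
phoneme_logits = model(features)
greedy = phoneme_logits.argmax(dim=-1)       # naive per-bin phoneme guesses
print(greedy.shape)                          # torch.Size([1, 200])
# In the real systems, a language model then turns phoneme sequences into likely
# words and sentences, which is what appears on the screen.
```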
With the help of the device, Bennett was able to communicate at an average rate of 62 words per minute. The BCI made mistakes 23.8 percent of the time on a 125,000-word vocabulary. The previous record was just 18 words per minute, set in 2021 when members of the Stanford team published a paper describing a BCI that converted a paralyzed person’s imagined handwriting into text on a screen.
In the second paper, researchers at UCSF built a BCI using an array that sits on the surface of the brain rather than inside it. A paper-thin rectangle studded with 253 electrodes, it detects the activity of many neurons across the speech cortex. They placed this array on the brain of a stroke patient named Ann and trained a deep-learning model to decipher neural data it collected as she moved her lips without making sounds. Over several weeks, Ann repeated phrases from a 1,024-word conversational vocabulary.
Like Stanford’s AI, the UCSF team’s algorithm was trained to recognize the smallest units of language, called phonemes, rather than whole words. Eventually, the software was able to translate Ann’s intended speech at a rate of 78 words per minute—far better than the 14 words per minute she was used to on her type-to-talk communication device. Its error rate was 4.9 percent when decoding sentences from a 50-phrase set, and simulations estimated a 28 percent word error rate using a vocabulary of more than 39,000 words.
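The error rates quoted by both teams are word error rates: the number of word substitutions, insertions, and deletions needed to turn the decoded sentence into the intended one, divided by the length of the intended sentence. A small, self-contained example of the standard calculation, using made-up sentences:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

# Hypothetical example: one wrong word out of five is a 20 percent error rate.
print(word_error_rate("i would like some water", "i would like some waiter"))  # 0.2
```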
The UCSF group, led by neurosurgeon Edward Chang, had previously used a similar surface array with fewer electrodes to translate intended speech from a paralyzed man into text on a screen. Their record had been about 15 words per minute. Their current BCI is not only faster; it also goes a step further by turning Ann’s brain signals into audible speech voiced by a computer.
The researchers created a “digital avatar” to relay Ann’s intended speech aloud. They customized an animated woman to have brown hair like Ann’s and used video footage from her wedding to make the avatar’s voice sound like hers. “Our voice and expressions are part of our identity, so we wanted to embody a prosthetic speech that could make it more natural, fluid, and expressive,” Chang said during Tuesday’s media briefing. He thinks his team’s work could eventually allow people with paralysis to have more personalized interactions with their family and friends.
There are trade-offs to both groups’ approaches. Implanted electrodes, like the ones the Stanford team used, record the activity of individual neurons, which tends to provide more detailed information than a recording from the brain’s surface. But they’re also less stable, because implanted electrodes can shift within the brain; even a movement of a millimeter or two changes the recorded activity. “It is hard to record from the same neurons for weeks at a time, let alone months to years at a time,” Slutzky says. And over time, scar tissue forms around the site of an implanted electrode, which can also affect the quality of a recording.
On the other hand, a surface array captures less detailed brain activity but covers a bigger area. The signals it records are more stable than the spikes of individual neurons since they’re derived from thousands of neurons, Slutzky says.
During the briefing, Willett said the current technology is limited due to the number of electrodes that can be safely placed in the brain at once. “Much like how a camera with more pixels yields a sharper image, using more electrodes will give us a clearer picture of what is happening in the brain,” he said.
Leigh Hochberg, a neurologist at Massachusetts General Hospital and Brown University who worked with the Stanford group, says 10 years ago few people would have imagined that it would someday be possible to decode the attempted speech of a person simply by recording their brain activity. “I want to be able to tell my patients with ALS, or brainstem stroke, or other forms of neurologic disease or injury, that we can restore their ability to communicate easily, intuitively, and rapidly,” Hochberg says.
Though still slower than typical speech, these new BCIs are faster than existing augmentative and alternative communication systems, writes Betts Peters, a speech-language pathologist at Oregon Health and Science University. These systems require users to type out or select messages using their fingers or eye gaze. “Being able to keep up with the flow of conversation could be an enormous benefit to many people with communication impairments, making it easier to fully participate in all aspects of life,” she told WIRED by email.
There are still some technological hurdles to creating an implantable device with these capabilities. For one, Slutzky says the error rate for both groups is still quite high for everyday use. By comparison, current speech recognition systems developed by Microsoft and Google have an error rate of around 5 percent.
Another challenge is the longevity and reliability of the device. A practical BCI will need to record signals constantly for years and not require daily recalibration, Slutzky says.
BCIs will also need to go wireless, shedding the clunky cables of current systems, so that patients can use them without being tethered to a computer. Companies such as Neuralink, Synchron, and Paradromics are all working on wireless systems.
“Already the results are incredible,” says Matt Angle, founder and CEO of Austin-based Paradromics, who wasn’t involved in the new papers. “I think we will start seeing rapid progress toward a medical device for patients.”