Word-of-Mind: Researchers Decode Words from the Brain's Auditory Activity

Interpreting signals from the brain's language-processing center may improve speech-recognition technology or provide a means for the severely disabled to communicate

By Daisy Yuhas

Oh, to be a fly on the auditory cortex!

That, in a manner of speaking, is exactly what a group of researchers working in Berkeley and San Francisco have done. Measurements of electrical signals in the region of the brain that processes speech enabled the group to decode the words a subject was hearing—in essence, a form of neural eavesdropping.

The goal was far nobler than finding out what your boss really thinks of you or what is going on in the neighboring cubicle. The research sheds light on how the brain sorts out sounds and turns them into language. "The hope," says Brian Pasley, a post-doctoral researcher at the University of California, Berkeley, and lead author on the study, "is that this knowledge can be utilized to help restore communication in the severely disabled." The work could complement other efforts to reconstruct speech using muscle movements in the vocal tract, lips and tongue.

The researchers (who also hailed from the University of Maryland, College Park, Johns Hopkins University and the University of California, San Francisco) published their work today in PLoS Biology.

During the experiment, subjects listened to words on a loudspeaker or piped through earbuds: sometimes just isolated words like "jazz" or "property"; pseudo-words like "fook" and "nim"; and in a few cases full sentences. Later, the research team studied a record of this activity as it appeared in the brain's auditory cortex, the region that processes what is heard, allowing the comprehension of language, along with other sounds.

The subjects, 15 volunteers with normal language skills, also happened to be undergoing neurosurgical treatments for epilepsy or brain tumors. Because their brain activity was already being monitored at the cortex's surface for seizures, the researchers could examine these direct cortical measurements for their auditory study. Pasley explains that it would have been impossible to have access to such brain scans without these volunteers.

Pasley and colleagues crafted an algorithm, a computational model, to map the sound a listener was hearing to the electrodes' measurements. The model could then "learn" how to match sound to the brain's electrical signals.

Next, researchers tested their model by turning the tables: Starting with a listener's brain activity, they used the model to reconstruct the word that a listener had heard. Specifically, the model reconstructed a sound, resembling but not immediately recognizable as a word. To close the loop, the researchers then looked through a set of 47 words to find one that most closely matched the model’s sound.
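The two steps described above, learning a mapping between sound and electrode signals and then picking the closest word from a fixed set, can be illustrated with a minimal Python sketch. This is not the study's actual model: the ridge-regularized linear fit, the correlation score, the made-up `mixing` matrix that stands in for the brain, and all function names are assumptions chosen for illustration, and the data are synthetic random numbers rather than recordings.

```python
import numpy as np


def fit_linear_decoder(neural, spectrogram, ridge=1.0):
    """Fit a ridge-regularized linear map from electrode signals
    (time x electrodes) to a sound spectrogram (time x frequency bins)."""
    X = np.asarray(neural, dtype=float)
    Y = np.asarray(spectrogram, dtype=float)
    gram = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ Y)


def reconstruct(neural, weights):
    """Turn electrode signals back into an estimated spectrogram."""
    return np.asarray(neural, dtype=float) @ weights


def best_match(reconstruction, candidates):
    """Return the candidate word whose spectrogram correlates most
    strongly with the reconstructed one."""
    flat = reconstruction.ravel()
    scores = {
        word: np.corrcoef(flat, spec.ravel())[0, 1]
        for word, spec in candidates.items()
    }
    return max(scores, key=scores.get)


# Synthetic stand-in for the experiment (all values are invented).
rng = np.random.default_rng(0)
n_freq, n_elec = 8, 16
mixing = rng.normal(size=(n_freq, n_elec))  # hypothetical sound-to-electrode map


def simulate_recording(spec):
    """Pretend electrode signals: a linear mix of the sound plus noise."""
    noise = 0.1 * rng.normal(size=(spec.shape[0], n_elec))
    return spec @ mixing + noise


# "Learn" the mapping from a stretch of training sound.
train_sound = rng.normal(size=(200, n_freq))
weights = fit_linear_decoder(simulate_recording(train_sound), train_sound)

# Turn the tables: start from brain activity and identify the word.
words = ["jazz", "property", "fook", "nim"]
candidates = {word: rng.normal(size=(20, n_freq)) for word in words}
heard = simulate_recording(candidates["property"])
guess = best_match(reconstruct(heard, weights), candidates)
print(guess)
```

In this toy setup the reconstruction is compared with only four candidates; the researchers searched a set of 47 words, and real cortical signals are far noisier than the simulated ones here.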

Not only could they successfully "eavesdrop" via cortical activity, the researchers also created two versions of their model to account for different features of sound. One version of their computational model made use of a linear representation of sound, called a spectrogram, which plots the energy at each sound frequency over time. The other version used a nonlinear representation of sound called a modulation model. Pasley explains that in the linear version, sound rhythms are coded by the brain's oscillations whereas in the nonlinear version, rhythms are conveyed by the overall level of brain activity. At a slow speech rhythm, both models work well, but at faster rhythms the nonlinear sound representation creates a more accurate model.
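For readers curious about what a spectrogram contains, here is a minimal sketch that computes one with a short-time Fourier transform in NumPy. The frame length, hop size, window and the 1,000-hertz test tone are illustrative assumptions, not parameters taken from the study, and the modulation model is not shown.

```python
import numpy as np


def spectrogram(signal, frame_len=256, hop=128):
    """Power spectrogram: rows are time frames, columns are frequency bins."""
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2


# One second of a pure 1,000 Hz tone sampled at 8,000 Hz (invented test signal).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
freqs = np.fft.rfftfreq(256, d=1 / sample_rate)  # center frequency of each bin
peak_hz = freqs[spec.mean(axis=0).argmax()]      # frequency with the most energy
print(spec.shape, peak_hz)
```

A steady tone puts nearly all of its energy in a single frequency column at every time frame; speech, by contrast, produces patterns that shift from frame to frame, and those shifting patterns are what a reconstruction tries to recover.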

This technique could improve speech-recognition technology. Although smartphones do a decent job, anyone who has received a cryptic Google Voice transcription knows that speech recognition is still not perfect.

The work parallels research published this fall from U.C. Berkeley in another sensory realm—a computational model that reconstructed the images that subjects were watching in movie trailers.

An obvious follow-up question about Pasley and his colleagues' research: Will this make it possible to read words that we silently vocalize to ourselves, such as "Oh no, not him again"? Pasley explains that the research applies to actual sound that a listener hears. Whether the same regions of the brain are involved in the words we sub-vocalize remains unclear.

The experiment demonstrates, though, that it does not take a mind reader to listen in on the subtle processing of the brain at work.
