Researchers hope brain implants will one day help people who have lost the ability to speak to get their voice back—and maybe even to sing. Now, for the first time, scientists have demonstrated that the brain’s electrical activity can be decoded and used to reconstruct music.
A new study analyzed data from 29 people who were already being monitored for epileptic seizures using postage-stamp-size arrays of electrodes that were placed directly on the surface of their brain. As the participants listened to Pink Floyd’s 1979 song “Another Brick in the Wall, Part 1,” the electrodes captured the electrical activity of several brain regions attuned to musical elements such as tone, rhythm, harmony and lyrics. Employing machine learning, the researchers reconstructed garbled but distinctive audio of what the participants were hearing. The study results were published on Tuesday in PLOS Biology.
Neuroscientists have worked for decades to decode what people are seeing, hearing or thinking from brain activity alone. In 2012 a team that included the new study’s senior author—cognitive neuroscientist Robert Knight of the University of California, Berkeley—became the first to successfully reconstruct audio recordings of words participants heard while wearing implanted electrodes. Others have since used similar techniques to reproduce recently viewed or imagined pictures from participants’ brain scans, including human faces and landscape photographs. But the recent PLOS Biology paper by Knight and his colleagues is the first to suggest that scientists can eavesdrop on the brain to synthesize music.
“These exciting findings build on previous work to reconstruct plain speech from brain activity,” says Shailee Jain, a neuroscientist at the University of California, San Francisco, who was not involved in the new study. “Now we’re able to really dig into the brain to unearth the sustenance of sound.”
To turn brain activity into musical sound, the researchers trained an artificial intelligence model to decipher recordings from thousands of electrodes that were attached to the participants as they listened to the Pink Floyd song while undergoing surgery.
Why did the team choose Pink Floyd—and specifically “Another Brick in the Wall, Part 1”? “The scientific reason, which we mention in the paper, is that the song is very layered. It brings in complex chords, different instruments and diverse rhythms that make it interesting to analyze,” says Ludovic Bellier, a cognitive neuroscientist and the study’s lead author. “The less scientific reason might be that we just really like Pink Floyd.”
The AI model analyzed patterns in the brain’s response to various components of the song’s acoustic profile, picking apart changes in pitch, rhythm and tone. Then another AI model reassembled this disentangled composition to estimate the sounds that the patients heard. Once the brain data were fed through the model, the music returned. Its melody was roughly intact, and its lyrics were garbled but discernible if one knew what to listen for: “All in all, it was just a brick in the wall.”
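For readers curious about what "decoding" means in practice, here is a toy sketch of the general idea: learn a mapping from a recorded signal to a feature of the sound, then use that mapping to reconstruct a stretch of sound the model never saw. It is not the study's model. The loudness envelope, the single simulated electrode, the noise level and the function names are all invented for illustration.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares fit of y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


rng = random.Random(0)  # fixed seed so the toy data are repeatable

# Hypothetical loudness envelope of a song, one value per time step.
envelope = [0.5 + 0.5 * math.sin(2 * math.pi * t / 40) for t in range(400)]

# Hypothetical electrode: a scaled, shifted, noisy copy of the envelope.
electrode = [3.0 * s + 1.0 + rng.gauss(0, 0.3) for s in envelope]

# Learn the electrode-to-sound mapping on the first 300 time steps...
slope, intercept = fit_line(electrode[:300], envelope[:300])

# ...then reconstruct the sound for the 100 time steps held back.
reconstructed = [slope * e + intercept for e in electrode[300:]]
r = pearson(reconstructed, envelope[300:])
print(f"correlation between reconstructed and actual sound: {r:.2f}")
```

A correlation near 1 would mean the reconstruction closely tracks the real sound. Real recordings involve many electrodes, many sound features and far messier signals, which is one reason the reconstructed song in the study sounds garbled.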
The model also revealed which parts of the brain responded to different musical features of the song. The researchers found that some portions of the brain’s audio processing center—located in the superior temporal gyrus, just behind and above the ear—respond to the onset of a voice or a synthesizer, while other areas groove to sustained hums.
Although the findings focused on music, the researchers expect their results to be most useful for translating brain waves into human speech. No matter the language, speech contains melodic nuances, including tempo, stress, accents and intonation. “These elements, which we call prosody, carry meaning that we can’t communicate with words alone,” Bellier says. He hopes the model will improve brain-computer interfaces, assistive devices that record speech-associated brain waves and use algorithms to reconstruct intended messages. This technology, still in its infancy, could help people who have lost the ability to speak because of conditions such as stroke or paralysis.
Jain says future research should investigate whether these models can be expanded from music that participants have heard to imagined internal speech. “I’m hopeful that these findings would translate because similar brain regions are engaged when people imagine speaking a word, compared with physically vocalizing that word,” she says. If a brain-computer interface could re-create someone’s speech with the inherent prosody and emotional weight found in music, it could reconstruct far more than just words. “Instead of robotically saying, ‘I. Love. You,’ you can yell, ‘I love you!’” Knight says.
Several hurdles remain before we can put this technology in the hands, or brains, of patients. For one thing, the model relies on electrical recordings taken directly from the surface of the brain. As brain recording techniques improve, it may be possible to gather these data without surgical implants, perhaps using ultrasensitive electrodes attached to the scalp instead. Scalp electrodes can be used to identify single letters that participants imagine in their head, but the process takes about 20 seconds per letter, nowhere near the speed of natural speech, which hurries by at around 125 words per minute.
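To put that gap in numbers, the short calculation below converts 20 seconds per letter into words per minute and compares it with natural speech. The two rates come from the figures above. The average of five letters per word is an assumption made only for this rough estimate.

```python
SECONDS_PER_LETTER = 20        # scalp-electrode letter decoding, as quoted above
SPEECH_WORDS_PER_MINUTE = 125  # natural speech, as quoted above
LETTERS_PER_WORD = 5           # assumption: rough average for English words

letters_per_minute = 60 / SECONDS_PER_LETTER          # 3 letters each minute
decoded_wpm = letters_per_minute / LETTERS_PER_WORD   # 0.6 words each minute
slowdown = SPEECH_WORDS_PER_MINUTE / decoded_wpm      # a little over 208

print(f"decoded speed: {decoded_wpm:.1f} words per minute")
print(f"natural speech is about {slowdown:.0f} times faster")
```

Under that assumption, letter-by-letter decoding manages well under one word per minute, roughly 200 times slower than ordinary conversation.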
The researchers hope to make the garbled playback crisper and more comprehensible by packing the electrodes closer together on the brain’s surface, enabling an even more detailed look at the electrical symphony the brain produces. Last year a team at the University of California, San Diego, developed a densely packed electrode grid that offers brain-signal information at a resolution that is 100 times higher than that of current devices. “Today we reconstructed a song,” Knight says. “Maybe tomorrow we can reconstruct the entire Pink Floyd album.”