[Below is the original script. But a few changes may have been made during the recording of this audio podcast.]
Please note: this podcast is longer than the usual one minute because it includes quotes from an interview conducted with David Poeppel at the American Association for the Advancement of Science meeting in Chicago.
Here’s a challenge: if we want machines to do what we do, we had better understand ourselves as computational beings first.
Take language. How do we go from squiggly waveforms in the ear to ideas in the head? “To do that, you have to be able to do two things to recognize words.” That’s David Poeppel, professor of psychology at New York University. “You have to actually extract the order of the...phonemes themselves, like pets versus pests, but you have to do something else, you have to extract that part of the syllable structure that carries the intonation.”
The brain must register the tiny, one-letter difference between pets and pests, and at the same time track the longer intonation of the syllables. (Syllables universally last about one quarter of a second.)
So the brain needs to simultaneously analyze two things at very different speeds. Apparently we have “multiple clocks going on, multiple brain rhythms that actually analyze the signals simultaneously.”
But how do you do that?
“You basically chop up the auditory world into rapid small chunks,” these are the phonemes, “and slightly bigger chunks,” which are the syllables. Even though our brains chop words into two discrete chunks of time, we have the perceptual illusion of sound arriving as a continuous stream. By scanning brains, Poeppel has linked language processing to neurons that fire at different frequencies.
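To make the “multiple clocks” idea concrete, here is a minimal Python sketch, not Poeppel’s actual model, that measures the same signal in fast, roughly phoneme-sized chunks and slower, roughly syllable-sized chunks at the same time. The window lengths, sample rate and synthetic test signal are illustrative assumptions.

```python
# Toy sketch: analyze one signal at two timescales at once.
# The 25 ms and 250 ms windows are assumed, illustrative values.
import numpy as np

SAMPLE_RATE = 16_000      # samples per second (assumed)
PHONEME_WINDOW = 0.025    # ~25 ms: fast, phoneme-scale chunks
SYLLABLE_WINDOW = 0.250   # ~250 ms: slower, syllable-scale chunks


def windowed_rms(signal: np.ndarray, window_seconds: float, rate: int) -> np.ndarray:
    """Split the signal into fixed-length chunks and return each chunk's RMS energy."""
    window = int(window_seconds * rate)
    n_chunks = len(signal) // window
    chunks = signal[: n_chunks * window].reshape(n_chunks, window)
    return np.sqrt(np.mean(chunks ** 2, axis=1))


if __name__ == "__main__":
    # One second of fake "speech": a slow, syllable-rate envelope (4 Hz)
    # modulating a faster 200 Hz carrier.
    t = np.linspace(0, 1, SAMPLE_RATE, endpoint=False)
    envelope = 0.5 * (1 + np.sin(2 * np.pi * 4 * t))
    signal = envelope * np.sin(2 * np.pi * 200 * t)

    fast = windowed_rms(signal, PHONEME_WINDOW, SAMPLE_RATE)    # 40 chunks per second
    slow = windowed_rms(signal, SYLLABLE_WINDOW, SAMPLE_RATE)   # 4 chunks per second

    print(f"phoneme-scale chunks per second: {len(fast)}")
    print(f"syllable-scale chunks per second: {len(slow)}")
```

The fast track captures quick changes like the s in pests, while the slow track follows the syllable-rate envelope, a rough analogue of the two brain rhythms described above.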
What’s all this got to do with machines?
Well, we want computers and cell phones to recognize every word we utter, and yet speech recognition software is, "...terrible. Just terrible. And one of the reasons is the way the machines are built is nothing like the way the human brain does it. And so one of the goals we have of course is to use some of the insights from human psychophysics and in particular neuroscience and explore what the consequences are for automatic speech recognition. It would be very helpful if you could walk up to a machine and say, hi it's me, I need forty bucks from my checking account, and you can verify it by my voice, but not just for a small, closed vocabulary, but...I mean our conversation is remarkable, you've never heard my voice, you've never seen me before, and you've never heard any of the stuff I'm telling you now. How is it that you understand it immediately? Sort of reflexively? We must have devices that allow you to normalize my weird voice and my weird face and extract the words and match them to the dictionary stuff that is stored in your brain. I mean that is pretty remarkable. Some sounds are coming out of my mouth, the squiggles are going to your ear and those squiggles get translated to ideas, that match ideas, or don't match ideas, in your brain. The fact that you can do it all is kind of miraculous."
And to replicate this miracle, scientists need to crack the so-called neural code. Hm, it’s likely, then, that those automated voice menus on phones are going to frustrate us for many more years to come.
—Christie Nicholson