Bare-Bones Program Learns English and Japanese Vowels

Computer model learns vowel sounds infant-style: on the fly

A new computer model has learned to recognize vowel categories from multiple English and Japanese speakers without "knowing" the number of vowels it is looking for or having a complete list of sounds to analyze, according to a new report. Instead, it gradually lumps vowels into distinct groups by considering them one at a time, reminiscent of how an infant might attend to sounds.

The designers of the model say it is an early step toward improved voice recognition software and a better understanding of how the infant mind comes to recognize that the voices it detects are speaking one language and not another.

"We see this work as representing a movement towards thinking about language learning as an experience-dependent process," says James McClelland, professor of psychology at Stanford University and co-author of the report appearing online in Proceedings of the National Academy of Sciences USA.

Psychologist Janet Werker of the University of British Columbia in Vancouver recorded mothers in a laboratory speaking nonsensical sounds in English or Japanese to infants. Roughly speaking, both languages have five vowels that each come in a long and a short form. The English forms, as in "bait" and "bet," differ in frequency, whereas the Japanese forms differ in the duration of the sound.

Trying to distinguish the "i" and "e" vowel forms for each language, McClelland, Werker and their colleagues converted each recorded vowel sound into three numbers that represented the duration of the sound and its two dominant frequencies. Then they fed these values into their model one vowel at a time.

The program placed each value on a continuum of the many durations or frequencies that could define a spoken vowel. Each new value reinforced particular durations or frequencies, gradually building up a three-dimensional space for each vowel form.
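To make the one-sound-at-a-time idea concrete, here is a minimal Python sketch of incremental clustering over three-number vowel tokens (duration plus two frequencies). It is not the researchers' model, whose update rule the report summary above does not spell out. It swaps in a simple stand-in: a distance threshold decides whether a token joins an existing category or opens a new one, and each category keeps a running mean. The class names, the threshold and the token values are all hypothetical, and the features are assumed to be rescaled to comparable units.

```python
import math
from dataclasses import dataclass


@dataclass
class Category:
    """Running mean of the tokens assigned to one vowel category."""
    mean: list
    count: int = 1

    def update(self, token):
        # Incremental mean: move the centre toward the new token by 1/count.
        self.count += 1
        self.mean = [m + (x - m) / self.count for m, x in zip(self.mean, token)]


class OnlineVowelLearner:
    """Hypothetical learner that sees one (duration, f1, f2) token at a time."""

    def __init__(self, threshold):
        # Assumed rule: tokens farther than `threshold` from every
        # existing category start a new category.
        self.threshold = threshold
        self.categories = []

    def observe(self, token):
        """Assign one token and return the index of its category."""
        token = list(token)
        if self.categories:
            dists = [math.dist(c.mean, token) for c in self.categories]
            best = min(range(len(dists)), key=dists.__getitem__)
            if dists[best] <= self.threshold:
                self.categories[best].update(token)
                return best
        # No category is close enough, so open a new one.
        self.categories.append(Category(mean=token))
        return len(self.categories) - 1


# Invented tokens in rescaled units: two short vowels, then long and short mixed.
tokens = [
    (0.20, 0.30, 0.80),
    (0.25, 0.32, 0.78),
    (0.80, 0.30, 0.80),
    (0.22, 0.28, 0.82),
    (0.78, 0.31, 0.79),
]

learner = OnlineVowelLearner(threshold=0.3)
labels = [learner.observe(t) for t in tokens]
print(labels, len(learner.categories))
```

The learner is never told how many categories exist and never revisits earlier tokens; in this toy data the number of groups emerges from the duration difference alone. A fixed threshold is a crude choice, and a probabilistic model of each category would be a more realistic alternative.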

After such training, the program correctly categorized up to 93 percent of vowels in English and 92 percent in Japanese, the group reports.

McClelland says prior language-learning models were less realistic because they repeatedly scanned a large set of sound data instead of processing one sound at a time.

Incorporating similar procedures might allow speech recognition software to adapt to different speakers of the same language and thus boost its accuracy, he adds.

He says the new model is hard to compare with infant learning because researchers don't know what sounds infants hear. "But," he adds, "it's pretty successful at what it does and it uses a set of principles we think are on the right track."
