Scientific American Magazine, April 2011

Solving the Cocktail Party Problem [Preview]

Computers have great trouble deciphering voices that are speaking simultaneously. That may soon change.

Image: Illustration by Bryan Christie

In Brief

  • Computers cannot yet solve the “cocktail party problem”: understanding speech when two or more people are talking at the same time.
  • A number of groups are making good progress, though, using various methods.
  • A multimedia feature, available at www.ScientificAmerican.com/apr2011/speech, describes the logic behind one leading approach in detail and lets you test your own ability to separate overlapping streams of chatter.

You are at a party, and Alex is telling a boring story. You are much more interested in the gossip that Sam is recounting to Pat, so you tune out Alex and focus on Sam’s words. Congratulations: you have just demonstrated the human ability to solve the “cocktail party problem”—to pick out one thread of speech from the babble of two or more people. Computers so far lack that power.

Although automated speech recognition is increasingly routine, it fails when faced with two people talking at once. Computerized speech separation would not only improve speech-recognition systems but could also advance many other endeavors that require separating signals, such as making sense of brain-scan images.


This article was originally published with the title Solving the Cocktail Party Problem.



2 Comments

  1. JacobSilver 04:40 PM 4/12/11

    To solve the cocktail party problem, computers would first have to analyze and recognize the distinctive wavelength characteristics of one conversation. Then they should be able to pick it out when it is immersed in the midst of two or more conversations. Seems doable.

  2. fooney 07:42 PM 4/13/11

    Humans achieve the cocktail party effect with substantial assistance from binaural hearing -- that is, the brain is adept at focusing attention on sound sources from specific and consistent spatial positions, which are determined largely by the 0.3 to 0.7 ms time delay between the two ears. Algorithms that emulate this approach, noting the time differential between two mics, should be able to fix the sound source in space, and then filter out sounds from other locations.

    An opposite effect is achieved in some movie theatres, where the speakers are wired with improper phase alignment. The resulting sound presents pseudo time-delay information to the nervous system's binaural decoding faculty, rendering speech relatively unintelligible, presumably because it seems to be coming randomly from multiple sources and the attention can't get a fix on it.
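
The two-microphone idea in the comment above can be sketched in a few lines. This is only an illustration, not the approach described in the article: it assumes a single source, two microphones with no noise or echo, and a far-field geometry. The function names, the 0.2 m microphone spacing, the 343 m/s speed of sound, and the 48 kHz sample rate are all assumptions made for the example. The sketch slides one signal against the other, picks the lag with the strongest match (a cross-correlation peak), and converts that delay to a direction.

```python
import math
import random


def estimate_delay_samples(left, right, max_lag):
    """Return the lag (in samples) at which `right` best matches `left`.

    A positive lag means the sound reached the left microphone first.
    """
    n = min(len(left), len(right))
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[i] with right[i + lag] over the overlapping samples.
        start = max(0, -lag)
        stop = min(n, n - lag)
        score = sum(left[i] * right[i + lag] for i in range(start, stop))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_angle_degrees(delay_s, mic_spacing_m=0.2, speed_of_sound=343.0):
    """Convert an inter-microphone delay to a bearing (far-field assumption)."""
    ratio = speed_of_sound * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding past +/-1
    return math.degrees(math.asin(ratio))


# Synthetic test signal: white noise that reaches the right mic 0.5 ms late.
rng = random.Random(0)
sample_rate = 48_000                      # assumed sample rate, in Hz
source = [rng.gauss(0.0, 1.0) for _ in range(2000)]
true_lag = 24                             # 24 samples at 48 kHz = 0.5 ms
left = source
right = [0.0] * true_lag + source[:-true_lag]

# Search only physically plausible delays (about 0.7 ms, per the comment).
max_lag = int(0.0007 * sample_rate) + 1
lag = estimate_delay_samples(left, right, max_lag)
delay_s = lag / sample_rate
angle = delay_to_angle_degrees(delay_s)
print(f"delay: {delay_s * 1000:.2f} ms, bearing: {angle:.1f} degrees")
```

Locating a source this way is only the first half of what the comment proposes. Filtering out sounds from other directions, and coping with overlapping talkers, reverberation, and noise, is where the real difficulty lies.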
