- Computers cannot yet solve the “cocktail party problem”—understanding speech when two or more people are talking at the same time.
- A number of groups are making good progress, though, using various methods.
- A multimedia feature, which is available at www.ScientificAmerican.com/apr2011/speech, describes the logic behind one leading approach in detail and allows you to test your own ability to separate overlapping streams of chatter.
You are at a party, and Alex is telling a boring story. You are much more interested in the gossip that Sam is recounting to Pat, so you tune out Alex and focus on Sam’s words. Congratulations: you have just demonstrated the human ability to solve the “cocktail party problem”—to pick out one thread of speech from the babble of two or more people. Computers so far lack that power.
Although automated speech recognition is increasingly routine, it fails when faced with two people talking at once. Computerized speech separation would not only improve speech-recognition systems but could also advance many other endeavors that require separating signals, such as making sense of brain-scan images.
This article was originally published with the title Solving the Cocktail Party Problem.