New Software Gives Robots the Gift of Hearing

A new approach allows “smart” machines to understand sounds other than speech

Robots can already discern and react to speech thanks to voice-recognition software such as the iPhone's Siri. But “smart” machines still struggle with most other sounds. “In some sense, it's almost a simpler problem, but there hasn't been a lot of work on noise in the environment,” says roboticist Joseph Romano of Rethink Robotics in Boston. “It hasn't been in the loop for robotic feedback.”

Now Romano is letting robots listen in on more than our conversations. He and his collaborators at the University of Pennsylvania have created a software tool called ROAR (short for Robot Operating System open-source audio recognizer) that allows roboticists to train machines to respond to a much wider range of sounds. As described in a recent issue of Autonomous Robots, the tool's chief hardware requirement is a microphone.

To begin training, the robot's microphone first captures ambient sounds, which ROAR scrubs of noisy static. Next the operator teaches ROAR to recognize key sounds by repeatedly performing a specific action—such as shutting a door or setting off a smartphone alarm—and tagging the unique audio signature while the robot listens. Finally, the program creates a general model of the sound of each action from that set of training clips.
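The train-by-example loop described above can be sketched in code. This is an illustrative toy, not ROAR's actual implementation: it reduces each tagged clip to a coarse, loudness-normalized spectrum, averages the repeated demonstrations of an action into a single template, and labels a new clip by its nearest template. The feature choice and the nearest-centroid classifier are assumptions made for brevity.

```python
# Hypothetical sketch of a train-by-example sound recognizer.
# Not ROAR's real pipeline: features and classifier are stand-ins.
import numpy as np

def features(clip, n_bins=16):
    """Reduce a raw audio clip to a coarse magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(clip))
    # Average the fine-grained spectrum into a fixed number of bins.
    bins = np.array_split(spectrum, n_bins)
    vec = np.array([b.mean() for b in bins])
    # Normalize so overall loudness does not dominate the comparison.
    return vec / (np.linalg.norm(vec) + 1e-9)

class SoundModel:
    def __init__(self):
        self.centroids = {}  # action label -> mean feature vector

    def train(self, label, clips):
        """Tag repeated demonstrations of one action (e.g. 'door_shut')."""
        feats = np.stack([features(c) for c in clips])
        self.centroids[label] = feats.mean(axis=0)

    def classify(self, clip):
        """Return the trained action whose template is closest to this clip."""
        f = features(clip)
        return min(self.centroids,
                   key=lambda k: np.linalg.norm(self.centroids[k] - f))
```

In use, the operator would call `train` once per action with several recorded repetitions, then `classify` on live microphone input; the static-scrubbing step the article mentions would happen before `features` is called.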


The group tested ROAR on a one-armed robot, improving the machine's ability to complete specific tasks. In one scenario, the robot attempted to autonomously grasp and activate an electric drill. Without any sonic feedback, the robot succeeded in only nine of 20 attempts, but its success rate doubled when it used ROAR: if, after grasping, the robot did not hear the whir of the electric motor, it adjusted its grip and tried again.
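The drill experiment amounts to a simple retry loop with audio in the feedback path. A minimal sketch follows; `grasp`, `pull_trigger`, `hear_motor`, and `adjust_grip` are hypothetical stand-ins for the robot's real control and recognition calls, not ROAR's API.

```python
# Hypothetical audio-feedback retry loop mirroring the drill experiment.
# The robot methods used here are illustrative stand-ins.
def grasp_and_start_drill(robot, max_attempts=20):
    """Grasp the drill, pull the trigger, and use sound to verify success.

    Returns the attempt number that succeeded, or None if all fail.
    """
    for attempt in range(1, max_attempts + 1):
        robot.grasp()
        robot.pull_trigger()
        if robot.hear_motor():   # audio confirms the drill spun up
            return attempt
        robot.adjust_grip()      # no whir heard: regrip and retry
    return None
```

The design point is that the classifier's output (did we hear the motor?) closes the loop, letting the robot detect its own failures instead of blindly assuming the grasp worked.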

The next step is to ensure the system works in loud environments. Integrating audio into a robot's feedback loop alongside visual and tactile cues could someday allow robotic nurses to rapidly respond to cries for help or enable factory robots to react when something breaks. Although the technology is in early stages, Romano thinks the potential is enormous. “We haven't even begun to explore what we can do,” he says.
