Computer-generated speech has improved during the past decade, becoming significantly more intelligible and easier to listen to. But researchers now face a more formidable challenge: bringing synthesized speech closer to that of real humans--by giving it the ability to modulate tone and expression, for example--so that it can better communicate meaning. This elusive goal requires a deep understanding of the components of speech and of the subtle effects of a person's volume, pitch, timing and emphasis. It is the aim of our research group at IBM, of groups at other U.S. companies such as AT&T, Nuance, Cepstral and ScanSoft, and of investigators at institutions including Carnegie Mellon University, the University of California at Los Angeles, the Massachusetts Institute of Technology and the Oregon Graduate Institute. Like earlier phrase-splicing approaches, the latest generation of speech technology--our version is code-named the IBM Natural Expressive Speech Synthesizer, or the NAXPRES Synthesizer--is based on recordings of human speakers and can respond in real time. The difference is that the new systems can say anything at all, including natural-sounding words the recorded speakers never said.