Use It Better: Getting the Most Out of Speech Recognition, by David Pogue

Four tips to improve how your computer understands you
















In my December 2010 column, "Talk to the Machine," I describe how speech-recognition programs have advanced. I’ve used such software since 1997, when a year-long bout with a painful wrist ailment called tenosynovitis forced me to try it. Over the years, I’ve picked up enough tips and tricks to give me 100 percent accuracy most of the time.

 

* Replace the headset. Nuance’s Mac and Windows dictation programs come with a cheap USB headset. A nicer one improves the accuracy. There are wireless Bluetooth models, headsets and even desktop microphones. They’re all reviewed and sold at emicrophones.com.

 

* Correct by voice. When the software gets a word wrong, don’t fix it by typing. Use the command “correct ‘oak wrap’” (or whatever the error was), and choose the correct alternative from the list that pops up. That trains the software, so it will never make that mistake again.

 

* Help it help you. Beginners tend to speak loudly, or slowly, or in short bursts. But speech-recognition apps are designed to transcribe normal speech. Think ahead so that you can utter complete sentences or phrases, and then say them conversationally.

 

* It’s not all Nuance. If you use Windows Vista or Windows 7, you already own an excellent, little-known dictation app. Windows Speech Recognition is not quite as accurate as Dragon NaturallySpeaking, and of course you have to supply your own microphone. But it lets you dictate into any program, it has a great interactive tutorial, and it even lets you control menu commands and mouse clicks. It’s a great way to see how you might get along with speech recognition without spending any money.



4 Comments

  1. jzernitz 07:35 PM 11/22/10

    Interesting - I have to wonder how the software copes with the unbelievably slovenly speech patterns of many young - and some not-so-young - Americans, especially males. I suppose it can adjust for some very common things like failing to pronounce "t" - Innernet, Innernational, etc. - maybe even the common dropping of final 'd' - "worl" for "world." But just listen around you! Especially pro athletes in interviews! Or some tour guides - I was on a tour with a guide and the combination of Boston accent and totally slurred phrases left all the tourists totally baffled. Even my 20-year-old granddaughter visiting from London said she couldn't understand a word he said. Any reports of how the software copes with this?

  2. roblanchard@comcast.net 11:33 AM 11/23/10

    As a longtime Dragon NaturallySpeaking user, I understand that the software is “data driven”. A recent NPR program interviewed some Google scientists who have been responsible for the company's voice-recognition initiative. Like Google, Nuance captures different pronunciations of the same word or combinations of words, probably hundreds of variations of accents, syllable emphasis, speed of speech, ages, pitch, even dropped letters, etc. The software then compares the dictation to this data set and makes a guess. I think that is why the training that was so necessary in the past is no longer needed and why voice-recognition phone answering systems seem to work better. Unfortunately for those of us who have to endure the “butchering of the King’s English,” or fortunately for those of us who dictate, this strategy accepts that mispronunciation is an unavoidable fact. Notwithstanding, Nuance makes a wonderful product; and David Pogue is correct when he says in the Scientific American article, “Every year Nuance releases another new version…. Usually it doesn’t add many new features. Instead it devotes most of its resources to a single goal: improving accuracy.”

  3. cwoodhouse 10:44 AM 12/10/10

    Thanks for your article, Mr. Pogue. I think you should have interviewed and described heavy-duty industrial users of speech recognition: radiologists. The speech recognition trade-off is tipping towards the computers and away from transcriptionists. I myself use DNS 100% of the time, sometimes in the PowerScribe shell. However, there are plenty of high-volume radiologists who cannot afford the time to correct the 1% errors. Voice recognition shifts the burden of editing from the transcriptionist to the radiologist, which is not always welcome. The greatest payback, however, is that report turn-around times are slashed from hours or days to minutes.

  4. dbtinc 09:29 AM 12/20/10

    I've used DNS for many years, and though it has improved along with my facility in using it, a 1% error rate is still annoying: each correction slows the process down and breaks my concentration. Interestingly, in my case it is very accurate with scientific language and less so with day-to-day language - must be that slovenly speech pattern of teenage males poking its head through!
