In my December 2010 column, "Talk to the Machine," I describe how speech-recognition programs have advanced. I’ve used such software since 1997, when a year-long bout with a painful wrist ailment called tenosynovitis forced me to try it. Over the years, I’ve picked up enough tips and tricks to give me 100 percent accuracy most of the time.
* Replace the headset. Nuance’s Mac and Windows dictation programs come with a cheap USB headset. A nicer one improves the accuracy. There are wireless Bluetooth models, headsets and even desktop microphones. They’re all reviewed and sold at emicrophones.com.
* Correct by voice. When the software gets a word wrong, don’t fix it by typing. Use the command “correct ‘oak wrap’” (or whatever the error was), and choose the correct alternative from the list that pops up. That trains the software, so it will never make that mistake again.
* Help it help you. Beginners tend to speak loudly, or slowly, or in short bursts. But speech-recognition apps are designed to transcribe normal speech. Think ahead so that you can utter complete sentences or phrases, and then say them conversationally.
* It’s not all Nuance. If you use Windows Vista or Windows 7, you already own an excellent, little-known dictation app. Windows Speech Recognition is not quite as accurate as Dragon NaturallySpeaking, and of course you have to supply your own microphone. But it lets you dictate into any program, it has a great interactive tutorial, and it even lets you control menu commands and mouse clicks. It’s a great way to see how you might get along with speech recognition without spending any money.








4 Comments
Interesting - I have to wonder how the software copes with the unbelievably slovenly speech patterns of many young - and some not-so-young - Americans, especially males. I suppose it can adjust for some very common things like failing to pronounce "t" - Innernet, Innernational, etc. - maybe even the common dropping of final 'd' - "worl" for "world." But just listen around you! Especially pro athletes in interviews! Or some tour guides - I was on a tour with a guide and the combination of Boston accent and totally slurred phrases left all the tourists totally baffled. Even my 20-year-old granddaughter visiting from London said she couldn't understand a word he said. Any reports of how the software copes with this?
As a longtime Dragon NaturallySpeaking user, I understand that the software is “data driven”. A recent NPR program interviewed some Google scientists that have been responsible for their voice recognition initiative. Like Google, Nuance captures different pronunciations of the same word or combinations of words, probably hundreds of variations of accents, syllable emphasis, speed of speech, ages, pitch, even dropped letters, etc. The software then compares the dictation to this data set and makes a guess. I think that is why the training that was so required in the past is not needed and why voice recognition phone answering systems seem to work better. Unfortunately for us who have to endure the “butchering of the King’s English” or fortunately for us who dictate, this strategy accepts that mispronunciation is an unavoidable fact. Notwithstanding, Nuance makes a wonderful product; and David Pogue is correct when he says in the Scientific American article, “Every year Nuance releases another new version…. Usually it doesn’t add many new features. Instead it devotes most of its resources to a single goal: improving accuracy.”
Thanks for your article, Mr. Pogue. I think you should have interviewed and described heavy-duty industrial users of speech recognition: radiologists. The speech recognition tradeoff is tipping towards the computers and away from transcriptionists. I myself use DNS 100% of the time, sometimes in the PowerScribe shell. However, there are plenty of high-volume radiologists who cannot afford the time to correct the 1% errors. Voice recognition shifts the burden of editing from the transcriptionist to the radiologist, which is not always welcome. The greatest payback, however, is that report turn-around times are slashed from hours or days to minutes.
I've used DNS for many years and though it has improved along with my facility in using it, a 1% error is still annoying and slows the process down on correction coupled with the loss of concentration. Interestingly in my case, it is very accurate with scientific language and less so with day to day - must be that slovenly speech pattern of teenage males poking its head thru!