These days, speech recognition applications are commonplace, but it hasn’t always been that way. Not long ago, you had to speak with a live operator to get a telephone number, or push the buttons on your telephone to access your bank account. Just imagine how widespread speech recognition applications will be in five or ten years.
The History of Speech Recognition
The best way to get a grasp on the future is to reflect on the past. The development of a new technology tends to accelerate over time, and that is certainly the case with speech recognition software. The enormous power of modern computing, especially the data handling capacity of the cloud, has placed speech recognition technology on the brink of a new automation revolution.
Speech recognition technology began its long journey to prominence in the early 1950s. Before that, a speech recognition machine could be found only in the minds of inventors and in science fiction radio programs. Bell Laboratories broke through the barrier in 1952 with its Audrey system. Audrey had a limited capacity to recognize spoken numbers, but it was a sign of things to come. A decade later, IBM demonstrated its Shoebox machine to visitors at the 1962 World’s Fair in Seattle. The Shoebox understood a maximum of 16 English words.
Speech Recognition Race
The race to the moon received a good deal more attention, but the effort to perfect speech recognition technology drew in a number of nations as well. Laboratories in countries including Great Britain, Japan and the Soviet Union began to develop hardware for recognizing human speech. Early research focused on recognizing the speech patterns associated with four vowel and nine consonant sounds. Progress was slow, and further advances had to wait for more computing power.
The U.S. Department of Defense began to fund speech recognition research in the 1970s. The DOD’s Speech Understanding Research (SUR) program led to the development of a speech recognition system called Harpy at Carnegie Mellon. Harpy could understand 1,011 words, and the system introduced a more efficient search technique called beam search. Speech recognition software depends on processing power to search a large database for a likely match, and beam search reduces that work by keeping only the most promising candidates at each step.
Also in the 1970s, Threshold Technology and Bell Laboratories began research on systems that could interpret multiple voices. Over the following decade, speech recognition systems became capable of interpreting thousands of words. The hidden Markov model, a statistical method for estimating the likelihood of spoken words, replaced fixed sound patterns as the primary means of interpreting speech. The frontier of automatic speech recognition was advancing rapidly.
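To make the idea of beam search more concrete, here is a minimal Python sketch. It is an illustration of the general technique, not a reconstruction of Harpy. The candidate words, the probabilities, and the names beam_search, STEPS, BIGRAMS and transition are all hypothetical, chosen only for this example. At each step, every surviving hypothesis is extended with each candidate word, and only the highest-scoring few are kept.

```python
import heapq
import math

def beam_search(steps, transition, beam_width=2):
    """Return the best (score, words) hypothesis found.

    steps: list of dicts mapping a candidate word to its acoustic log-probability.
    transition: function (previous word or None, word) -> log-probability.
    """
    beam = [(0.0, [])]  # (cumulative log-probability, words so far)
    for candidates in steps:
        expanded = []
        for score, words in beam:
            prev = words[-1] if words else None
            for word, acoustic in candidates.items():
                new_score = score + acoustic + transition(prev, word)
                expanded.append((new_score, words + [word]))
        # Prune: keep only the beam_width highest-scoring hypotheses.
        beam = heapq.nlargest(beam_width, expanded, key=lambda h: h[0])
    return max(beam, key=lambda h: h[0])

# Hypothetical acoustic scores for two time steps (made-up numbers).
STEPS = [
    {"call": math.log(0.6), "tall": math.log(0.4)},
    {"home": math.log(0.5), "hum": math.log(0.5)},
]

# Hypothetical word-pair probabilities; unseen pairs fall back to 0.1.
BIGRAMS = {(None, "call"): 0.5, ("call", "home"): 0.6}

def transition(prev, word):
    return math.log(BIGRAMS.get((prev, word), 0.1))

best_score, best_words = beam_search(STEPS, transition, beam_width=2)
print(best_words)
```

A smaller beam_width means less searching but a greater chance of discarding the correct hypothesis early, which is the trade-off any system of this kind has to balance.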
It wasn’t long before commercial applications of speech recognition technology began to appear. The medical profession, toy manufacturers and government agencies quickly put the technology to good use. The Kurzweil speech-to-text system could interpret 1,000 words, and IBM developed a machine capable of recognizing as many as 5,000 words. Nevertheless, these systems lacked the computing power to handle continuous speech, so the speaker had to pause after every word.
Automated speech recognition took off in the 1990s. Increasingly powerful computer processors finally made speech recognition software viable for the average person. Dragon Dictate was the first consumer product, but very few shoppers could afford the $9,000 price tag. The improved Dragon NaturallySpeaking edition was released seven years later at a reduced cost of $695. The program supported continuous speech at around 100 words per minute, but the user had to train the software for 45 minutes before getting down to business.
BellSouth created the first voice portal in 1996, but it was Google and the mobile telephone that finally brought speech recognition software into the 21st century. The accuracy of speech recognition software was limited to about 80 percent until Google developed the Google Voice Search app for the iPhone. The enormous power of cloud computing and the utility of mobile technology were a match made in speech recognition heaven. Speaking to a cell phone is much more convenient than typing on a tiny keyboard, and the processing capability of the cloud finally made it possible for speech recognition software to prove its usefulness. Data analysis limitations are far less of an obstacle than they once were, and voice search applications are leading speech recognition technology into the future.
Author: Jessica Kane