Anh Luu
April 10, 2017

You could compare the birth and growth of speech recognition to that of a child, because the similarities are striking. Like a small child, the first speech recognition system had a tiny vocabulary consisting only of digits. As the years flew by, voice recognition grew rapidly, fueling inspiration for science fiction movies like Star Wars (1977) and comic books like Iron Man (1968). Interestingly enough, the early 1970s were when voice recognition took off.

Speech Recognition: Infancy

While the forerunner of the interactive voice response (IVR) system, Bell Laboratories' Voder speech synthesizer, came about in the 1930s, voice recognition arrived more than a decade later because it was arguably a much more difficult endeavor. Unlike the Voder, which produced speech and could be manually ‘played’ by an operator like a piano, voice recognition was intended purely to understand speech, much like a sophisticated brain. The first voice recognition system, called Audrey, was developed in 1952. Like the Voder, Audrey was designed by Bell Laboratories. Audrey only recognized digits spoken by a single voice, like a baby recognizing only its mother’s voice.

One decade later, in 1962, IBM's Shoebox system was developed to understand words as well as digits. Still tender in age, Shoebox only understood 16 English words, including the ten digits. Even so, this paved the way for other labs in the U.S., England, Japan, and the Soviet Union to develop their own speech recognition hardware. In the mid-1960s, the average speech recognition system supported four vowels and nine consonants, which was an incredible feat at the time.

Speech Recognition: Adolescence

The 1970s were when speech recognition really began to take flight. Because of the small but significant advances of the ‘60s, the U.S. Department of Defense took notice and began investing in this new and exciting technology. Carnegie Mellon’s ‘Harpy’ became the most advanced voice recognition system of its time, with a vocabulary of 1,011 English words. In vocabulary terms, Harpy was essentially a three-year-old.

Harpy’s invention opened the way for more speech recognition technology, making the ‘70s the golden era for speech recognition. Threshold Technology was the first to create a commercial system, followed by Bell Labs’ invention of a machine that could understand multiple voices. Unsurprisingly, the rise of speech recognition is also reflected in major works of film and comics, through characters such as C-3PO in Star Wars and, much later, J.A.R.V.I.S. in the Iron Man franchise, whose sole role is to help humans by understanding speech and carrying out commands.

Speech Recognition: Adulthood

The real breakthrough in speech recognition came after it had already become popular through media coverage. Things began to change in the 1980s when labs adopted a different method: the ‘hidden Markov model.’ Rather than matching sounds against fixed word templates, this statistical model estimates the probability that an unknown sequence of sounds corresponds to a given word. That let vocabularies grow from a few hundred words to several thousand, with the potential to keep growing. This new approach made speech technology widely available for commercial purposes. It was also during this time that the IVR truly integrated with speech recognition, making call routing and customer service in general more efficient and cost-effective.

Nowadays, speech recognition has become so advanced that dedicated hardware has largely been replaced by fully integrated software, allowing individuals to perform complex computer tasks by simply speaking commands, and companies to serve a larger audience without spending a dime more than they need to.

