Are We All Just Lazy?
A. Michael Noll
December 11, 2017
© 2017 AMN
How many of our new products and services are motivated mostly by our laziness? The marketing folks, of course, would claim that they are just making life easier for us.
Television sets of the past had tuners with knobs. To change a channel, we had to get up from our sofas and go to the TV set to turn the knob to a different channel. This was so much effort that we usually just left the TV set on a single channel for the entire evening. And then the TV remote was invented. Now we could relax in our sofas and simply press a button to flip from one channel to another – the height of laziness.
Today we have voice-assisted products. All we have to do is speak to them to obtain information or to turn on a lamp. No longer do we have to search the Internet by typing on a keyboard. We just speak to our computers and voice assistants.
Decades ago, AT&T was attempting to market its video teleconferencing service. But people thought it was easier to take the train than to schedule a session and walk across the street to a teleconferencing room.
It takes physical energy and effort to speak – it can be tiring. Somehow it is easier just to type or text a message. Perhaps it simply takes less physical effort and is less tiring. But if we do not have a keyboard immediately available, then speech is the way to go.
Speech Recognition Reassessed
A. Michael Noll
© 2015 AMN
My new Garmin GPS navigation unit has made me reassess my previously negative opinion of automatic speech recognition. I am now impressed. But it has taken many decades for me to change my mind.
Back in the 1960s, when I was working in speech research at Bell Labs, speech recognition was in its infancy. Not only was the performance not very good, but also the applications were challenging to identify. A keyboard and knobs were far easier to use. Speech recognition a half-century ago required the largest computers that were then available – and they did not recognize speech in real time. Today’s speech recognition is much better and produces results in real time – and on devices we have in our cars or carry in our pockets. The technology has progressed significantly.
John R. Pierce (the famed father of Telstar) had written a paper “Whither Speech Recognition?” in the Journal of the Acoustical Society of America in 1969. He predicted a dismal fate for automatic speech recognition. I followed with my own paper taking a similarly skeptical view of automatic speech production.* I believed that graphical display of information was better than machines that spoke to us. But I did acknowledge that speech recognition might help a “driver to keep eyes on the road.”
We thought that imperfect automatic speech production would be more acceptable than imperfect speech recognition. That is because we believed that humans were better at understanding automatic synthesized speech than computers were at understanding human speech.
My Garmin represents the state of the art in both automatic speech recognition and production, as it not only recognizes speech but also creates synthetic-speech directions when navigating me along a route. Neither is perfect. Some pronunciations are comically wrong – and it will not recognize the names of some restaurants. Most of the time, it is great – but at other times, it is frustrating. But speech is much better – and safer – than attempting to touch the screen to enter data while driving.
Since my Garmin GPS unit sometimes will not recognize the correct pronunciation, I have to pronounce words incorrectly but in a way that it does recognize. My Garmin is making me conform to it, and I wonder whether we will over time have people with a Garmin accent!
I am told that the speech recognition by Google and Apple is very good. These systems send the speech to a remote cloud-based computer that has considerable processing power and speech-recognition software. But when using a computer, I still find it is easier to just type my request for information. Speaking to a computer, for me, just seems like more energy and effort. But I guess that if I used a smart phone, then speaking might be easier than typing on a small screen. However, at my old age, I am not smart enough for a smart phone!
*Noll, A. Michael, “Whither Speech Production?” The Journal of the Acoustical Society of America, Vol. 47, No. 6 (Part 2), June 1970, pp. 1614-1616.