
For many years, you couldn’t control your phone with your natural voice. Then the iPhone and other smartphones running Android, Windows and other operating systems emerged over the last decade, and things changed. Some mobiles offered voice control, either built into the phone or through apps.
You could launch mobile programs like mapping with Google voice control or make a phone call, but that was about it. Your phone’s limited processing power and built-in speech recognition restricted what voice commands could do.
But speech technology has improved greatly over the years, and now, with the new iPhone 4S’s Siri assistant, based on Nuance’s voice technology, you can control most functions with your voice. In fact, to my knowledge, this version of the iPhone is the first mobile device to incorporate natural speech.
Earlier in my career, I worked in the voice technology industry. I find it amusing that now, after touch screen phone makers convinced us how easy touch screens are to use, we’re moving to voice control, something I’ve wished for in mobile phones for years. (Listen to my futuristic podcast about total voice control of mobile phones in a piece called “4G Wireless Future Mobile Phone Guy.”)
Voice Technology Development
For decades, technology companies and audio engineers tried to perfect consumer voice recognition and text-to-speech in two different ways: speaker-dependent (where you train a computer to understand your voice) and speaker-independent (no training needed).
Everyone who’s reached a call center knows that speech recognition accuracy varies considerably with your accent, background noise and other factors. Text-to-speech, when the computer talks back to you, gradually improved over the years, leading to human-sounding computer speech. (For a short history of speech interactions with computers, read this fascinating story about Dragon Systems, one of the earliest voice control software companies, now owned by Nuance.)
Most users who have tried Android phones are familiar with Google’s speech recognition, mostly for navigation, mapping, making calls and launching programs. However, Google’s smartphone voice control isn’t natural speech like Nuance’s.
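(For the technically curious: on Android, that built-in recognizer is exposed to apps through the platform’s RecognizerIntent API. Here’s a rough sketch of how an app of that era could use it; the activity name and request code are just for illustration.)

```java
// Minimal sketch: invoking Android's built-in speech recognizer
// via RecognizerIntent. Class name and request code are illustrative.
import android.app.Activity;
import android.content.Intent;
import android.speech.RecognizerIntent;
import java.util.ArrayList;

public class VoiceDemoActivity extends Activity {
    private static final int REQ_SPEECH = 1;  // arbitrary request code

    private void startListening() {
        // Opens Google's recognizer UI and listens for one utterance.
        Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Say a command...");
        startActivityForResult(intent, REQ_SPEECH);
    }

    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        super.onActivityResult(requestCode, resultCode, data);
        if (requestCode == REQ_SPEECH && resultCode == RESULT_OK) {
            // The recognizer returns a ranked list of transcription guesses.
            ArrayList<String> matches =
                    data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
            if (matches != null && !matches.isEmpty()) {
                String command = matches.get(0);  // best guess
                // The app itself must map 'command' to an action here.
            }
        }
    }
}
```

Notice that the recognizer simply hands back a ranked list of text guesses; it’s up to each app to match the top guess against the commands it knows, which is one reason these systems handled only a narrow set of tasks rather than natural conversation.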
Whether mobile consumers will use their voices to control the iPhone in public is key to the 4S’s success. Can you imagine iPhone users in a bus or train station or airport terminal talking to their phones? Not much privacy, eh? Will background noise interfere with voice recognition? Will iPhone users even hear Siri’s responses in noisy environments?
Since the iPhone and other smartphones emerged during the past 4-5 years, user behavior has changed dramatically. Phone users today send text messages and email and go online to communicate with friends and family. Fewer phone calls, more data. Enter Siri, the iPhone’s voice assistant. It sounds like a great idea, one that could increase smartphone sales to people who avoided touch screens when they first came out.
Now a different problem arises, however. We live and work in noisy environments. And even if smartphone voice control users are in quiet places, those places might not be private. Will they want passersby to know they’re dating someone new and scheduling dinner at a restaurant? What about medical appointments? Can you imagine someone saying out loud, “Yes, please make an appointment with my psychiatrist at 4 p.m.”?
Watching how consumers adopt smartphone voice control will be fascinating and illuminating, and it will determine the future of voice-controlled mobile devices.