Why isn't Microsoft's answer to Siri built into Windows 8?
Windows 8 is supposed to be Microsoft's majestic OS reset—a dramatic overhaul designed to usher the Windows platform into the age of mobility. And Windows 8 is also Microsoft's bid to achieve feature parity with iOS and Android, the other two OS powerhouses in the mobile universe.
But one key feature—one hot, relevant, rock-star-caliber feature—is conspicuously absent from the Windows 8 repertoire: Intelligent, semantically aware voice control is nowhere to be found in the new OS.
iPads and iPhones have a voice dictation button built right into their virtual keyboards. And Google integrated its own set of deep voice control features into the Jelly Bean version of Android that was released earlier this year. So how come voice control isn't a forward-facing, marquee feature of Windows 8?
The short answer is that voice-control technology hasn't made it to laptops or desktops in a meaningful way for either PCs or Macs, and Windows 8, at least for the short run, is much more of a computer OS than a tablet OS.
In Windows 8 (as in Windows 7 and Vista), speech recognition remains relegated to the role of an “assistive technology” designed to help disabled customers use their PCs. The Windows Speech Recognition (WSR) feature in Vista and Windows 7 allowed users to control a few minor OS behaviors with their own voices, and users could also dictate text, all with varying degrees of success.
Relative to Windows 7, Windows 8 offers incremental accessibility improvements, but also demonstrates that there's no real desire on Microsoft’s part to make voice control a major feature of the OS. Windows 8 can recognize your voice if you're using a microphone and can carry out some simple commands, but it doesn't offer anything approaching the voice-controlled "personal assistant" experience that we find in Apple's Siri.
A missed opportunity
Microsoft didn’t always show so little interest in voice control. The software giant introduced Windows Speech Recognition (WSR) in Windows Vista, and at the time seemed very interested in putting all Windows users on speaking terms with their computers. The company also demonstrated a feature called “Windows Speech Recognition Macros,” which enabled the OS to perform certain repetitive tasks in response to a voice command. Unfortunately, the feature required users to write their own macros (for example, one for “open file”), and, as a result, WSR was mostly used by advanced users.
Microsoft bought the “voice portal” company TellMe in 2007, and appeared poised to use the voice recognition technology it received in the deal to put voice command into Windows. But it was not to be. The TellMe technology ended up being used mainly for voice commands in Windows Phone 7 and 8.
For many of us, the iPhone 4S’s Siri feature was our first experience with a voice-recognition system that did more than just transcribe words and open windows. Indeed, Siri is something much deeper than a voice-recognition tool. It's a “personal assistant” that understands relatively nuanced wording, and performs many of the tasks we ask of our smartphones.
Siri lets us compose and send text messages and emails using voice alone. We can use it to schedule meetings, ask for directions, set reminders, and so on. And when it comes to search, Siri uses semantic technology to understand information requests spoken in plain English, like, “What is the largest city in Texas?”
Apple and Google are already racing to perfect semantic voice control for use in mobile devices, and Microsoft could have jumped into the fray as well, reviving voice recognition as a major feature in Windows 8. In fact, Microsoft could have leapfrogged the competition by bringing semantic voice control to the desktop. This could have been the killer feature that persuaded legions of skeptical XP and Windows 7 users to make the jump to Windows 8.
Laptop and desktop PC manufacturers could have benefited greatly too. The industry is desperate to curtail sliding PC sales as more and more users show an interest in tablets. Intelligent voice recognition for laptops and desktops could have been the sticky feature that product managers crave.
Unfortunately, as it stands, PC manufacturers believe consumers primarily want voice command on their mobile devices, and are fine with manual keyboard control for their PCs.
“Most of the [voice control] R&D momentum is going to serve the mobile market—smart devices, namely phones and tablets, where there appears to be, at least in the short term, no end in demand,” says analyst Patricia Kutza of tech market research house BCC Research.
Voice for Ultrabooks
Intel, not Microsoft, may end up being the first big proponent of voice recognition in the PC industry. The chip maker has already worked with voice-recognition technology company Nuance to develop a voice recognition app for Ultrabooks called “Dragon Assistant.” Dragon Assistant runs natively on the computer, and can interact with third-party apps to do things like find and play music, compose emails, surf the web, watch video and use social media, among other Siri-like talents.
Nuance is currently the leading developer in the voice-recognition market. And it's an open secret that Nuance developed large parts of Siri (Apple has confirmed only that Nuance is a technology partner). The company also developed the voice-recognition system in Ford’s Sync in-car systems.
Nuance came into the voice control business by making Dragon NaturallySpeaking, the best-selling desktop dictation application on the market. NaturallySpeaking also provides detailed web browsing for disabled people via voice commands. Nuance has since expanded the functionality of the product to allow users to do more things on the PC using voice.
The company says it has a strong interest in bringing a Siri-like experience to the laptop and desktop. “We believe there's a blurring of lines between form factors,” says Nuance VP and general manager of Dragon devices Matt Revis. “The mobile handset has driven a desire for speech as an interface in all form factors, including desktops and laptops.”
Revis says the absence of voice-based personal assistant functionality in Windows 8 has left the door open for third parties like his company to step in and provide a solution. Still, he acknowledges that direct OS integration has its benefits:
“There could be advantages to having the personal assistant functionality built into the OS, around things like command and control,” Revis says. “This could mean commands like 'brighten the screen,' or 'go to sleep.'”
But Revis stresses that Dragon Assistant performs 80 percent of the tasks people do on their machines most often. And this includes interacting with other third-party apps for things like playing music using a music app.
If Intel and Nuance find success in building voice recognition into Intel’s Ultrabook platforms, Microsoft may be pressured into building voice command into its OS in future iterations.
The developer community may play a role, too. Says BCC Research’s Kutza: “It's possible Microsoft might be using a 'wait and see' approach, evaluating the feedback it gets from developers before integrating this functionality into Windows 8.”