In 2012, Microsoft’s Rick Rashid blew an Asian audience away with a live translation of his speech into Mandarin. On Monday, Bing added some of that technology to Bing Voice Search, to cut down the processing response time of voice input into Windows Phone by half, while improving accuracy at the same time.
Microsoft said that it is rolling out updates to Windows Phone customers to greatly improve the accuracy of SMS messages that are transcribed using the service, as well as searches performed using voice. The accuracy of those transcriptions has been improved by 15 percent, Microsoft said, while the response time has been halved, from about a second to about half a second. The service also does a better job of cutting out ambient noise.
“Better results and better latency,” Michael Tjalve, a member of the Bing Speech team, said in a video describing the improvements. “So you get better results from the speech recognizer, and you get it faster.”
The key, Microsoft said, was its use of deep neural networks (DNNs), a technique that loosely mimics how the human brain works. Last year, Microsoft’s chief research officer, Rick Rashid, spoke in front of an audience at a computing conference in Tianjin, China. Using a more advanced version of the technology being added to Windows Phone, Rashid amazed the audience with a tonally accurate, real-time translation of his speech into Mandarin.
That research included three different technologies: machine translation, text-to-speech conversion, and automatic speech recognition. The acoustic model and decoder for Bing Voice Search for Windows Phone were used in the China demonstration, Microsoft said.
The DNN technology basically tries to model how the brain interprets speech, and puts that on top of the phone’s silicon, said Stefan Weitz, another member of the Bing Speech team.
Microsoft said it found that DNNs trained on data from one language can be used to improve recognition accuracy in another. “The burden of transcribing such voluminous files can be reduced significantly when data from one language can help improve accuracy for another,” Rob Knies, a senior writer for Microsoft Research, wrote.
Microsoft has periodically improved the speech recognition capabilities and features of Windows Phone, and recently improved the translation capabilities of the Bing Translator app for Windows 8 based on the corresponding update to Windows Phone that was released last year.