How Windows 10 dictation works

While dictation within Windows 10 is easy, editing is a pain.

Mark Hachman / IDG

Dictation within Windows has lived in the shadows for years. Finally, with Windows 10 and the Fall Creators Update (see our review!), dictating text is almost as easy as talking to Siri, Cortana, or Google.

Within Windows 10, you can turn on dictation with just a keystroke. It’s easy. I wrote this whole article with just my voice. I edited it, though, with my mouse and keyboard. It’s all part of Windows 10’s new emphasis on modality: first touch, then writing with a pen, voice control, and finally dictation.

Dictation has lived within Windows for years, though it’s been confined to the Control Panel, where users had to set up and configure dictation capabilities manually before they could actually use them in the real world. Within the Windows 10 Fall Creators Update, however, it’s been brought front and center. (Note: We used the Windows Insider builds to test, but we've confirmed that the dictation feature is present within the Windows 10 Fall Creators Update. It works in the same way.)

Dictation works in whatever text field you’d like, whether it be Word or a web app.

Launching dictation within Windows 10 is a snap. The WIN + H keyboard shortcut instantly gets it started. That brings up a small window, which is actually the handwriting panel compressed to a single line. All dictation and navigation are completed orally, although you can stop at any time. Unfortunately, if you pause your dictation to edit using your keyboard, you’re forced to re-enter the WIN + H hotkey to resume dictation. In addition, if you pause for, say, five seconds, dictation stops automatically. A small beep signals when dictation begins or ends.

Dictation is easy within Windows; editing isn’t

Windows’ inability to switch easily between typing and dictation is probably the weakest element of the whole thing, because the accuracy of Windows dictation isn’t quite enough for you to be able to type with your voice routinely. While Windows is smart enough to occasionally recognize the proper context of (big W) “Windows,” it flubs other, seemingly commonplace words. Even 90-percent accuracy means that you’ll have to correct something manually in nearly every sentence.
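To see why 90 percent isn't enough, a rough back-of-the-envelope check helps. The sketch below assumes that "90-percent accuracy" means each word is recognized correctly 90 percent of the time, independently of the others. That is our simplifying assumption, not a measured property of Windows, and the function name and sentence lengths are purely illustrative.

```python
def chance_of_error(word_accuracy: float, words: int) -> float:
    """Probability that a sentence contains at least one misrecognized word.

    Assumes every word is recognized independently with the same accuracy,
    which is a simplification for illustration only.
    """
    return 1.0 - word_accuracy ** words


# Try a few plausible sentence lengths at 90-percent per-word accuracy.
for n in (5, 10, 15, 20):
    print(f"{n:2d} words: {chance_of_error(0.90, n):.0%} chance of at least one error")
```

Under that assumption, a 15-word sentence has roughly a 79 percent chance of containing at least one mistake, which squares with the experience of fixing something in nearly every sentence.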

Granted, the quality of your microphone plays a role, as does background noise. I used a Surface Pro 4 and a quiet conference room (albeit with air conditioning) as a test environment, and the overall experience was average at best. At home, with a noisier A/C unit and background noise, the experience differed. (In our review, we talk more about how we tested, along with a sample of how Windows' dictation did.)

Don't leap to the conclusion that you'll need a headset, though, as some modern tablets and laptops contain array microphones that can detect the subtle nuances that dictation depends upon. Still, you'll want to keep the keyboard and mouse handy.

Why? Because navigation is a pain. Trying to memorize the list of Windows commands, and use them in the context of a sentence, takes some doing. Here are just a few:

  • Say “press backspace” to inject a backspace character
  • Say “clear selection” to unselect the text that has been selected
  • Say “move to the start of the word” to move the insertion point to the start of the word
  • Say “go after <word or phrase>” to move the cursor to the first character after the specified word or phrase

...and so on. Microsoft says you can say punctuation words like “comma” and they will be inserted as punctuation, but that just isn’t always the case. Specialized characters, such as ellipses and em-dashes, simply aren’t recognized. And certain commands, such as “delete that,” didn’t work reliably. In that case the only choice you have is to pull out your keyboard and start hitting backspace repeatedly.
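For readers who want to pin down what the navigation commands listed earlier are supposed to do, here is a minimal sketch of their intended effect on a text buffer. It is our own illustration, not Microsoft's implementation: the `Buffer` class and `apply_command` function are hypothetical names, and only three of the commands are modeled.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    text: str
    cursor: int  # insertion point, as an index into text


def apply_command(buf: Buffer, spoken: str) -> bool:
    """Apply one spoken command to buf; return False if it is not recognized."""
    phrase = spoken.strip().lower()

    if phrase == "press backspace":
        # Delete the character just before the insertion point, if any.
        if buf.cursor > 0:
            buf.text = buf.text[:buf.cursor - 1] + buf.text[buf.cursor:]
            buf.cursor -= 1
        return True

    if phrase == "move to the start of the word":
        # Walk left until the character before the cursor is whitespace.
        while buf.cursor > 0 and not buf.text[buf.cursor - 1].isspace():
            buf.cursor -= 1
        return True

    if phrase.startswith("go after "):
        # Place the cursor on the first character after the named phrase.
        target = phrase[len("go after "):]
        found = buf.text.lower().find(target)
        if found == -1:
            return False
        buf.cursor = found + len(target)
        return True

    return False  # anything else is treated as unrecognized in this sketch


buf = Buffer("dictation is easy", cursor=17)
apply_command(buf, "go after dictation")
print(buf.cursor)
```

Even in this toy form, every command depends on an exact phrase and on knowing where the insertion point currently sits, which hints at why speaking these commands mid-sentence takes practice.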

Unfortunately, that creates a sort of all-or-nothing scenario, where one has to either type or dictate, with no back-and-forth. Windows allows me to pull out a pen and scroll or jot a note anytime I choose. Windows needs to do the same thing with speech, enabling a user to switch on the fly.

Is Windows better than competing software packages, such as Dragon NaturallySpeaking? No, not by a long shot. Windows simply doesn’t have the accuracy that a professional package like Dragon does, though it does pretty well in a pinch. I actually expected more of Windows, as I expected the speech engine to be based upon the way in which you speak to Cortana. Instead, it appears to be built upon the traditional dictation engine that’s been in Windows for the last decade. Either the accuracy needs to improve, or some training functionality needs to be built in.

There’s one big thing going for Windows 10, though: It’s free. Honestly, when we text or type a message to a friend, we don’t expect the accuracy to be perfect. Likewise, within Windows, a little bit of inaccuracy here or there doesn’t make much of a difference. Dictating this story, though, where accuracy is critical, was a somewhat painful experience. If Microsoft believes dictation to be a productivity tool, the overall experience needs to improve.  

This story was updated on Oct. 23 to reflect that the feature was live within the Windows 10 Fall Creators Update.