Stanford Program Decrypts Captchas, Spam Bots Inbound
By Kevin Lee
You know those annoying boxes of jumbled, multicolored, slanted, crossed-out, or reversed text, practically indecipherable, that you have to type in to prove to a Website that you are human? They’re called captchas, and a team of Stanford University researchers may have just broken this text-based security system with a computer program that can decode them.
Researchers Elie Bursztein, Matthieu Martin, and John Mitchell developed their Decaptcha program specifically to break the computer-perplexing text system. Decaptcha uses a five-stage process to remove noise from images and pick out letters from the remaining shapes.
First, the program pre-processes the image to remove noise and obscuring lines; then the segmentation stage separates the individual shapes. Post-segmentation analyzes each of the shapes, recognition assigns a letter to each shape, and finally, the program runs a spell check in post-processing.
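To make the five stages concrete, here is a toy sketch in Python. It is not the researchers' code, which the article does not show. Every name, the 3x3 letter templates, the tiny dictionary, and the techniques (isolated-pixel removal, cutting at empty columns, pixel-by-pixel template matching, and `difflib` as the spell check) are assumptions chosen purely for illustration.

```python
from difflib import get_close_matches

# Hypothetical 3x3 letter templates ('#' = ink, '.' = background).
TEMPLATES = {
    "c": ["###", "#..", "###"],
    "o": ["###", "#.#", "###"],
    "a": [".#.", "###", "#.#"],
    "t": ["###", ".#.", ".#."],
}
WORDS = ["cat", "dog"]  # hypothetical dictionary for the spell-check stage


def preprocess(img):
    """Stage 1: drop isolated ink pixels (no ink directly above, below, or beside)."""
    h, w = len(img), len(img[0])

    def ink(r, c):
        return 0 <= r < h and 0 <= c < w and img[r][c] == "#"

    def keep(r, c):
        return ink(r, c) and (ink(r - 1, c) or ink(r + 1, c)
                              or ink(r, c - 1) or ink(r, c + 1))

    return ["".join("#" if keep(r, c) else "." for c in range(w))
            for r in range(h)]


def segment(img):
    """Stage 2: cut the image into shapes at columns that contain no ink."""
    width = len(img[0])
    shapes, start = [], None
    for c in range(width + 1):
        has_ink = c < width and any(row[c] == "#" for row in img)
        if has_ink and start is None:
            start = c
        elif not has_ink and start is not None:
            shapes.append([row[start:c] for row in img])
            start = None
    return shapes


def post_segment(shapes, min_width=2):
    """Stage 3: discard shapes too narrow to be a letter (e.g. stray strokes)."""
    return [s for s in shapes if len(s[0]) >= min_width]


def recognize(shape):
    """Stage 4: pick the template that agrees with the shape on the most pixels."""
    def score(template):
        return sum(a == b
                   for shape_row, template_row in zip(shape, template)
                   for a, b in zip(shape_row, template_row))
    return max(TEMPLATES, key=lambda ch: score(TEMPLATES[ch]))


def post_process(text):
    """Stage 5: snap the raw reading to the closest dictionary word, if any."""
    match = get_close_matches(text, WORDS, n=1, cutoff=0.6)
    return match[0] if match else text


def solve(img):
    shapes = post_segment(segment(preprocess(img)))
    return post_process("".join(recognize(s) for s in shapes))


# "cat" with three kinds of damage: an extra pixel that closes the "c" into
# an "o", one isolated speck, and a two-pixel vertical stroke.
IMAGE = [
    "###....#..#.###",
    "#.#.#.###.#..#.",
    "###...#.#....#.",
]
print(solve(IMAGE))  # prints "cat"
```

In this toy run the recognition stage alone reads "oat", because the damaged first letter matches the "o" template exactly; only the spell-check stage recovers "cat". A real solver would need far more robust versions of every stage.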
Against all typographic adversity, the Decaptcha program was able to defeat 66% of captchas on Visa’s Authorize.net payment site; 70% at Blizzard Entertainment; a quarter of the ones used by Wikipedia; along with those on a handful of other sites including CNN, eBay, Digg, and Captcha.net. The system, however, was not able to decode a single red, slanted captcha used by Google or reCaptcha.
Mitchell and Bursztein developed an earlier version of Decaptcha that could crack 50% of Microsoft’s audio captchas, a spoken version of the code meant to help the visually impaired.
Of course, the researchers have no plans to release the software, and their paper explains at length how to improve captchas. But if these researchers can crack this anti-bot system, it is entirely possible someone else less reputable will figure it out too, and before long we’ll be inundated with spam and scams once again.
What’s the most atrocious captcha you have come across? Leave a comment.