Cornell Software Learns How to Spot Fake Online Reviews
Unless you're really gullible, you must assume that some online user reviews on sites like Amazon and Yelp are fake. But can you spot them?
Researchers at Cornell University believe they can. They've developed software that recognizes fake online hotel reviews with about 90 percent accuracy, CNET reports. Humans, by comparison, guess correctly only about half the time. (Try it yourself. The image here contains one real review and one bogus review; the answer is at the bottom.)
The Cornell crew trained the software to recognize patterns in reviews that were known fakes. The software compared 320 gushy hotel write-ups produced by Amazon Mechanical Turk workers with 320 reviews that appeared on TripAdvisor, where writers are required to have booked their trips through the site. Turns out, fake reviewers tend to use more verbs and less punctuation, and they focus more on family activities than on the actual hotels.
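To make one of those signals concrete, here is a minimal sketch of how "less punctuation" could be measured. The function name `punctuation_rate` is hypothetical and is not taken from the Cornell software; counting verbs would need a part-of-speech tagger, which this sketch leaves out.

```python
import string

def punctuation_rate(text: str) -> float:
    """Return the fraction of non-whitespace characters that are ASCII punctuation."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    punct = sum(1 for c in chars if c in string.punctuation)
    return punct / len(chars)
```

A single number like this would be only one input among many; on its own it is unlikely to separate real reviews from fakes.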
Once the software was trained, researchers tested it on the remaining reviews in their database -- 80 real, and 80 fake. Nearly nine out of 10 guesses were correct.
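The train-then-test routine described here can be sketched in a few lines of Python. This is an illustration only: the article does not say which classifier the researchers used, so a simple word-count Naive Bayes model stands in, and the names `tokenize`, `train`, `predict` and `accuracy` are all hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def train(examples):
    """Count words per label. `examples` is a list of (text, label) pairs."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    return word_counts, doc_counts

def predict(model, text):
    """Return the label with the highest Naive Bayes log-score for `text`."""
    word_counts, doc_counts = model
    vocab = set().union(*word_counts.values())
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # Log prior for the label, then add-one smoothed log likelihood per word.
        score = math.log(doc_counts[label] / total_docs)
        denom = sum(counts.values()) + len(vocab)
        for word in tokenize(text):
            score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, examples):
    """Fraction of held-out (text, label) pairs the model labels correctly."""
    correct = sum(1 for text, label in examples if predict(model, text) == label)
    return correct / len(examples)
```

With the article's numbers, `train` would get the 640 labeled reviews and `accuracy` would be computed on the other 160, which the model never saw during training. Holding those reviews back is what makes the "nine out of 10" figure meaningful.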
The research could help solve a fundamental dilemma with online user reviews: People trust and rely on what other consumers write online, even when they suspect that some of those reviews are fake. And when it comes to spotting fakes, even skeptics do no better than a coin flip (although there are methods for getting the most out of user reviews in aggregate). We need a more reliable way to sniff out fake user reviews, because scandals where perpetrators get caught are the exception, not the rule.
The Cornell researchers are working on it, moving from hotel reviews to restaurants and eventually products. If the software ever gets used in the real world, they'll also have to deal with the possibility of review spammers adjusting their prose to avoid detection, turning this whole exercise into a game of cat and mouse.
But any help from software is better than nothing. On that note, the review on the right in the above image is bogus. How'd you do?