How good are you at sizing someone up? Perhaps not as good as a computer, according to researchers at the University of Cambridge and Stanford University.
By analyzing someone’s likes on Facebook, statistical modeling software could characterize a person’s basic personality with an accuracy rivaling that of a spouse or close family member, according to the researchers.
Analyzing such digital footprints could help computer programs to interact with people in more meaningful ways, according to the scientists, whose findings were published Monday in the Proceedings of the National Academy of Sciences.
Although big data systems already help to analyze human behavior, particularly around buying habits, today’s statistical modeling techniques tend to be very narrow in scope. “It is difficult to find the meaning behind these predictions,” said Michal Kosinski, a Stanford researcher and co-author of the study.
“The work we’re doing is helping to interpret those predictions. It allows you to put the meaning to the prediction,” he said.
Kosinski acknowledged that humans “are extremely good at predicting personality traits,” but computers could be even better.
The researchers’ work is a follow-up to a March 2013 study that showed personality traits can be determined with surprising accuracy by analyzing Facebook likes.
A “like” is Facebook’s jargon for showing approval for an item, such as a photo or article, posted on the social networking service.
With enough likes to analyze, computers can infer basic personality traits, the earlier study concluded. The researchers now wanted to see if computers could size people up more accurately than humans.
They sampled Facebook pages from 86,220 volunteers, many of whom also filled out a 100-question personality survey focused on five major psychological traits: openness, conscientiousness, extraversion, agreeableness, and neuroticism.
A few rounds of machine learning were used to associate the traits with additional Facebook likes. For instance, those liking “Salvador Dali” or “meditation” appeared to possess a high degree of openness.
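To illustrate the idea of associating likes with a trait, here is a minimal Python sketch. It is not the researchers' method, which the article does not detail. It uses the simplest possible stand-in: for each like, the average openness score of the people who liked it, minus the average across everyone. The volunteer data, the 1-to-5 scale, and the function name `like_associations` are all invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data (not from the study): each volunteer has a set of
# likes and a self-reported openness score on an assumed 1-to-5 scale.
volunteers = [
    ({"Salvador Dali", "meditation"}, 4.5),
    ({"Salvador Dali", "football"}, 4.0),
    ({"football", "reality TV"}, 2.5),
    ({"meditation"}, 4.0),
    ({"reality TV"}, 3.0),
]


def like_associations(data):
    """For each like, return the mean trait score of the users who
    liked it minus the mean trait score of all users."""
    overall = mean(score for _, score in data)
    scores_by_like = defaultdict(list)
    for likes, score in data:
        for like in likes:
            scores_by_like[like].append(score)
    return {like: mean(scores) - overall
            for like, scores in scores_by_like.items()}


associations = like_associations(volunteers)

# Print likes from most to least associated with the trait.
for like, delta in sorted(associations.items(), key=lambda kv: -kv[1]):
    print(f"{like:15s} {delta:+.2f}")
```

A positive value means people who liked the item scored above average on the trait in this toy sample. A real system would need far more data and a proper statistical model to separate signal from coincidence.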
To judge the effectiveness of the computer algorithms, researchers gave questionnaires to friends and relatives of some participants. The survey results and computerized assessments were then compared with the self-assessments from the subjects.
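One common way to make this kind of comparison is to measure how closely each set of judgments tracks the self-assessments, for example with a Pearson correlation. The article does not say which measure the researchers used, so the Python sketch below is an assumption, and every score in it is invented for illustration rather than taken from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical openness scores for five subjects (invented numbers).
self_scores = [4.5, 4.0, 2.5, 4.0, 3.0]      # subjects' own survey results
friend_scores = [4.0, 3.0, 3.0, 4.5, 3.5]    # a friend's questionnaire
computer_scores = [4.2, 4.1, 2.8, 3.9, 3.1]  # a model's estimate from likes

friend_accuracy = pearson(friend_scores, self_scores)
computer_accuracy = pearson(computer_scores, self_scores)

print(f"friend vs. self:   {friend_accuracy:.2f}")
print(f"computer vs. self: {computer_accuracy:.2f}")
```

Whichever judge scores closer to 1 agrees more closely with how the subjects described themselves. Note that this treats the self-assessment as the benchmark, as the article describes.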
With just 10 likes, the computer would know someone as well as a work colleague. With more than 70, it would get to the level of a friend or roommate, and with more than 300 to the level of a spouse or close relative.
The study is notable because of its large sample size, said Jennifer Golbeck, a computer scientist at the University of Maryland, College Park, and the director of the University of Maryland Human-Computer Interaction Lab. Golbeck was not involved in the study, though she is one of a growing number of researchers studying how to predict personality traits through online footprints.
“It was such a large base of users, it suggests that what they were able to find wasn’t just a fluke because they had a small amount of data,” Golbeck said.
“It may not be that the specific correlations would work on a large population. But it certainly seems true that the general methodology to make the connections between personal attributes and the way we behave online is really promising,” Golbeck said.
Facebook serves well for such analysis, but such digital footprints can be found elsewhere online too, Kosinski said. Public forums such as Twitter, statistics on the types of music or movies people stream, or even a company’s Web server logs could provide a basis for further user analysis.
Online companies already scrutinize such footprints, though the types of conclusions they draw tend to be limited, Kosinski said, adding that further data about people could identify more general traits.
For example, the publisher of a site about Salvador Dali already knows that frequent visitors are art lovers. But the additional insight that they may also be more open to new ideas than the population at large could help guide site development decisions, such as re-designing it in a more experimental style.
Of course, the use of such data can raise privacy concerns. Kosinski said Facebook could use such approaches to infer the personalities of its users, if it hasn’t already.
And while not all companies have the range of user data that Facebook has, it would be pretty easy to cobble together different sources of public and proprietary data to form the basis of a personality analysis that goes beyond the profiling that online advertisers have done for years, Golbeck said.
The work should raise awareness of the kind of insights that companies can glean about their users. “There is some really scary stuff that could happen with this too,” Golbeck said. For instance, your credit scores or insurance rates could be affected by traits that providers infer about you, rightly or wrongly.
Just as behavioral analysis could be used for unscrupulous ends, it could also be used for good, to adjust applications to better fit users’ needs.
“This potentially has promise to uncover more of those serendipitous things, where I see stuff that I would never be able to find myself, but because the system knows so much about me and has access to what a billion other people are doing, it can look for tiny little things in the data I couldn’t find myself,” Golbeck said.
The researchers saw no major barriers to scaling up their algorithms to identify personality traits for billions of users, and doing so would not require much computational heft. It could even be done in near real time, providing a personality profile in milliseconds.
“You can run predictions for very huge populations in no time whatsoever, with very little cost,” Kosinski said.
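A rough sense of why such predictions can be cheap: once a model has been fitted, scoring one person can amount to a few lookups and a sum. The Python sketch below assumes a simple linear-style model with one weight per like. The weights, the baseline, and the function name `predict_trait` are hypothetical, and the article does not describe the researchers' actual scoring step.

```python
# Hypothetical per-like weights for one trait (openness), standing in for
# whatever a fitted model would produce. These numbers are invented.
WEIGHTS = {
    "Salvador Dali": 0.65,
    "meditation": 0.65,
    "football": -0.35,
    "reality TV": -0.85,
}
BASELINE = 3.6  # hypothetical average openness score across all users


def predict_trait(likes, weights=WEIGHTS, baseline=BASELINE):
    """Estimate a trait score as the baseline plus the average weight of
    the user's known likes. Likes without a weight are ignored."""
    known = [weights[like] for like in likes if like in weights]
    if not known:
        return baseline  # nothing to go on, so fall back to the average
    return baseline + sum(known) / len(known)


print(predict_trait({"Salvador Dali", "meditation"}))
print(predict_trait({"football", "reality TV", "some unseen page"}))
```

Each prediction touches only the likes of one user, so the work grows with the number of likes per person rather than with the size of the whole population, and separate users can be scored independently.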