Privacy researcher Paul Ohm is urging Netflix to cancel its next research contest before it results in potentially millions of dollars in damages for invading its customers' privacy.
On Monday, the company awarded $1 million to the winners of its first competition, aimed at developing technology to improve its ability to predict what movies its customers will like.
Ohm worries the information the company is about to release as test data for the second contest isn't as anonymous as Netflix may think.
According to the New York Times:
"The new contest is going to present the contestants with demographic and behavioral data, and they will be asked to model individuals’ “taste profiles,” the company said. The data set of more than 100 million entries will include information about renters’ ages, gender, ZIP codes, genre ratings and previously chosen movies."
Ohm counters that "researchers have known for more than a decade that gender plus ZIP code plus birth date uniquely identifies a significant percentage of Americans (87% according to Latanya Sweeney's famous study)."
"True, Netflix plans to release age not birth date, but simple arithmetic shows that for many people in the country, gender plus ZIP code plus age will narrow their private movie preferences down to at most a few hundred people."
Ohm said that even if it is not revealing information tied to a single person, Netflix "is revealing information tied to so few that we should consider this a privacy breach. I have no doubt that researchers will be able to use the [existing reidentification] techniques, together with databases revealing sex, zip code, and age, to tie many people directly to these supposedly anonymized new records."
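To make Ohm's reasoning concrete, here is a rough Python sketch of the two steps he describes: the "simple arithmetic" on group sizes, and the matching of anonymized records against a database that lists sex, ZIP code and age. Every number, name and record in it is made up for illustration, the function names are hypothetical, and the exact-match join is only a naive stand-in for the reidentification techniques Ohm refers to, not a reproduction of them.

```python
from collections import defaultdict

# All figures and records below are invented for illustration only.
# They are not census data and not Netflix data.

def average_group_size(zip_population, genders=2, age_values=80):
    """Average number of residents sharing one (gender, age) pair in a ZIP
    code, assuming residents are spread evenly across all pairs."""
    return zip_population / (genders * age_values)

def link_records(anonymized, public):
    """Return {record id: name} for anonymized records whose
    (sex, zip, age) matches exactly one person in the public list."""
    by_key = defaultdict(list)
    for person in public:
        by_key[(person["sex"], person["zip"], person["age"])].append(person["name"])
    matches = {}
    for record in anonymized:
        candidates = by_key.get((record["sex"], record["zip"], record["age"]), [])
        if len(candidates) == 1:  # a unique match ties the record to one person
            matches[record["id"]] = candidates[0]
    return matches

# Hypothetical public list (for example, a directory with ages).
public = [
    {"name": "Alice Example", "sex": "F", "zip": "00001", "age": 34},
    {"name": "Bob Example", "sex": "M", "zip": "00001", "age": 34},
    {"name": "Carol Example", "sex": "F", "zip": "00002", "age": 51},
    {"name": "Dana Example", "sex": "F", "zip": "00002", "age": 51},
]

# Hypothetical "anonymized" contest records.
anonymized = [
    {"id": 1, "sex": "F", "zip": "00001", "age": 34, "genres": ["drama"]},
    {"id": 2, "sex": "F", "zip": "00002", "age": 51, "genres": ["horror"]},
]

for population in (1_000, 10_000, 50_000):
    print(population, average_group_size(population))

print(link_records(anonymized, public))
```

Under the even-spread assumption, a ZIP code of 10,000 residents leaves roughly 60 people per gender-and-age combination, which is in line with Ohm's "at most a few hundred." In the made-up data, record 1 matches a single person and is tied to a name, while record 2 matches two people and stays ambiguous, though it is still narrowed to a very small group.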
Netflix could face lawsuits under the federal Video Privacy Protection Act (VPPA), or even a Federal Trade Commission investigation, Ohm predicted. He said Netflix must know its anonymous information can be reconnected to an individual or small group.
"If sued or investigated, Netflix will surely argue that its acts are immunized by the policy, because the data is disclosed 'on an anonymous basis.' While this argument might have carried the day in 2006, the argument is much weaker in 2009, now that Netflix has many reasons to know better," Ohm added.
Netflix's first contest, whose result was announced on Monday, began in 2006 and sparked research that found the supposedly anonymous data could be tied to specific people fairly easily.
My take: I don't believe Netflix is trying to create a mass violation of its customers' privacy by creating this new competition, which also awards $1 million in prize money. I feel confident Netflix is proceeding in good faith, but that may not be enough to prevent bad consequences.
In 2006, Netflix could not have known how easily its anonymized customer data could be tied back to specific individuals. In 2009, Netflix should know better and be prepared for the consequences, regardless of its good intentions.
Ohm urges the company to kill the new competition before it starts, and he makes a compelling case for doing so. His point appears valid, and Netflix should take his views seriously.