Study: Social Security Numbers Are Predictable
Social Security numbers may not be as random as believed: a new study contends that powerful mathematical techniques combined with open-source research can, in some cases, reveal a person's secret number.
The study, published on Monday in the journal Proceedings of the National Academy of Sciences, serves as a stark warning that SSNs are increasingly vulnerable, putting more people at risk of identity theft.
"Unless mitigating strategies are implemented, the predictability of SSNs exposes them to risks of identity theft on mass scales," the study said.
The study comes from Carnegie Mellon University's Alessandro Acquisti, an assistant professor of information technology and public policy, and Ralph Gross, a postdoctoral researcher.
Gross and Acquisti developed an algorithm that analyzed data from the Social Security Administration's Death Master File, a public database of some 65 million Americans who have died and their SSNs, which is used for antifraud purposes.
They looked for numerical patterns in the deceased's SSNs, correlating each person's place and date of birth with their SSN.
"Our prediction algorithm exploits the observation that individuals with close birth dates and identical state of SSN assignment are likely to share similar SSNs," they wrote.
The first three digits of an SSN are an area number, which is based on the ZIP code of the mailing address provided when a card was applied for. The next two digits are a group number, which is assigned in a "precise but nonconsecutive order between one and 99." The last four digits are a serial number.
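To make that three-part layout concrete, here is a minimal sketch in Python that splits a nine-digit number into its area, group and serial fields. The names `SSNParts` and `split_ssn` are hypothetical, chosen only for illustration, and the sample number is a made-up placeholder; the sketch checks only the length and digit format, not whether a number was ever issued.

```python
from typing import NamedTuple


class SSNParts(NamedTuple):
    """The three fields described above (hypothetical helper type)."""
    area: str    # first three digits
    group: str   # next two digits
    serial: str  # last four digits


def split_ssn(ssn: str) -> SSNParts:
    """Split an SSN written as AAA-GG-SSSS or AAAGGSSSS into its fields."""
    digits = ssn.replace("-", "")
    # Format check only: nine ASCII digits. This says nothing about validity.
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected nine digits, optionally written as AAA-GG-SSSS")
    return SSNParts(area=digits[:3], group=digits[3:5], serial=digits[5:])


# Placeholder number used purely to show the layout.
parts = split_ssn("123-45-6789")
print(parts.area, parts.group, parts.serial)
```

The fields are kept as strings rather than integers so that leading zeros survive.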
The algorithm, which the authors did not detail, successfully ascertained the first five digits for 44 percent of the records in the Death Master File for people born between 1989 and 2003. The complete SSN could be picked out for 8.5 percent of those people in under 1,000 attempts. For people born between 1973 and 1988, the algorithm could predict the first five digits for 7 percent of those in the Death Master File.
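A rough back-of-the-envelope calculation shows why knowing the first five digits matters so much. The Python sketch below is not the researchers' algorithm, which was not detailed; it only compares the odds of a purely blind guesser given 1,000 attempts, first against all nine digits and then against the four-digit serial alone. The function name `blind_hit_chance` is hypothetical, and treating every digit combination as possible is a simplifying assumption.

```python
FULL_SPACE = 10 ** 9    # every nine-digit combination (simplifying assumption)
SERIAL_SPACE = 10 ** 4  # every four-digit serial, once the first five digits are known
ATTEMPTS = 1_000        # the attempt budget cited in the study


def blind_hit_chance(attempts: int, space: int) -> float:
    """Chance that `attempts` distinct blind guesses include one fixed target."""
    return min(attempts, space) / space


print("all nine digits unknown:", blind_hit_chance(ATTEMPTS, FULL_SPACE))
print("only the serial unknown:", blind_hit_chance(ATTEMPTS, SERIAL_SPACE))
```

Under these assumptions, narrowing the unknown part from nine digits to four raises a blind guesser's chance of success by a factor of 100,000.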
"SSNs were designed as identifiers at a time when personal computers and identity theft were unthinkable," the study said.
Other changes in how the Social Security Administration assigns numbers have made guessing even easier. In 1989, the agency started a program called Enumeration at Birth, assigning SSNs to newborns as part of the birth certification process.
The changes increased the correlation between a person's birth date and all nine digits of an SSN, especially for people in less populated states, making SSNs easier to discover, the researchers wrote.
Additionally, the proliferation of information on social-networking profiles, such as a person's hometown and birth date, puts people at greater risk, since that information could be used to infer SSNs.
"Such findings highlight the hidden privacy costs of widespread information dissemination and the complex interactions among multiple data sources in modern information economies," the researchers wrote.
Attackers could then take the SSNs they think are accurate and run them through credit approval services. Even though many of those services limit the number of attempts to verify data, botnets could be employed to test vast numbers of SSNs and confirm which ones are valid, the researchers wrote.