Carnegie Mellon researchers Alessandro Acquisti and Ralph Gross say that the Social Security numbering system, combined with the widespread use of S.S.N.s as identifying numbers, has created an “architecture of vulnerability,” an unexpected consequence of the availability of basic personal information and modern computing power. The study will be presented on July 29 at this year’s Black Hat security conference in Las Vegas.
Acquisti and Gross determined that the problem lies in how Social Security numbers are constructed. Every S.S.N. has three parts: an area number (AN), a group number (GN), and a serial number (SN). All three components can be predicted based on the probable location of your residence at the time your S.S.N. was applied for. This is possible because the sequence of ANs and GNs for each state is publicly available online, and SNs are assigned in consecutive order.
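To make the three-part structure concrete, here is a minimal Python sketch that splits a nine-digit number into its area, group, and serial fields. It assumes the standard layout of three, two, and four digits, which the article does not spell out; the names SsnParts and split_ssn are hypothetical, and the sample value is a made-up placeholder rather than anyone's real number.

```python
import re
from typing import NamedTuple


class SsnParts(NamedTuple):
    area: str    # area number (AN): first three digits (assumed layout)
    group: str   # group number (GN): middle two digits (assumed layout)
    serial: str  # serial number (SN): last four digits (assumed layout)


def split_ssn(ssn: str) -> SsnParts:
    """Split a nine-digit S.S.N. string into its AN, GN, and SN fields."""
    # Drop hyphens and whitespace so "123-45-6789" and "123456789" both work.
    digits = re.sub(r"[\s-]", "", ssn)
    if not re.fullmatch(r"[0-9]{9}", digits):
        raise ValueError("expected nine digits, optionally separated by hyphens")
    return SsnParts(area=digits[:3], group=digits[3:5], serial=digits[5:])


# Made-up placeholder value, used only to show the field boundaries.
parts = split_ssn("123-45-6789")
print(parts.area, parts.group, parts.serial)
```

The sketch only illustrates the format. It does nothing to predict a number, which according to the researchers also depends on the publicly available AN and GN sequences and on birth data.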
The researchers tested their theory of guessing S.S.N.s against the Social Security Administration’s Death Master File. The DMF is a publicly available database that lists the S.S.N.s of people who have died.
The overall success rate for predicting S.S.N.s was relatively low: for people born nationwide before 1989, the researchers correctly guessed numbers in fewer than a hundred tries only 0.08 percent of the time.
The simplest numbers to predict, however, were those assigned in smaller states and to people born after 1988. The reason is that, as of 1989, Social Security numbers were assigned under the Enumeration at Birth initiative, through which people received their Social Security number at birth. The EAB dramatically increased the chance of identifying an S.S.N., since a person’s birthplace and location at the time the S.S.N. was applied for were guaranteed to be identical. In addition, a smaller state population reduces the number of S.S.N.s available, making a correct guess more probable.
One striking finding, for instance, was that the Carnegie Mellon researchers were able to identify one out of 20 complete S.S.N.s in fewer than ten attempts for people born in Delaware in 1996. The researchers also found that, for individuals born between 1989 and 2003, they could correctly identify the first five digits of an S.S.N. in a single try 44 percent of the time.
Despite their results, Acquisti and Gross caution that their method of harvesting S.S.N.s could be imitated only by sophisticated hackers. In one such scenario, the researchers discuss how criminals with the right algorithm to guess S.S.N.s for males born in West Virginia in 1991, and a rented botnet containing at least 10,000 IP addresses (zombie computers), could successfully obtain the S.S.N.s of as many as 47 people per minute. The circumstances would have to be ideal and would depend on a wide range of variables put forward by Acquisti and Gross, but the research does suggest that large-scale identity harvesting would be possible with just two pieces of basic personal information.
So what is the answer now that the S.S.N. flaw has been demonstrated? Acquisti and Gross argue that the tradition of using your S.S.N. as a personally identifying number for private transactions, such as opening a bank account or signing up with a cell phone provider, should be replaced by a more secure system of identification.
Using the S.S.N. as a means of personal identification is a practice the Social Security Administration has warned against for years. However, SSA representative Mark Lassiter told The New York Times that the Carnegie Mellon research is not a cause for alarm. Lassiter said it would be a “dramatic exaggeration” to suggest the researchers have “cracked a code” for discovering S.S.N.s. Lassiter also said the SSA will begin assigning numbers using a randomization system next year.