What the Web Knows About You
She had me at hello ... or just about. Our conversation had barely started when privacy activist Betty Ostergren interrupted me to say that she had found my full name, address, Social Security number and a digital image of my signature on the Web.
I had set out to discover just how much information I could find about myself online, and Ostergren, who runs the Virginia Watchdog Web site, was my very first call. If this was what could be uncovered in just a few minutes, what else would I find? Quite a bit, as it turns out.
What information is available about you in cyberspace? Where does it come from? What risks does it present and what, if anything, can you do to protect yourself? To answer those questions I decided to use my own identity, Robert L. Mitchell, a national correspondent at Computerworld, as my research subject.
Starting with the information Ostergren had turned up about me, I spent a few weeks combing through more than two dozen public and private resources on the Web and visiting many other Web sites to build a dossier on myself. I conducted both free and paid searches. I contacted a private investigator for tips on my investigation. And I spoke with data aggregators and privacy experts.
I quickly discovered that while the quantity of publicly available information about individuals to be found online is vast, it is riddled with inaccuracies. For example, I changed my primary residence more than a year ago, but many databases online still have my old address. In other cases, the information is just plain wrong.
Having a common name like Robert Mitchell -- or a famous one like Bill Gates -- makes the job a lot harder. While nuggets of information about you can be pulled up quickly, filtering out all of the data that is not actually about you and sorting out what is accurate is time-consuming. It requires a lot of digging.
But I was starting with a key piece of data -- my Social Security number -- and that makes finding relevant data a bit easier. As I gathered more data, I also reran many searches to get different -- and more targeted -- results. Here's what I found and where I found it.
Source: Government records
Information discovered: Full legal name, address, Social Security number, spouse's name and Social Security number, price paid for home, mortgage documents, signature
Much of the publicly available information on individuals online is sourced from online county, state and federal government records databases, and this is where Ostergren found my Social Security number. She hadn't purchased it from a hacker chat room or from shady characters in Russia. She got it by browsing an image of a mortgage document stored in a county database located in a building half a mile from my house.
Over the past five years, bulk scanning and online publishing of such documents have proliferated in many states. In many cases, including New Hampshire -- my state of residence -- little or no attempt has been made to redact sensitive personal data such as Social Security numbers before moving those records online. The public is blissfully unaware that these documents, which were once accessible only in dusty books inside the walls of the registry of deeds, are now freely available over the Web to anyone in the world with a click of a mouse.
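To make the missing redaction step concrete, here is a minimal sketch in Python of how text extracted from a scanned record could be screened for Social Security number patterns before publication. It assumes the document has already been converted to text; the function name `redact_ssns` and the mask format are hypothetical, and scanned images like the ones in county registries would need an OCR step first that is not shown here.

```python
import re
from typing import Tuple

# Matches the common written forms: 123-45-6789, 123 45 6789, 123456789.
# Assumption: any nine-digit run in this shape is treated as an SSN, so
# unrelated nine-digit numbers will also be masked (false positives).
SSN_PATTERN = re.compile(r"\b\d{3}[- ]?\d{2}[- ]?\d{4}\b")


def redact_ssns(text: str, mask: str = "XXX-XX-XXXX") -> Tuple[str, int]:
    """Return the text with SSN-shaped strings masked, plus how many were masked."""
    return SSN_PATTERN.subn(mask, text)


if __name__ == "__main__":
    sample = "Borrower SSN: 123-45-6789. Co-borrower SSN: 987654321."
    cleaned, count = redact_ssns(sample)
    print(cleaned)
    print(count, "value(s) masked")
```

A pattern match like this is only a first pass. A registry would still need human review for handwritten numbers and for identifiers that do not follow the usual layout.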
Ostergren says that this information is a treasure trove for data aggregators, brokers and criminals. Unlike financial and medical records, which are regulated, Social Security numbers gathered from public records come with no strings attached. They can be republished anywhere with impunity. "You're in a state that is spoon-feeding Social Security numbers to everybody," Ostergren says.
In the county where I live, legal documents from 1975 and on have been scanned and placed for public viewing on the Web. No registration or payment is required to view those records, although there is a charge to print official copies. The database includes thousands of records on New Hampshire citizens, including tax liens, federal liens, divorce papers, financing statements, military discharge papers, death certificates -- even a mobile home warranty. Any legal document filed with the registry is fair game.
In these records I found names, addresses, Social Security numbers, dates of birth, signatures, children's names, educational backgrounds, blood types, work histories and other personal data. Newer mortgage documents no longer contain Social Security numbers (mine was from 2001), but many other documents still do -- including death certificates and tax liens. In my case, fortunately, just one document on file -- the old mortgage -- contained my Social Security number.
Revelations from the rest of my government database searches were less sensational. State and county court documents are public records. In many states, those records are already online and available for public viewing on the Web. New Hampshire's county court records have not been put online, but the state has plans to do so, according to a county official.
Lauren Noether, bureau chief for consumer protection and antitrust at the New Hampshire Department of Justice, says it's just a matter of time before those records are available online. But she is concerned because standards for what information appears in legal documents have changed over time.
"I had an individual call to tell me that their child's name was in [an old] child abuse indictment. Nowadays we don't do that," she says. Noether amended the document, but she worries that bulk scanning and publishing of all historical records would bring many other inappropriate disclosures into public view.
Like many states, New Hampshire has a child sex offender registry. I am not a sex offender, but for the purposes of this story (I am the subject of the investigation, after all) I ran my name through anyway. As expected, I wasn't on the list, but it was chilling to find three other Mitchells listed there.
My next stop was the federal Public Access to Court Electronic Records (PACER) database, which contains U.S. District, Appellate and Bankruptcy court records. Here the government wants to know who is searching. The registration process for users involves entering your Social Security number, date of birth and other data.
I found myself trolling through dozens of records of people who were not me, at a cost of $.08 per page of results. I pulled up a total of 119 records, including 51 Robert L. Mitchell bankruptcies.
Another Robert L. Mitchell had been arrested for kidnapping. But nothing matched the Robert L. Mitchell I was researching.
The PACER system required that I conduct a separate search for each jurisdiction. CriminalSearches.com is a commercial site that aggregates the same information so that you can do a single search across all jurisdictions -- and it's free. I executed a free search on the Web site. Apparently, I have a clean record in all 50 states.
I also searched state and county databases for the state in which I reside. Database aggregators such as LexisNexis pull information from all of the various local, state and federal databases and roll them up for easier searching, but you need to buy a subscription to use such services.
Computerworld has a LexisNexis subscription, but that costs money. While I did fork over $.08 a page for PACER results, that amounted to less than a dollar. At this point in my investigation, I wanted to see how much I could find for nothing -- or next to nothing -- before resorting to fee-based services.
Source: Free people searches
Information discovered: Employer name, job title, age, month and date of birth, phone numbers, wife's name and age, historical addresses and phone numbers, personal e-mail address, identifying photographs, employment history
I continued my investigation with the people and business search Web sites, including ZabaSearch, WhitePages.com, PeopleFinders.com, US Search, Intelius, Switchboard and PublicInfoGuide.com. The initial searches were free, although each service charged a premium for some of the data it uncovered. As I found out, you get what you pay for.
I gathered plenty of data on Robert L. Mitchells, but most of the data wasn't relevant to the Robert L. Mitchell I was investigating. Each search yielded multiple results, including some records with outdated information about me and others with totally inaccurate data. In some cases, aggregated data clearly had been mismatched, which appeared to be the result of mashing together two different Robert Mitchells into one identity.
ZabaSearch pulled up only an e-mail address I don't use and another that no longer exists, but it did find my mailing address, which it displayed on a satellite map. WhitePages.com had my name and phone number associated with a wrong address. Switchboard incorrectly described my home telephone number as unlisted. PublicInfoGuide.com found a residential address but listed four "relatives" that I never knew I had. PeopleFinders returned an address and phone number in another state where I had lived 20 years ago.
In some cases, part of the search results, such as the full address or e-mail address, was deliberately omitted. PeopleFinders located a Robert L. Mitchell in the correct town but wanted $1.95 for the full address. As upcharges go, that was cheap: US Search wanted $10 to divulge the full address. I found it unnecessary to pay for these results, since different sites tended to provide different information upfront -- I could piece together all the bits of free information from various sites.
My Computerworld affiliation didn't turn up initially, nor did my business phone lines or my cell phone number. A search at ZoomInfo produced my correct title and Computerworld affiliation, but the work history was a comedy of errors, including incorrect titles and a stint as a PC World contributor that I must have forgotten. Under "Education," the results simply said "MSN dial-up."
Source: Search engines
Information discovered: Age, phone numbers, Computerworld affiliation, Computerworld stories, blog posts, identifying photos, social network and nonprofit affiliations, editorial award
I continued my research with the commercial search engines, including Google, Yahoo Search, Microsoft's Live Search, Dogpile and Vivisimo's Clusty. I used combinations of my name, job title, business name and location, and I concerned myself with only the first few pages of results.
As I encountered new information, I added it to my search criteria and ran searches again and again. The search engines divulged my age, phone numbers, my identities on three social networking sites and dates when I had signed up, my positions with two nonprofit organizations, links to Computerworld stories, blog links, a few snarky remarks about my stories and an announcement that a Computerworld story I wrote won an ASBPE award in 2007.
For good measure, I also searched the Techmeme, Technorati and Computerworld sites directly, assembling a long list of stories I had authored, as well as comments about those stories and contact information.
Source: Image search
Information discovered: Computerworld publicity photos, Flickr photos
Here I stuck with Google Image Search and Flickr. The 429 Google image results included dozens of Robert L. Mitchell photos, but the correct one was buried five screens down in the results. Also displayed were photos of people whom I have interviewed for Computerworld stories.
Flickr searches on variations of my name produced no photos of me, but I was able to find my account by searching members with the name "Robert Mitchell." On the third screen, my photo appeared next to an account name. By matching that photo with the Computerworld publicity photo, I was able to identify myself.
From there, I was able to view several hundred publicly shared photos associated with that account. But like much of the content on Flickr, those images are untagged. Finding photos of me in the long list was a painstaking process.
Source: Social network search engines
Information discovered: Computerworld stories, blog posts, social network friends and co-workers
With iSearch, users can search for social network content by name or by screen name. A name search on "Robert L. Mitchell" produced the same people search results I had seen before, and searches on all my screen names produced no results. A spokesperson said that Intelius, which launched iSearch last September, was still building up the database for the service.
Delver, another social network search engine, indexes content and ranks its relevance based on what your social network of "friends" has to say about it. It indexes content from MySpace, Blogger, LinkedIn, YouTube, Hi5, FriendFeed, Digg and Delicious, as well as profile data from Facebook. A search on "Robert L. Mitchell" brought up 47,755 Web links. I found no personally identifying information but did find links to stories I have written.
I concluded by searching individual social networking sites. I didn't get much here, but private investigator Steve Rambam, who runs the Pallorium investigative agency in Brooklyn, N.Y., says the amount of self-contributed data available on many individuals is enormous.
"If you have a MySpace page, and Friendster, LinkedIn, Plaxo, Yahoo 360 and Monster.com, and you use Twitter and Flickr, in 90 seconds I'll have your photo, your likes and dislikes, where you live, what you do and so on -- all contributed by you," says Rambam. That search, he says, provides as much information as he used to gather during a 12-month investigation in pre-Web days.
If that sounds scary, the technology also has its limits. "You have the best defense against a casual investigation: a common name," says Rambam. To find people like me on social networking sites requires logging onto each one individually and using advanced search features to try to narrow down the field.
"Even then there are dozens of records that would have to be manually examined," Rambam says. But that just slows him down. "It would probably take a full day to compile a decent dossier on you," he says, while a unique name takes just a few minutes.
Source: Paid searches
Information discovered: Address history to 1985; real estate purchase dates, assessed values and mortgagors; 2004 property tax bill; nonprofit affiliations; Flickr account details; published stories; parents' names, address, phone number and first five digits of Social Security numbers; current and past neighbors' names, addresses, phone numbers, dates of birth and first six digits of Social Security numbers
At this point, I decided to invest a little money to see what premium searches would buy me.
Since no one had come up with my cell phone number, I decided to start small, with a US Search reverse phone lookup -- which means you provide the number and the company traces its owner. US Search indicated that the information was available on my number -- for a fee of $14.95.
I pulled out my credit card and purchased the report. US Search could not find any data initially. The next day it sent an e-mail that attributed the phone to "Josh (last name unavailable)." Address information was limited to a town name, which was incorrect. US Search refunded my money.
I tried other sites, also without success. One possible reason why: I never provide my cell phone number online or use it for business transactions.
Things did not go so well with USATrace.com, which claimed to offer an "SSN Search" background report on any Social Security number for $37.99. I had picked the company at random from a long list of businesses that came up after I ran a Google search on "Social Security number trace."
The company processed my transaction, but I received no report. Over the next few days, several phone calls and e-mails went unanswered. I ended up challenging the charge on my credit card bill -- a process that eventually resulted in a refund from American Express. Caveat emptor.
I then approached Intelius, a bigger name that also provides data to business partners such as ZabaSearch. Intelius waived its $49.95 background search charge for the purpose of this story. I requested a few extra bells and whistles, which would have brought the total cost to $77.
Among other things, the report included searches of criminal records, civil judgments, sex offender records, address history, real estate property records and death certificates. Intelius gets its information from public records, marketing databases and information that is scraped off the Web, says Ed Petersen, co-founder and executive vice president at Intelius. Much of the information is purchased from other data providers.
Inaccuracies in the data and the abundance of data on people who were not me made combing through the 67 pages of results a bit of a chore. After removing the irrelevant content, I was disappointed to find that the report contained just one piece of data that I had not found through my previous, free searches: a June 2004 property tax bill in the amount of $1,857.
Despite the fact that I'd entered my address and Social Security number, the bulk of the report consisted of state and federal criminal records of 156 Robert Mitchells from all over the country, none of whom was me. It included incorrect names of "relatives" as well as records with my correct phone number attached to the wrong address and vice versa. It did not find my primary legal residence address or phone number at all. (We moved one year ago.) The business records section of the report did not turn up my position at Computerworld or my business phone number.
Intelius did aggregate a lot of data about me that I had already discovered, and might have saved some research time. However, I would still have had to do additional work to resolve the inconsistencies and other errors.
Next I tried a service called ReputationDefender, which tracks both what is being said about you (the MyReputation service; $9.95 per month) and personal information available about you on the Web (MyPrivacy; $4.95 per month). After a few days, the service uncovered my residential phone numbers, information about my work with a nonprofit organization, details of my Flickr account and a couple of Web sites I set up.
Finally, I tried searching public records through LexisNexis. Computerworld's subscription includes a search function that combines data from public records databases ranging from motor vehicle records to court documents to hunting and fishing licenses. While much of the information LexisNexis returned was the same as what I'd found previously, it produced more information overall, and data accuracy was somewhat better.
I came away with a listing of past and present neighbors' addresses, phone numbers and partial Social Security numbers and a historical list of my real estate property transactions that included the amount paid, date of purchase and mortgage lender name. I found the assessed value for my residence for the year 1997. Also available: my mother's and father's names, ages, address, phone number and partial Social Security numbers.
While LexisNexis allows voter registration list searches, no information appeared for my name in New Hampshire. Voter registration lists have been consolidated into a central database to meet federal requirements. Currently, that database is exempted from New Hampshire's Right-to-Know Law, but legislators have given the Democratic and Republican parties exclusive access to it, says New Hampshire State Representative and privacy advocate Neal Kurk, a Republican.
"The parties take this information and sell it to candidates, and you can be sure that a disc containing all of this information goes to various marketers or charities or whoever," he says. So far, though, it wasn't accessible to me.
I also could have searched for other, more sensitive data, such as driver's license and motor vehicle registrations, on LexisNexis. Access to that data is controlled by government regulations, but to see it I simply had to pick a "permissible" use (litigation, debt recovery, insurer, etc.) from a drop-down list. While LexisNexis' terms and conditions do state that it keeps track of who has accessed regulated data, as far as I could tell, anyone can conduct a search without any verification of a permissible use claim.
What else is out there?
Did I find everything that was out there? Private investigator Rambam says the information I gathered in a few days of work was just the tip of the iceberg of what is available about individuals online. Rambam runs PallTech, an investigative database service for law enforcement and security professionals. Its 25 billion records on individuals and businesses include aggregated public records, telephone listings, marketing data, and more sensitive, regulated data such as vehicle registrations.
A single query performs 62 different searches and produces an average of 230 pages of results in 90 seconds, Rambam says. He quickly found my Social Security number, driver's license number, vehicle registrations, date of birth, e-mail address and other information.
PallTech's database isn't open to the public, but Rambam says much of the same information is out there for anyone who's determined to find it. For example, I didn't find my medical records or banking records online; both types of information are regulated. But, says Rambam, "Any competent social engineer can get that information. There's just too many places where it's available."
For instance, Rambam says he once tracked down a subject by calling pharmacies near the person's address, posing as the subject and asking if his prescription was ready. He quickly learned both the name of the prescription and the doctor who prescribed it. By calling the doctor's office, he was then able to get the time and date of the subject's next appointment. While all this is illegal (he did it with the subject's permission, as part of a friendly bet) and he says most professional investigators don't do that today, he's certain that scammers use the technique.
I also didn't find my state of birth or mother's maiden name online, but Rambam says that I could have found the information with a little more work. (For example, I didn't think to look on genealogy Web sites.) "The downside to all of this publicly available information is that it's now a lot easier to social engineer somebody," he says. If someone has access to a profile of personal information about you as well as your network of friends, that makes it easier for someone to pose as you to gain access to more sensitive data.
And much more personal information is tucked away in marketing databases, says Rambam. Data aggregators such as ChoicePoint and Acxiom, he says, maintain giant databases of information about individuals for risk management and marketing purposes.
To find out more, I spoke with Jennifer Barrett, global privacy officer at Acxiom, a large data aggregator and marketing services provider in Little Rock, Ark. Acxiom specializes in helping businesses build complete demographic profiles of their customers. It builds large, proprietary data warehouses that match up the client's marketing data on its customers (what they bought) with "intelligence" on those customers (who they are) that includes demographic data, interests, what types of products the subjects like to buy and so on. (For details, see "How much do marketers really know about you?")
Acxiom and some other data aggregators do allow consumers to request, for a fee, a report summarizing the basic identifying and background screening information that the company has about them in its databases. (Acxiom does not release this information without a signed form and a personal check for $5 with name and address information printed on it that matches the name and address of the subject of the request.) I wanted to find out what details Acxiom had on me, so I made the request (the company waived the fee for the purposes of this story); however, the report I received did not include the full search results.
Interestingly, Barrett cites privacy as the reason Acxiom didn't reveal more of the data it owns about me. Search results often return information on other people who are linked to the subject's data in some way, such as through a common address or phone number. "It divulges details on other individuals and would invade their privacy," she says. But Acxiom does allow consumers to opt out of its marketing databases.
Assessing the risks
Perhaps the biggest risk that accompanies the proliferation of personal information on the Web is the increased danger that the information will be used for identity fraud. Although overall identity fraud has trended down somewhat, 8.4 million people were victims of identity fraud last year, according to Javelin Strategy & Research, which publishes an annual survey report on the subject.
Of the information available about me on the Internet, the most troubling was my Social Security number, blatantly posted online by my own county government, for the convenience of lawyers, insurance agents -- and petty criminals interested in identity theft. Today, you need more than just a Social Security number to commit identity fraud, but a criminal who has that number is off to a great start.
"Various arrest records released by law enforcement have included criminals' confessions of using bulk scans of both paper and electronic records access," says Javelin president James Van Dyke.
While I was able to have my Social Security number redacted from the county Web site record by filling out a form with the Registry of Deeds, there's no telling if that information was already scraped by thieves. (On the plus side, the information from the county database didn't show up on Google or other search sites, probably because it resides in a database and must be queried rather than appearing on a Web page that is easily indexed by Web crawlers.)
Identity thieves can also cobble together Social Security numbers from different sources that publish different parts of the Social Security number as an identifier. For example, subscribers to LexisNexis can find the first five digits of a subject's nine-digit Social Security number, while Acxiom provides the last four digits in its reports (although that's harder to obtain, since Acxiom screens its customers). Federal tax liens use the full Social Security number, and state tax liens use the last four, says Ostergren. Both are publicly available on paper records, and in many cases the data is being republished on the Web.
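The risk of mismatched truncation schemes can be checked mechanically. The following is a minimal sketch in Python that takes the masked forms different sources publish and reports whether, taken together, they reveal every digit position. The function names and the `X` mask character are hypothetical, and the sample values are placeholder zeros rather than real data.

```python
from typing import Iterable, Set


def exposed_positions(masked_values: Iterable[str], mask_char: str = "X") -> Set[int]:
    """Collect the digit positions (0-8) that at least one source leaves unmasked."""
    exposed: Set[int] = set()
    for value in masked_values:
        digits_only = value.replace("-", "")  # ignore separator dashes
        for index, char in enumerate(digits_only):
            if char != mask_char:
                exposed.add(index)
    return exposed


def fully_exposed(masked_values: Iterable[str], length: int = 9) -> bool:
    """True when the sources, combined, leave no digit position hidden."""
    return exposed_positions(masked_values) == set(range(length))


if __name__ == "__main__":
    first_five_only = "000-00-XXXX"  # a source that shows the first five digits
    last_four_only = "XXX-XX-0000"   # a source that shows the last four digits
    print(fully_exposed([first_five_only]))                  # one source alone
    print(fully_exposed([first_five_only, last_four_only]))  # both sources together
```

The point of the check is that each publisher's masking can look safe in isolation while the combination hides nothing, which is why a single shared truncation convention matters.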
Once a thief has the number, it can be used to unlock more data about you that can be used for identity theft.
The sheer breadth of information available about individuals online is also a concern. According to Rambam, having access to that much information makes it easier for criminals to obtain other identity authentication factors such as a mother's maiden name.
But others say that even having one or two authentication factors for an individual is no longer a guarantee of success in identity theft. Improved processes and consumer awareness are key reasons why new account fraud has remained flat in the past year, according to Javelin, and faster detection has caused account fraud losses to decrease by 21% from 2007 to 2008.
Barrett says that the number of authentication factors required is on the increase, and varies with the risk involved. Accessing an online subscription to the Wall Street Journal would require fewer authentication factors than would accessing a bank account. In fact, most financial institutions now require multiple authentication factors to open an account -- or even to process an address change. "If there's a high degree of risk it can be seven or eight or nine factors. If it's not it might be three or four. But it's not one or two."
As a test, I called my business credit card company and my bank. The credit card vendor asked for my account number and mother's birth date to access my account. To change my address, I also needed to provide my full name and the credit card's four-digit security code. That's four factors.
When I called my local bank with the same request, the representative asked for my name, middle initial, city of birth and mother's maiden name. (According to a security executive from the bank, representatives may also ask the branch location where you opened the account and how long you've had the account.) The representative did not ask for my account number, and she divulged my current address during our conversation.
But are four authentication factors today really more secure than two were 10 years ago? Four may be the new two. Because so much data about me is readily available online, I had found two of the four factors needed to change the billing address for my credit card right out of the gate. But I still needed the physical card to determine the card number and security code.
More worrying was the fact that I had tracked down three of the four authentication factors needed to change my address with the bank (which is now reviewing its policies).
While both institutions require four authentication factors, the fact that the answers to some of those "authentication" questions about me are readily available online diminishes their value. In this case, an identity thief is two authentication factors away from cracking my credit card account and just one away from messing with my bank account data.
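The arithmetic behind that comparison is a simple set difference, sketched below in Python. The factor lists follow the two phone calls described above. Which specific factors count as "found online" is an assumption made for illustration, since the article gives only the totals (two of four for the card, three of four for the bank), and the function name `factors_still_needed` is hypothetical.

```python
from typing import Iterable, Set


def factors_still_needed(required: Iterable[str], found_online: Iterable[str]) -> Set[str]:
    """Return the authentication factors an impostor would still have to obtain."""
    return set(required) - set(found_online)


# Factors each institution asked for, per the calls described in the article.
CARD_REQUIRED = {"account number", "mother's birth date", "full name", "security code"}
BANK_REQUIRED = {"name", "middle initial", "city of birth", "mother's maiden name"}

# Assumption: illustrative guesses at which factors turned up online.
CARD_FOUND = {"full name", "mother's birth date"}
BANK_FOUND = {"name", "middle initial", "city of birth"}

if __name__ == "__main__":
    print(sorted(factors_still_needed(CARD_REQUIRED, CARD_FOUND)))
    print(sorted(factors_still_needed(BANK_REQUIRED, BANK_FOUND)))
```

Counting the remaining factors, rather than the required ones, gives a better sense of how much protection a set of questions actually provides.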
The banks might do well to increase the number of authentication factors in use -- even though it presents an inconvenience to customers. The challenge will be figuring out what questions to ask in a world where almost everything there is to know about you is publicly available online.
Privacy may be dead, as Rambam likes to say, but individuals can play a role in reducing their information footprint and shaping the information that is available about them. Keep reading our special report for steps you can take to control data about you.