Bush Database Plan Raises Privacy Concerns
President George W. Bush's plan for a massive antiterrorism database center, announced in his State of the Union address last week, could be up and running within months from a technology standpoint. Harder to overcome will be the nontechnical privacy concerns, experts say.
The U.S. government could quickly put in place the first phase of a terrorist-tracking data-mining system by using commercial data-mining software already available, says Allen Shay, president and chief operating officer of NCR Government Systems' Teradata Division.
"They'll take the first, let's say, 15 or 20 databases that are most critical and put an initial system capability in place, and that can be done in a matter of a few months, rather than years," Shay says. "What the government's trying to do now is something that the commercial world was forced into years ago. It's not only doable--it's been done by commercial companies for the last ten-plus years."
Other data-mining experts recommend a system built from the ground up, which would take a year or longer. No matter what the launch date and what technologies are to be used, the Bush plan is already attracting opposition from privacy groups and could run into congressional roadblocks, even though the new proposal seems to be a less ambitious data-mining effort than
Bush, on January 28, proposed a Terrorist Threat Integration Center that would "fuse and analyze" data from several federal departments, including the new U.S. Department of Homeland Security, the U.S. Federal Bureau of Investigation, and the U.S. Central Intelligence Agency.
Minimal information on the center is available, except for an eight-paragraph fact sheet available on
"Right now, everything is under discussion," the spokesperson says.
The center's data-mining component, however, seems focused on pulling information together only from government databases. In that sense, Bush's proposal may be different from the
The Bush plan seems to be a new twist on the old bait-and-switch sales tactic, says Lee Tien, senior staff attorney with the Electronic Frontier Foundation (EFF).
"Are we seeing here a commitment by the administration to the kinds of data-mining fishing expeditions that we associate right now with Total Information Awareness, but [packaged] somewhat differently?" Tien asks. "TIA is sort of an easy target, because its announced and declared purpose is so all-encompassing ... and then you hit people with something much more limited, and they say, 'Compared to TIA, that's not so bad.'"
The amount of data mined, or where it's mined from, isn't the main concern, Tien says. The bigger issue is what's done with the results of the data mining, how people are identified as suspects, and how those singled out can dispute the results if the CIA falsely identifies them as suspected terrorists.
"How many people are going to be labeled in that 'maybe-maybe not' category, and what does that mean?" Tien asks. "Does that mean that every time they show an ID they're going to be treated a little differently?"
The EFF and other groups want congressional and public scrutiny of the Bush center. The goal of the center seems to be to help domestic intelligence gathering, or "domestic spying," says Marc Rotenberg, president of the Electronic Privacy Information Center. With the CIA potentially involved in domestic spying, some congressional oversight will be needed to protect U.S. residents' Fourth Amendment rights against unreasonable searches, he says.
The Bush plan, depending on its scope, could also run into opposition from some members of Congress. Though many members of Congress are waiting to hear more about the antiterrorism center before commenting on it, an amendment to one spending bill, passed by the Senate, would limit TIA and other government data-mining efforts to intelligence-gathering efforts outside of the United States. That amendment was not passed in the U.S. House of Representatives version of the spending bill; for it to move forward, House members would have to approve the amendment during conference committee negotiations over the differences between the two versions of the spending bill.
Senator Ron Wyden, the Oregon Democrat who sponsored the amendment, said he'd oppose the Bush center if it also focuses on U.S. citizens not suspected of being terrorists. Wyden proposed a database of known and suspected terrorists, what he calls the "Terrorist Identification and Classification system," several months ago, and he said he'd support the Bush center if that's what it does.
"A vigorous response to terrorism is necessary, but a system designed to spy on Americans in America is not," Wyden said Wednesday. "I will tell you, if someone tries to take the guts of the TIA program and simply transfer it to the new center, I will do everything I can to stop it."
While EPIC's Rotenberg says the Bush center seems more limited in scope than TIA, EFF's Tien is more concerned about the Bush plan than he is about TIA. His fear is that it could launch fairly quickly, with little debate.
Data-mining experts say the center will not be easy to create: Tying together several government databases in a data warehouse and writing algorithms to search the data will be huge tasks.
Teradata's Shay recommends a commercial, centralized enterprise data warehouse approach over a distributed database approach, but Michael Piovoso, an engineering professor at Pennsylvania State University's Great Valley graduate school, suggests that a system built from scratch could better serve the CIA's specific needs. That process would take a large team of people at least a year, Piovoso says.
"I'm sure it's doable, but it's a huge undertaking," Piovoso says. "One of the problems is, there's a lot of things you want to do with that data that people don't typically do."
Like Shay, Naren Ramakrishnan, a computer science professor at Virginia Polytechnic Institute and State University, recommends a phased-in approach where the CIA would start small and take "simple steps."
Eventually--Shay predicts in a year's time--the government would build the data-mining center to the point where it's predictive, pushing out results to analysts. The system itself would raise red flags, telling analysts, "something fishy seems to be happening here," Ramakrishnan says.
Unlike many commercial data-analysis tools, the government's antiterrorist system would need to react quickly to massive amounts of data, he notes. "You don't have time to react to data for 20 hours, you have to act on it."
The potential privacy problems don't bother Shay. The data is likely to be closely held by the CIA, he says, and the technology would simply tie together information that already exists on government databases, unlike the TIA program.
"I think [TIA] is a much more intrusive kind of issue--and it's also a very far-out-there technology, and one that I don't think is going to see the light of day anytime soon," Shay says. "I think what the president is talking about here ... will be able to be effective very quickly. I think that's what the American public wants. They want something that can be stood up quickly and can be effective in addressing the problem."
But data-mining experts Ramakrishnan and Piovoso agree that such a data-mining system could raise privacy concerns. "Certainly there is the potential for abuse," Piovoso says. "You're putting a lot of faith in government, that it's not going to abuse that power.
"It's a sad situation, in my opinion," he adds. "We're asked to give up some of our freedoms in order to gain more security--and one of the dangers of that is, you may never get [them] back again."