IBM Unveils Web Privacy Work
System scrambles user information, then applies algorithms to generate customer data models for merchants.
Ann Bednarz, Network World Fusion
Researchers at IBM's Privacy Institute are working on software that automatically scrambles Web visitors' personal information--so consumers perhaps won't feel compelled to lie just to protect their privacy.
It's no secret that online visitors often provide false personal data to avoid any repercussions should the data be misused or shared with multiple sources. For merchants, that means the customer data they painstakingly track with customer relationship management software--and often rely on when making product development and marketing decisions--can be flawed from the start.
To help solve this problem, researchers Dr. Rakesh Agrawal and Dr. Ramakrishnan Srikant are developing what IBM calls "privacy-preserving data mining." The duo's research, which IBM announced Thursday, relies on the notion that a Web visitor's personal data can be protected if it is scrambled, or randomized, before it gets to the merchant. Once the data is transferred to the merchant's systems, the IBM software applies algorithms to compensate for the data scrambling. With this technology, a retailer could still generate accurate data models and extract useful demographic information, but without ever seeing personal consumer data, IBM says.
"Our research institutionalizes the notion of fibbing on the Internet, and does so to preserve the overall reality behind the data," Agrawal says.
Numbers Game
When a Web user enters a piece of personal data, such as age, salary, or weight, the IBM software immediately scrambles that number by adding a random value to it or subtracting one from it. This randomization step is performed independently for every user, IBM says. This means a 30-year-old's age may be changed to 42, while a 34-year-old's age may become 28.
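The scrambling step can be sketched in a few lines of Python. This is only an illustration, not IBM's code: the function name `scramble`, the choice of a uniformly distributed whole-number offset, the offset limit of 12, and the sample ages are all assumptions made for the example.

```python
import random

def scramble(value, spread, rng):
    """Return value shifted by a random whole number in [-spread, spread].

    Assumption: offsets are uniform and may be zero; the article does not
    say which distribution IBM's software uses.
    """
    return value + rng.randint(-spread, spread)

rng = random.Random(2002)  # seeded only so the demo is repeatable
true_ages = [30, 34, 51, 27]  # made-up sample values

# Each user's value gets its own independent random offset.
scrambled_ages = [scramble(age, 12, rng) for age in true_ages]
print(scrambled_ages)
```

Because each offset is drawn separately, no single scrambled age says much about the person who entered it.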
The merchant determines the range of the randomization--plus or minus 1 to 12 years, for example--which then remains constant. Once all the scrambled data is collected for a large number of users, IBM's data-mining software estimates what the true data might have looked like and uses that reconstruction to build a data-mining model, IBM says.
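The article does not say how the reconstruction is done. One plausible approach, shown here purely as an assumption, is an iterative Bayesian update: start from a flat guess about how ages are distributed, then repeatedly re-weight that guess by which true ages could have produced each scrambled value. The function name `reconstruct`, the 18-to-80 age range, and the made-up population of 700 users aged 25 and 300 aged 60 are hypothetical.

```python
import random

def reconstruct(scrambled, spread, lo, hi, iterations=50):
    """Estimate the share of users at each true value in [lo, hi].

    Assumes uniform whole-number noise in [-spread, spread], so every
    feasible true value is equally likely to yield a given scrambled value.
    """
    size = hi - lo + 1
    est = [1.0 / size] * size  # start from a flat guess
    for _ in range(iterations):
        new = [0.0] * size
        for w in scrambled:
            # True values that could have produced this scrambled value.
            first = max(lo, w - spread)
            last = min(hi, w + spread)
            total = sum(est[a - lo] for a in range(first, last + 1))
            if total == 0:
                continue  # no feasible true value inside [lo, hi]
            # Split this user's unit of weight by the current estimate.
            for a in range(first, last + 1):
                new[a - lo] += est[a - lo] / total
        est = [mass / len(scrambled) for mass in new]
    return est

rng = random.Random(7)  # seeded only so the demo is repeatable
true_ages = [25] * 700 + [60] * 300  # made-up population
scrambled = [age + rng.randint(-12, 12) for age in true_ages]

shares = reconstruct(scrambled, 12, 18, 80)
under_42 = sum(shares[: 42 - 18])  # estimated share of users aged 18 to 41
print(round(under_42, 2))
```

In this sketch the merchant only ever handles the scrambled list, yet the estimated share of users under 42 should land near the true 70 percent, which is the kind of aggregate a data-mining model needs.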
The greater the range of number-scrambling that is allowed, the more consumers' private data is obscured. However, as randomization parameters increase, the accuracy of the post-scramble data-mining results decreases. According to Agrawal, it's a tradeoff. IBM says that in its experiments, after compensating for the data scrambling, it found only a 5 percent to 10 percent loss in accuracy, even with 100 percent randomization allowances.
The research project is underway at IBM's Privacy Institute; beta trials will begin soon. It's the first project announced by the Almaden, California, group, which was formed in November 2001.
These days, Internet privacy is a hot topic. It most recently made headlines when U.S. Senator Fritz Hollings introduced a controversial bill designed to safeguard Internet users' privacy, which opponents suggest will hamper online commerce.
For more information about enterprise networking, go to NetworkWorld. Story copyright 2008 Network World Inc. All rights reserved.