Maximum Google

Google is fast, accurate, and fun to use, but you can get even more out of the boss of search engines.

Steve Bass

Search Controversy

The Curse of the Cache: Is Google's Memory Too Good?

After a messy divorce, the last thing Diane K. Jensen wanted to be reminded of was her husband--or the online business that they used to run together from their home in Sarasota, Florida.

Unfortunately, she continued to receive customer inquiries more than six months after she asked her ISP to take down the site. Jensen blames Google, where she can still find parts of her old site preserved in the cache. Google's cache servers store entire Web pages and sites.

"I was going through a very traumatic experience; and each time I got an e-mail, it would remind me of what I was--and still am--going through," she says.

The cache is one of the favorite tools of many Googlers because it lets them browse archived Web pages that may have changed recently or that were hosted on servers that subsequently went offline.

But not everyone wants retired pages to stay in the cache. Weeks after September 11, 2001, Webmasters for the nation's utilities and mass transit agencies removed sensitive Web pages, fearing that terrorists might use the information to attack those facilities. Months later, visitors found the pages in Google and on other caching sites.

When businesses find their copyrighted data illegally published on the Web, it's not enough to go after the site; they must also ask Google to remove any reference to it, including links on other sites, from its cache. Google has received nearly a hundred such requests, according to the Chilling Effects Clearinghouse.

A Google spokesperson says that old pages drop out of the cache every four to six weeks. Jensen's site may have stayed in the cache longer because her Web hosting service removed only a few pages from the site, so the site still looks active.

Caching isn't just a Google headache. In fact, experts cite the Wayback Machine as a bigger problem. That site lets you view snapshots of whole Web sites, year by year, and it never deletes the files that it stores in its cache.

What You Can Do

Recent improvements to the back end of Google's cache make it easier to remove your pages from the cache or to prevent them from being stored:

--To have Google drop your entire Web site, one or more pages, or just snippets from its cache, visit Google's Remove Content page, which explains how to remove pages and how to prevent Google from caching a site.

--To stop a page from being cached, add <META NAME="GOOGLEBOT" CONTENT="NOARCHIVE"> to the HEAD section of each HTML file (see Google's "My Site's Listing is Incorrect" page for more details).

--Another method to eliminate Web pages from Google's cache is to replace the pages you don't want cached with blank pages in the same location and with the same file name. When Google next updates the cache for that page, it will store the blank files.
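
If you maintain many pages, the noarchive tag can be added with a short script rather than by hand. The Python sketch below is one possible approach, not a Google tool: the function name add_noarchive and the sample page are hypothetical, and it assumes each page has an ordinary opening HEAD tag. It inserts the tag directly after that opening tag and leaves a page untouched if a googlebot noarchive tag is already there.

```python
import re

# The tag described in the tip above.
NOARCHIVE_TAG = '<META NAME="GOOGLEBOT" CONTENT="NOARCHIVE">'

# Simplified check for an existing tag; assumes NAME comes before CONTENT.
ALREADY_TAGGED = re.compile(
    r'<meta[^>]*name=["\']?googlebot["\']?[^>]*noarchive', re.IGNORECASE
)

# Opening <head> tag, with or without attributes (does not match <header>).
HEAD_OPEN = re.compile(r'<head(\s[^>]*)?>', re.IGNORECASE)


def add_noarchive(html: str) -> str:
    """Return html with the noarchive tag placed right after <head>."""
    if ALREADY_TAGGED.search(html):
        return html  # nothing to do
    match = HEAD_OPEN.search(html)
    if match is None:
        return html  # no HEAD section found; leave the page unchanged
    insert_at = match.end()
    return html[:insert_at] + "\n" + NOARCHIVE_TAG + html[insert_at:]


if __name__ == "__main__":
    page = "<html><head><title>Old shop</title></head><body></body></html>"
    print(add_noarchive(page))
```

Running the function a second time on its own output changes nothing, so it is safe to apply to a folder of files more than once. The tag only takes effect the next time Google crawls the page.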
