Yahoo! had given little notice that it would be shutting down Geocities hosting, which made the prospect of creating an archive difficult. Web historians and archivists at Textfiles.com believed the potential loss to be considerable and mounted a concerted effort to make a complete backup of all public Geocities sites. To quote Textfiles:
“What we were facing, you see, was the wholesale destruction of the still-rare combination of words digital heritage, the erasing and silencing of hundreds of thousands of voices, voices that represented the dawn of what one might call ‘regular people’ joining the World Wide Web. A unique moment in human history, preserved for many years and spontaneously combusting due to a few marks in a ledger, the decision of who-knows for who-knows-what.”
Using a bit of creative hacking to forge a “user agent” (the bit of data that tells a server what kind of client, e.g. Web browser, search engine bot, or RSS reader, is requesting its pages), Textfiles.com volunteers made it appear as if Geocities was being indexed by Google. Rather than simply indexing, the volunteers were scraping and storing all the data available on the Geocities servers to create a single, massive archive.
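For the curious, here is roughly what forging a user agent looks like in practice. This is only a minimal sketch in Python, not the volunteers' actual tooling: the `build_request` helper and the example.com URL are made up for illustration, and the Googlebot string is one that Google has published for its crawler, assumed here because the exact string the volunteers sent isn't known.

```python
from urllib.request import Request

# One user-agent string Google has published for its crawler.
# Assumption: used here purely for illustration.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_request(url: str, user_agent: str = GOOGLEBOT_UA) -> Request:
    """Build an HTTP request that announces itself with the given user agent.

    Hypothetical helper: it only constructs the request and sends nothing.
    """
    return Request(url, headers={"User-Agent": user_agent})


# Placeholder URL; nothing is fetched here.
req = build_request("http://example.com/")

# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

The server only sees the header the client chooses to send, which is why the claim is so easy to fake. Passing the request to `urllib.request.urlopen` would actually fetch the page.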
With the backup effort underway, the question remained: What should be done with the data? The first task was to create a single archive, which would then be compressed and released to the web via BitTorrent! Yes, that means you too can own a (rather large) piece of Web history, if you’ve got the storage space to spare. Compressed, the entire archive will likely clock in at over 900 gigabytes. The archive is currently being compressed; the BitTorrent release should be available within a few days. Check out ascii.textfiles.com for details about the release!