Internet Archive, fearful of spying, boosts its encryption
By Zach Miners
The Internet Archive, the online repository of millions of digitized books, wants to shield its readers from others’ prying eyes, including the government’s.
On Thursday night, the nonprofit announced new privacy protections to make it more difficult to see users’ reading behavior on the site, by implementing the encrypted Web protocol standard HTTPS and making it the default. Most users will soon be using the secure protocol, which is designed to protect against eavesdropping and what are called “man-in-the-middle attacks,” the group said. The protections were announced during an event at the organization’s headquarters in San Francisco.
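As a rough illustration of what making HTTPS the default means in practice, the sketch below rewrites a plain `http://` address to its `https://` equivalent, which is the kind of target a server would redirect a browser to. The function name `upgrade_to_https` and the example URL are hypothetical, and the sketch assumes URLs without an explicit port; it is not the Internet Archive's actual configuration.

```python
from urllib.parse import urlsplit, urlunsplit


def upgrade_to_https(url: str) -> str:
    """Return the HTTPS form of a URL; leave non-HTTP URLs unchanged."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        # Only the scheme changes; host, path and query are preserved.
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


if __name__ == "__main__":
    # Hypothetical example address, used only for illustration.
    print(upgrade_to_https("http://archive.org/details/example?page=2"))
```

A real deployment would typically issue this redirect at the web server or load balancer rather than in application code.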
Recent revelations about government surveillance and National Security Agency programs like Prism were a major driver behind the changes. “Based on the revelations of bulk interception of web traffic as it goes over the Internet, we are now protecting the reading behavior as it transits over the Internet by encrypting the reader’s choices of webpages all the way from their browser to our website,” the group said in a Friday blog post, pointing to the NSA’s “XKeyscore” tool in particular.
XKeyscore lets NSA analysts search through vast numbers of emails, online chats and browsing histories without prior authorization, according to published reports.
The Internet Archive also made changes to make it harder to reconstruct users’ behavior on the site, by encrypting the Internet Protocol addresses stored on the servers for Archive.org and OpenLibrary.org. The group modified the servers so that they would encrypt users’ IP addresses with a key that changes each day. The approach, the group said, will allow them to know how many people have used their services, but not who they are or where they are coming from. The Internet Archive claims to have more than 3 million daily users.
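The idea of a daily-changing key can be sketched in a few lines. The example below is only an illustration under stated assumptions: it substitutes a keyed hash (HMAC-SHA256) for the encryption the group describes, keeps its keys in a hypothetical in-memory store, and uses invented function names. It shows how the same address maps to the same token within a day, so visitors can be counted, while tokens from different days cannot be linked.

```python
import hashlib
import hmac
import secrets
from datetime import date

# Hypothetical key store: one random secret per calendar day.
# Assumption: a real system would discard old keys so past tokens
# can no longer be tied back to an address.
_daily_keys = {}


def key_for(day: date) -> bytes:
    """Return the secret for a given day, creating it on first use."""
    if day not in _daily_keys:
        _daily_keys[day] = secrets.token_bytes(32)
    return _daily_keys[day]


def pseudonymize_ip(ip: str, day: date) -> str:
    """Replace an IP address with a token that is stable only within one day."""
    digest = hmac.new(key_for(day), ip.encode("ascii"), hashlib.sha256)
    return digest.hexdigest()[:16]


def count_unique_visitors(ips, day: date) -> int:
    """Count distinct visitors for a day without storing raw addresses."""
    return len({pseudonymize_ip(ip, day) for ip in ips})


if __name__ == "__main__":
    # Addresses from the reserved documentation ranges, not real users.
    hits = ["192.0.2.1", "192.0.2.1", "198.51.100.7"]
    print(count_unique_visitors(hits, date(2013, 10, 25)))
```

Because each day's key is random and independent, a log of tokens reveals how many distinct addresses appeared on a given day but not which addresses they were.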
Users of the Wayback Machine, which lets people see previous versions of certain sites across the Internet, will also start to see the secure HTTPS version by default.
Web servers typically record IP addresses in their logs, which leaves a record to reconstruct who looked at what, but the Internet Archive has been trying to avoid keeping users’ IP addresses for the past several years, the group said.
More than 5 million ebooks are freely available on Archive.org and 2 million on OpenLibrary.org, built up with help from more than 15 million users and 850 contributing libraries, according to the Internet Archive site.
The Internet Archive also announced several other initiatives, including an effort to fix broken links to pages it has archived and a database of U.S. television news programs.
For the nostalgic, there is also a Historical Software Archive, which will let software from a bygone era, such as programs written for the Apple II, run in modern browsers.