
Algorithms That Rule the Web

Google helps us think, Facebook finds us friends, and Pandora plays our own personalized soundtrack. It's hard to say whether the computer algorithms that these services use to anticipate our needs and wants are turning us into puppets or geniuses. But algorithms have a huge impact on our tastes, buying habits, and decisions about our digital lives.

Back in the 20th century--the primordial age of algorithms--life was simpler and harder at the same time. We never knew what else we might want to buy at Amazon; we didn't know what the most "important" news stories of the day were; and before the Netflix movie recommendation engine, we had no mechanized assistance in determining which DVD to rent next.

When we're looking for something online, Google's algorithm frees us from having to sort and search through multitudes of marginally relevant results. On the other hand, algorithms might trap us in a world where advertisers and government agencies couple behavioral data with computer formulas to predict and manipulate what we do or buy next.

The technological trend toward ever-more-sophisticated algorithms isn't limited to situations where consumers seek information or products. Private companies and government agencies are also harnessing the power of algorithms to boost their efficiency in dealing with inventory control and their effectiveness in monitoring behavior and predicting what a cybercriminal's next move might be.

For algorithm nerds, the Internet is a Candyland of data to model and predict behavior with. Tracking IP addresses across the Net, knowing what websites people visit and when they visit them, counting banner ad clicks, and harvesting data from social networks--all are much easier than following someone around with a clipboard all day.

Here is a look at some of the algorithms that rule the Web--and those who use them.

Google Search

Google searches incorporate an algorithm for producing relevant results.
Many people credit Google's search algorithm as the source of the company's $193 billion market capitalization and tight grip on the search engine market. As Steven Levy pointed out in a 2010 article on Google, "[Google] is still the only company whose name is synonymous with the verb search."

Soon we might start using the verb google instead of think: "Let me google it over before I make up my mind." New research suggests that Google's algorithm could be changing the way we think. Columbia University researcher Betsy Sparrow says that search engines like Google are altering human thought patterns, causing people to remember less on their own and to rely instead on their ability to find the answer on the Internet.

The News Algorithm

Google constantly updates its news algorithm and uses it to power such popular services as Google News. If you're curious about what the day's top story is, you don't have to consult the editors at the New York Times; instead you can see what Google's algorithm considers the top story of the hour at Google News or Yahoo News.

Google bases its listing of the top stories of the hour on a complicated news algorithm.
Google News bases its assessment of what constitutes important news on a long list of article attributes, including keywords, originality, freshness, quality, and expertise of source. Jim Barnett, who writes about journalism for Harvard's Nieman Journalism Lab, wonders whether investigative and explanatory journalism will be steamrolled by a news algorithm that consistently favors the freshest, most popular content.

Barnett sums up his views in a Nieman Journalism Lab article: "When we don't know what we want, sometimes what we really need is to figure it out for ourselves."

The Social Algorithm

Facebook's social algorithm can help you find old high-school friends and past coworkers, of course. But it does more than find friends and determine whose Facebook updates appear in your Facebook Top News Feed. The algorithm is called EdgeRank, shown below in an image from the TechCrunch website. EdgeRank uses a combination of such factors as your affinity with someone, the type of message (Like, Comment, or Tag), and when the post was made.
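
To make the three EdgeRank factors concrete, here is a rough Python sketch. It assumes the commonly described form in which each interaction contributes affinity times a type weight times a time decay, and a story's score is the sum over its interactions. The weights, the half-life, and every name in the code are invented for illustration; Facebook has not published its actual values.

```python
from dataclasses import dataclass

# Hypothetical weights per interaction type; the real values are not public.
TYPE_WEIGHT = {"like": 1.0, "comment": 2.0, "tag": 3.0}


@dataclass
class Edge:
    affinity: float   # assumed 0..1 closeness between viewer and actor
    kind: str         # "like", "comment", or "tag"
    age_hours: float  # hours since the interaction happened


def edge_score(edge: Edge, half_life_hours: float = 24.0) -> float:
    """Score one interaction: affinity x type weight x time decay."""
    # Assumed exponential decay: the score halves every half_life_hours.
    decay = 0.5 ** (edge.age_hours / half_life_hours)
    return edge.affinity * TYPE_WEIGHT[edge.kind] * decay


def story_score(edges) -> float:
    """A story's score is the sum of the scores of its interactions."""
    return sum(edge_score(e) for e in edges)


# Same friend, same kind of interaction, different ages.
fresh = [Edge(affinity=0.9, kind="comment", age_hours=0.0)]
stale = [Edge(affinity=0.9, kind="comment", age_hours=48.0)]
```

Under these made-up numbers, the two-day-old comment scores a quarter of what the fresh one does, which is the kind of effect that would push older posts down a feed.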

Facebook's EdgeRank social algorithm. (Image: Michael Bernstein)
Facebook recently upped its algorithm ante by tying it to facial recognition software to analyze every photo you upload to the service--including the one from last weekend's beach party. Overall, Facebook's 750 million users have uploaded some 20 billion photos. When you upload your photo to Facebook, the service uses facial recognition software coupled with your immediate and extended social circles to identify who is in the picture; then it asks you whether you want to tag (identify) the people in the image. It's your choice to tag or not, but that fact hasn't quelled privacy activists' concern over the feature.

Algorithm Booms and Busts for Business

Many companies have developed business models around displaying ads on pages of low-quality content customized to rank high in Google News or in Google's main search results. The effectiveness of these so-called content farms in exploiting Google's ranking algorithms caused Google in February to adjust its search algorithm to reduce the standing of low-quality sites within its search results.

Recently the New York Times exposed JC Penney's efforts to inflate its Google page rank by creating thousands of third-party links and sites dedicated to boosting the company's visibility in Google search results. JC Penney denied any direct knowledge of the shenanigans, but Google penalized it by reducing the company's prominence within its search results.

Smaller businesses may succeed or fail depending on the whims of Google's search algorithm. In 2006, California-based KinderStart.com sued Google in federal court, claiming that it sustained significant financial harm when Google changed its algorithm and subsequently ranked the site low in its search results. Ultimately, KinderStart.com lost its court battle, as have other companies that have filed similar suits against Google.


The Love Algorithm

Chemistry may be the decisive factor in the human phenomenon of falling in love, but algorithms provide the matchmaking spark for many who use online dating services such as eHarmony and Match.com. Finding the perfect je ne sais quoi for potential lovebirds requires sites such as Match.com to number-crunch users' personal-attraction tests to look beyond the bare-bones facts that someone is a "Jewish nonsmoker who likes swing dancing."

At eHarmony, the service plugs your answers to the site's 258-question personality test into the company's ultimate trade secret: its love algorithm. In a 2008 article in the New York Times, eHarmony said that 19 million people had taken its personality test, and a study it commissioned concluded that it was responsible for 2 percent of all U.S. marriages in 2007.

The Perfect Online Ad Algorithm

Online advertising sits at the crossroads of commerce and algorithm deployment. Its objective is to display the right ad to the right person at the right time. An advertising algorithm that succeeds in this mission can mean the difference between a sale and no sale. To better their odds, advertisers use algorithms to slice and dice a complex mix of data.

The algorithms are so byzantine that they can be very difficult to grasp. I have examined them in several articles including "Good-Bye to Privacy?" and in a point-counterpoint editorial, "Privacy Backlash Over Ad Tracking Debated." In a nutshell, sophisticated online advertisers pair offline demographic data about you with your Web surfing habits in order to entice you with targeted online ads.

Some observers argue that profiling you and presenting you with relevant ads based on your surfing habits helps Web content owners stay in business and deliver high-quality content. Others say that trusting private companies with massive databases of user profiles is like putting foxes in charge of henhouse security.

Shopping and Recommendation Algorithms

Does Amazon's recommendation engine have you figured out? Probably.

Amazon's algorithm objectively analyzes the buying patterns of millions of customers. Then, if you buy the book State of Wonder by Ann Patchett, Amazon recommends other books based on the titles that other buyers of State of Wonder have purchased. As a result, Amazon may be able to sell you something else that you hadn't intended to buy.
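
The "people who bought this also bought" idea can be sketched in a few lines of Python. This is only a toy co-purchase counter, not Amazon's actual engine, which is not public; the order data, the placeholder titles other than State of Wonder, and the function names are all invented for illustration.

```python
from collections import Counter
from itertools import permutations


def build_co_purchases(orders):
    """Count how often each pair of titles shows up in the same order."""
    co = {}
    for order in orders:
        # Every ordered pair (a, b) of distinct titles in one order.
        for a, b in permutations(set(order), 2):
            co.setdefault(a, Counter())[b] += 1
    return co


def recommend(co, title, k=2):
    """Return up to k titles most often bought alongside `title`."""
    return [t for t, _ in co.get(title, Counter()).most_common(k)]


# Made-up order history; only "State of Wonder" comes from the article.
orders = [
    ["State of Wonder", "Book A"],
    ["State of Wonder", "Book A", "Book B"],
    ["State of Wonder", "Book C"],
    ["Book B", "Book C"],
]

co_purchases = build_co_purchases(orders)
top_pick = recommend(co_purchases, "State of Wonder", k=1)
```

With this invented history, the top pick for a State of Wonder buyer is "Book A", because it appears in two of the three orders that include that title. A production system would also have to handle popularity bias, scale, and fresh inventory, none of which this sketch attempts.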

Recommendation engines enable e-merchants such as Amazon to sell billions of dollars worth of merchandise by helping consumers find what they're looking for and by fostering impulse buys. In an interview with CNet, an Amazon spokesperson said, "Algorithms are what make our site run, [and] such a unique place to shop."

In 2009 Netflix doled out $1 million in prize money to a group of statisticians known collectively as BellKor's Pragmatic Chaos for their success in boosting the movie rental company's accuracy at predicting the movies that customers would like to rent. To earn the prize, they had to consider demographic and behavioral data along with zip codes, genre ratings, and 100 million movie ratings.

Pandora Decodes Music

The music service Pandora has demonstrated uncanny accuracy in matching a music listener's tastes based on a single song--and once again, an algorithm is responsible. Pandora's Music Genome Project has as its goal to "capture the essence of music at the fundamental level," according to Tim Westergren, founder of the Music Genome Project and cofounder of Pandora.

A cartoon view of how Pandora's algorithm works. (Image: XKCD.com)
Pandora says that it uses 400 attributes to describe a song. Next, according to the description on Pandora's Facebook page, the service's algorithm parses that data from one song and can "play a range of music that is 'musicologically similar' to your starting points in some way--but not always necessarily music that 'sounds similar.'"
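
One way to picture attribute-based matching is to treat each song as a list of attribute scores and compare the lists. The Python sketch below uses cosine similarity over four made-up attributes instead of Pandora's 400. Pandora has not said how it compares songs, so the similarity measure, the attribute names, the scores, and the song labels are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two attribute vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical attribute order: [acoustic guitar, synths, vocal harmony, heavy beat]
catalog = {
    "seed song": [0.9, 0.1, 0.8, 0.2],
    "close match": [0.8, 0.2, 0.9, 0.1],
    "distant song": [0.1, 0.9, 0.1, 0.8],
}


def most_similar(seed, songs):
    """Pick the song whose attribute vector is closest to the seed's."""
    others = (name for name in songs if name != seed)
    return max(others, key=lambda name: cosine_similarity(songs[seed], songs[name]))


next_track = most_similar("seed song", catalog)
```

Here the sketch picks "close match" as the next track because its attribute profile points in nearly the same direction as the seed's, even though no single attribute is identical.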

While competing online music services have faltered in recent years, Pandora claims a growing user base of 100 million registered users, of whom 36 million are active users.

The Death of Serendipity

Do algorithms signal the death of serendipity--if not free will and privacy--online? The debate will rage for years. But as we quickly morph into the supercomputer age of gargantuan databases, algorithms need to be protected from exploitation by Orwellian governments, sociopathic hackers, and intrusive companies, privacy experts warn. Unfortunately, few laws have caught up with technology in this area.

Current do-not-track legislation is working its way through Congress, and other initiatives put forth on the state level, such as in California, tangentially address how companies gather and use data. But so far the lords of the algorithms have the upper hand.

Enjoyed reading this story? PCWorld's algorithm thinks you might also like to read these:

• "Do-Not-Track in Chrome and Firefox: Different Approaches, Same Fatal Flaw"

• "Facebook Privacy: 10 Must-Know Security Settings"
