Internet Tips: Purge Unwanted Spam the 18th-Century Way

In February, I recommended Cloudmark's free SpamNet spam-blocking add-in for Outlook and Outlook Express. But when the company began charging a premium fee of $4 per month ($48 per year) for its still-worthwhile service, I began seeking a free spam fighter. I got lucky. And so can you.

Last year, software designer Paul Graham described a superior spam-blocking method based on the probability theorem of 18th-century mathematician Thomas Bayes. This technique uses observations of known outcomes to update estimates of the likelihood of a hypothesis. Lots of spam-hating programmers are now creating their own Bayesian spam filters. The first one I tried was the free, open-source SpamBayes plug-in for Outlook 2000 and 2002, and it turned out to be a winner. If you use Outlook, drop everything and get SpamBayes.
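To make that updating step concrete, here is a minimal Python sketch of Bayes' rule applied to a single word. The function name, the word counts, and the 50/50 starting estimate are all illustrative assumptions, not figures taken from SpamBayes or from Graham's write-up.

```python
def spam_probability_given_word(spam_with_word, spam_total,
                                ham_with_word, ham_total,
                                prior_spam=0.5):
    """Bayes' rule: estimate P(spam | word) from mail that is already sorted.

    All arguments are hypothetical counts supplied by the caller;
    prior_spam is the assumed share of mail that is spam before
    looking at the word.
    """
    p_word_given_spam = spam_with_word / spam_total
    p_word_given_ham = ham_with_word / ham_total
    numerator = p_word_given_spam * prior_spam
    denominator = numerator + p_word_given_ham * (1.0 - prior_spam)
    if denominator == 0:
        # The word was never observed, so there is no evidence to update on.
        return prior_spam
    return numerator / denominator


# Hypothetical counts: the word shows up in 40 of 100 known spam
# messages but in only 1 of 100 known good messages.
estimate = spam_probability_given_word(40, 100, 1, 100)
print(round(estimate, 3))  # 0.976
```

Each new message you sort changes the counts, which in turn shifts the estimate. That is the sense in which the filter keeps learning.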

After I installed the 3.2MB download and set it up to toss spam, it began learning pretty much on its own what I consider to be spam, and what I know is not spam (see FIGURE 1). Other antispam techniques rely on centralized databases of known spam messages, or on static rules that scan messages for words or attributes often contained in spam. Bayesian filtering differs by scanning the message's content, then computing the probability that it is spam based on how its contents compare with messages already identified as spam or nonspam. Messages that have a spam probability of, say, 90 or higher on a scale of 0 to 100 are moved over to the Spam folder. Messages that score 15 or lower stay in the in-box. Messages that fall in between go to an Unsure Spam folder where you determine their fate manually. (You create these folders in Outlook and then configure SpamBayes to use them.)

FIGURE 1: Filter spam for free in Microsoft Outlook using the SpamBayes plug-in's Bayesian techniques.
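The score-then-sort scheme described above can be sketched in a few lines of Python. This is a plain naive-Bayes simplification with add-one smoothing and an assumed even starting estimate, not SpamBayes's actual algorithm. The class name, the tokenizer, and the training messages are hypothetical; only the 90 and 15 cutoffs and the folder names come from the description above.

```python
import math
import re
from collections import Counter

SPAM_CUTOFF = 90  # scores at or above this go to the Spam folder
HAM_CUTOFF = 15   # scores at or below this stay in the in-box


class TinyBayesFilter:
    """Toy word-based filter for illustration only (hypothetical design)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Each distinct word counts once per message.
        return set(re.findall(r"[a-z0-9$']+", text.lower()))

    def train(self, text, label):
        """Record a message the user has marked as 'spam' or 'ham'."""
        self.message_counts[label] += 1
        self.word_counts[label].update(self.tokenize(text))

    def score(self, text):
        """Return a spam score on a 0-to-100 scale."""
        log_odds = 0.0  # assumed even odds before any words are examined
        for word in self.tokenize(text):
            in_spam = self.word_counts["spam"][word]
            in_ham = self.word_counts["ham"][word]
            if in_spam == 0 and in_ham == 0:
                continue  # never seen in training, so it carries no evidence
            # Add-one smoothing keeps a single word from forcing 0 or 100.
            p_spam = (in_spam + 1) / (self.message_counts["spam"] + 2)
            p_ham = (in_ham + 1) / (self.message_counts["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        log_odds = max(-50.0, min(50.0, log_odds))  # avoid overflow in exp()
        return 100.0 / (1.0 + math.exp(-log_odds))

    def folder_for(self, text):
        """Sort a message into one of the three folders by its score."""
        score = self.score(text)
        if score >= SPAM_CUTOFF:
            return "Spam"
        if score <= HAM_CUTOFF:
            return "Inbox"
        return "Unsure Spam"


# Made-up training mail, three messages of each kind.
mail_filter = TinyBayesFilter()
for text in ("win free money now", "free prize click now",
             "cheap pills free shipping"):
    mail_filter.train(text, "spam")
for text in ("lunch meeting tomorrow at noon",
             "project report attached for review",
             "are we still meeting tomorrow"):
    mail_filter.train(text, "ham")

print(mail_filter.folder_for("free money prize now"))
print(mail_filter.folder_for("meeting tomorrow about the report"))
print(mail_filter.folder_for("free lunch tomorrow"))
```

With this tiny training set, the three test messages land in Spam, Inbox, and Unsure Spam respectively. The last one mixes a spammy word with two innocent ones, which is exactly the kind of message the middle folder exists to catch.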

After I used SpamBayes for a week or so, the program's database could detect about 99 percent of my incoming spam, with very few messages ending up in the Unsure Spam folder. And using the default settings, it never once mistakenly moved good mail to the Spam folder.

I have yet to discover a free Bayesian-filtering antispam tool that integrates into Outlook Express the way SpamBayes does with Outlook. If you want an antispam plug-in for OE, a commercial program like Sunbelt Software's IHateSpam may be your best option for now (read more about IHateSpam in "New Spam Fighters: Smart and Effective"). But you can still use one of several free Bayesian proxy filters that scan incoming mail for spam before it reaches your e-mail program. The drawback of a proxy is that it requires more configuration and manual training than an integrated tool. On the plus side, proxies work with lots of different e-mail clients.

Stata Labs' free 1.53MB SAproxy download is based on the open-source SpamAssassin spam-filter engine. It works with any POP3 mail account and e-mail application. More important, it came out on top in Consumer Reports' August 2003 antispam software tests, outperforming eight commercial spam filters.

Another antispam proxy I tried, Michel Krämer's free, open-source Spamihilator, simplifies training its Bayesian filters by allowing you to pop up a list of received mail and click Spam and Non-Spam buttons.

And the mail programs in both Mozilla 1.4 and Netscape 7.1 come with their own integrated Bayesian antispam settings.

