How We Tested Spam Filters

Freelance writer Logan Harbaugh tested most of the filters on Microsoft Outlook 2000; for the filtering software that didn't support Outlook, he used Outlook Express. Each product ran on a cleanly installed system with no software other than the operating system and Microsoft Office installed.

Depending on the product tested, Harbaugh used one of two test systems--one running the Windows 2000 Server operating system, and the other running Windows XP Professional. Some filters wouldn't install under Windows 2000 Server, so he tested those on the system running Windows XP. The hardware was a blade server--a slimmed-down rack-mounted computer with a 1-GHz Pentium III processor and 512MB of RAM.

Each product filtered the same 3217 messages, collected over two weeks in March 2004. Of those messages, 2135 were spam and 1082 were legitimate.
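
The makeup of the test mail can be checked with a few lines of arithmetic. The sketch below uses only the message counts reported above; the variable names are illustrative and are not part of any tested product.

```python
# Message counts reported for the test mail.
TOTAL_MESSAGES = 3217
SPAM_MESSAGES = 2135
LEGITIMATE_MESSAGES = 1082

# The two categories should account for every message.
assert SPAM_MESSAGES + LEGITIMATE_MESSAGES == TOTAL_MESSAGES

spam_share = SPAM_MESSAGES / TOTAL_MESSAGES
legitimate_share = LEGITIMATE_MESSAGES / TOTAL_MESSAGES

print(f"Spam: {spam_share:.1%}")              # Spam: 66.4%
print(f"Legitimate: {legitimate_share:.1%}")  # Legitimate: 33.6%
```

Roughly two of every three messages in the test were spam.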

Harbaugh trained the filters, when possible, with the first 1000 messages, adding senders to the whitelist (approved senders) or the blacklist (spammers). This was generally a two-step process: First, he sorted through e-mail that had been flagged as spam, and he identified messages that weren't really junk. Then he read the e-mail that had passed through the filter, and he identified spam that the filter had missed.
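
The two-step review described above can be sketched in code. This is a minimal illustration, not any product's actual interface: the `Message` class, the `review` function, and the sample addresses are all hypothetical, and real filters expose whitelists and blacklists through their own settings screens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender: str
    is_spam: bool   # the reviewer's judgment after reading the message
    flagged: bool   # the filter's verdict


def review(messages):
    """Apply the two-step review and return (whitelist, blacklist)."""
    whitelist, blacklist = set(), set()

    # Step 1: look through flagged mail for messages that are not junk.
    for msg in messages:
        if msg.flagged and not msg.is_spam:
            whitelist.add(msg.sender)

    # Step 2: look through delivered mail for spam the filter missed.
    for msg in messages:
        if not msg.flagged and msg.is_spam:
            blacklist.add(msg.sender)

    return whitelist, blacklist


# Made-up sample data, for illustration only.
sample = [
    Message("newsletter@example.com", is_spam=False, flagged=True),
    Message("pills@example.net", is_spam=True, flagged=False),
    Message("colleague@example.org", is_spam=False, flagged=False),
    Message("lottery@example.net", is_spam=True, flagged=True),
]

approved, blocked = review(sample)
print(sorted(approved))  # ['newsletter@example.com']
print(sorted(blocked))   # ['pills@example.net']
```

Messages the filter already handled correctly, such as the last two in the sample, need no correction and leave both lists unchanged.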

Typically, a filter offers a variable setting that determines how aggressively it filters mail. When dealing with a product that gives users a choice of settings, Harbaugh tested it at its default setting. Filter performance also depends on regular updates: When he had a choice, he left the frequency of updates at the default used by the program after installation; this ranged from once every hour to once a week.

The messages used for testing included several types of e-mail that most filters find very difficult to distinguish from spam. Among these were press releases, legitimate bulk e-mail (both marketing and newsletters), mailing-list messages, and product updates from various companies. Much of the e-mail that the test account received came from senders not in the address book, which made things more difficult for some filters.

