A group of researchers from the U.S. Naval Academy has developed a technique for analyzing email traffic in real time to identify spam messages as they come across the wire, using only information from the TCP (Transmission Control Protocol) packets that carry the messages.
This approach could be a useful addition to today’s arsenal of spam-fighting techniques, observers argue, because, unlike most other spam-fighting approaches, it does not require scanning the content of the email.
The work “advanced both the science of spam fighting and … worked through all the engineering challenges of getting these techniques built into the most popular open-source spam filter,” said Massachusetts Institute of Technology computer science research affiliate Steve Bauer, who was not involved with the work. “So this is both a clever bit of research and genuinely practical contribution to the persistent problem of fighting spam.”
Researchers Robert Beverly, Georgios Kakavelakis and Joel Young built a plug-in for the SpamAssassin mail filter, called SpamFlow, that incorporates their analysis techniques. They presented their work at the Usenix Large Installation System Administration (LISA) conference earlier this month in Boston.
In the paper that accompanied the presentation, the researchers showed that spam email blasts have certain characteristics at the networking transport layer. Signal analysis of factors such as timing, packet reordering, congestion and flow control can reveal the work of a spam-spewing botnet. “A lot of spam comes from spambots, which are sending as fast as they can and congesting their local uplink,” Beverly said. “So you can detect them by looking really hard at the TCP stream.”
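As a rough illustration of the kind of transport-layer signals described above, the following Python sketch derives a few per-flow features from already-captured packet records. It is not taken from the SpamFlow code: the Packet record, the flow_features function and the four feature names are hypothetical, and the sketch ignores TCP sequence-number wraparound for simplicity.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Packet:
    """Hypothetical record for one captured TCP segment of a mail flow."""
    timestamp: float  # arrival time in seconds
    seq: int          # sequence number of the first payload byte
    length: int       # payload length in bytes


def flow_features(packets):
    """Summarize timing, retransmission and reordering for one flow.

    Simplifying assumption: sequence numbers do not wrap around.
    """
    # Gaps between consecutive arrivals hint at congestion and pacing.
    gaps = [b.timestamp - a.timestamp for a, b in zip(packets, packets[1:])]

    seen = set()
    highest = -1
    retransmits = 0
    reordered = 0
    for p in packets:
        if p.length == 0:
            continue  # pure ACKs carry no payload to retransmit or reorder
        if p.seq in seen:
            retransmits += 1   # same segment arrived again
        elif p.seq < highest:
            reordered += 1     # new segment arrived after a later one
        seen.add(p.seq)
        highest = max(highest, p.seq)

    return {
        "mean_gap": mean(gaps) if gaps else 0.0,
        "gap_jitter": pstdev(gaps) if gaps else 0.0,
        "retransmits": retransmits,
        "reordered": reordered,
    }


if __name__ == "__main__":
    flow = [
        Packet(0.0, 1000, 100),
        Packet(0.1, 1100, 100),
        Packet(0.2, 1300, 100),  # arrives ahead of segment 1200
        Packet(0.3, 1200, 100),  # reordered
        Packet(0.9, 1100, 100),  # retransmission after a long gap
    ]
    print(flow_features(flow))
```

A flow from a host that is saturating its uplink would be expected to show more retransmissions and larger, more erratic gaps than a flow from a well-provisioned mail server, which is the intuition Beverly describes.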
Earlier techniques for analyzing spam at the network transport layer have worked offline, which is to say that the email traffic is analyzed as a batch and the results can only be used later. The naval researchers have developed an architecture for analyzing network traffic as it comes over the wire.
For the implementation, they used the SpamAssassin email filter. SpamAssassin has a plug-in architecture for incorporating new filtering techniques. “We have a daemon that captures all the packets and looks at timing and other congestion characteristics of the traffic stream,” Beverly said. The plug-in can learn to identify and detect spam without human intervention. In tests, SpamFlow was able to correctly identify spam more than 95 percent of the time after receiving 1,000 emails.
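The article does not say which learning algorithm SpamFlow uses, so the following Python sketch substitutes a simple online logistic-regression learner to show what "learning as mail arrives" can look like in general. The class name, the two-element feature vectors and the source of the spam/ham labels are all assumptions made for illustration, not details of the researchers' plug-in.

```python
import math


class OnlineLogisticClassifier:
    """Illustrative online learner; not the algorithm used by SpamFlow."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        """Return the estimated probability that flow features x are spam."""
        z = self.bias + sum(w * v for w, v in zip(self.weights, x))
        z = max(-60.0, min(60.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, is_spam):
        """Take one stochastic-gradient step on a single labeled flow."""
        error = self.predict_proba(x) - (1.0 if is_spam else 0.0)
        self.weights = [
            w - self.learning_rate * error * v
            for w, v in zip(self.weights, x)
        ]
        self.bias -= self.learning_rate * error


if __name__ == "__main__":
    # Assumed toy features, scaled to [0, 1]: (retransmit rate, gap jitter).
    model = OnlineLogisticClassifier(n_features=2)
    for _ in range(200):
        model.update([1.0, 1.0], is_spam=True)    # congested sender
        model.update([0.0, 0.0], is_spam=False)   # clean sender
    print(model.predict_proba([1.0, 1.0]) > 0.5)
    print(model.predict_proba([0.0, 0.0]) < 0.5)
```

Because each update touches only the current message's features, a learner of this kind can keep adapting while traffic flows, in contrast to the batch analysis used by earlier offline approaches.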
The ability to detect a spam message without actually examining the contents of the message would be handy in a number of situations, noted Bruce Davie, a Cisco fellow and visiting lecturer at MIT. Davie is familiar with, though not involved in, the work. An Internet service provider could apply the detection algorithm without violating users’ privacy. It can be used to detect spam even when the messages are encrypted, such as those traveling over an encrypted link. It can also be used to detect other forms of malicious traffic, such as port scans from botnet hosts.
“Overall, I see it as a generally useful tool in the fight against malicious traffic,” Davie said. “You can combine it with traditional anti-spam techniques to improve accuracy.”
Currently, the team is beta testing the software at a number of locations. They plan to release it as open-source software afterward.
The U.S. National Science Foundation funded part of this work, under the Software Development for Cyberinfrastructure (SDCI) program.
Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab’s e-mail address is Joab_Jackson@idg.com