[Mimedefang] Revisit: Filtering on HELO

Joseph Brennan brennan at columbia.edu
Fri Mar 16 09:39:34 EDT 2007

Dirk the Daring <dirk at psicorps.org> wrote:

>     I find HELO-filtering very effective in stopping spammers before they
> get to waste my resources. After all, why bother with RBLs, Clam and/or
> SpamAssassin if the spammer is stoopid enuf to tip their hand at HELO? At
> the same time, I don't want to create a situation where my filter has a
> great risk of false-positives

I've been logging helo strings for a few weeks.  Requiring a valid
helo will definitely produce a significant minority of false positives.
Bad helo correlates with spam, but only well enough to justify scoring
for it a la SpamAssassin.  Remember that you can have MIMEDefang run
up a $score variable of its own, and add it to what is returned by
SpamAssassin, if you want to score for some things more easily tested
by MIMEDefang itself.
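
To illustrate the scoring idea only: the sketch below is Python, not
MIMEDefang filter code (a real filter is Perl), and the test names,
weights and the 5.0 threshold are all made up for the example.  It just
shows locally assigned points being added to the scanner's score before
the spam decision is made.

```python
# Hypothetical weights for locally run tests; tune them against your own logs.
WEIGHTS = {"helo_no_dot": 2.0, "helo_mismatch": 0.5}

def combined_score(scanner_score, local_hits, weights=WEIGHTS):
    """Add points for each local test that fired to the scanner's score."""
    local = sum(weights[name] for name in local_hits)
    return scanner_score + local

# Example: the scanner reported 3.0 and the helo had no dot in it.
total = combined_score(3.0, ["helo_no_dot"])
is_spam = total >= 5.0   # assumed threshold, not a recommendation
```

The point of scoring rather than rejecting is that a bad helo alone
never blocks the message; it only tips a borderline one over.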

Names that cannot possibly be FQDNs, like names with no dot, correlate
well with spam.  Even so, some are legitimate systems run by small
organizations that probably don't have an email or network specialist
to tell them what to do.  Examples of bogus helos:

MDOM19522-2.postech.ac.kr said friend
p85.212.152.207.tisdip.tiscali.de said vesmxs
238.subnet125-164-166.speedy.telkom.net.id said b3-802190dd2394
ip67-90-48-134.z48-90-67.customer.algx.net said recording_room said vszrzdqy

maybe legit:
web2.dietwatch.com said web2
iiis-conferences.org said win2003-06

The above were caught within ten minutes.  Notice "helo friend" in
particular is a 100% match on spam; 630 of those yesterday.
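
A rough way to test for "cannot possibly be FQDN" might look like the
following Python sketch.  The function name is invented for the example,
and it is purely syntactic: it does no DNS lookup, and it does not
handle address literals such as [192.0.2.1], which would need a separate
check.

```python
import re

# One DNS label: letters, digits and hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?$")

def looks_like_fqdn(helo):
    """Return True if the helo string is at least shaped like host.domain."""
    labels = helo.rstrip(".").split(".")
    if len(labels) < 2:          # no dot at all, e.g. "friend" or "web2"
        return False
    if labels[-1].isdigit():     # bare dotted IP address, not a name
        return False
    return all(_LABEL.match(label) is not None for label in labels)
```

Given the false positives listed above, a failed check here is better
used as a scoring input than as grounds for rejection.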

Names that look like they might be FQDN are amazingly often not the
actual FQDN of the host sending the mail.  We had 123,254 examples
in 1.8 million connections.  Examples:

ply-222-16.lycos-newsmail2.com said lycos-newsmail2.com
permemail06.alumniconnections.com said permemail01.alumniconnections.com
arm241.bigfootinteractive.com said bigfootinteractive.com
out006.sctm.tfbnw.net said facebook.com
humbolt.leper.phil.uu.nl said humbolt.nl.linux.org
3048.ip.BroadSt.FCC.NET said qmail.fcc.net
amailer8.forbesdigital.com said amailer6.forbesdigital.com
camppool10.emailebay.com said sjcitemap04ext.sjc.ebay.com
mx1.phx.paypal.com said phx01imail03.phx.paypal.com
servera01.tk2smtp.msn.com said servera01.tk2smtp4.msn.com

Almost all of these are for legitimate mail.  In some cases the helo
string is marginally right.  For example, although the PTR record for
the IP points only to humbolt.leper.phil.uu.nl, both that name and the helo
string humbolt.nl.linux.org have A records to the IP.  But checking
this costs another DNS lookup.  In many cases the last two or three
parts of the domain match anyway.
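
The cheap comparison of the last two or three parts can be sketched as
below (Python, with an invented function name).  It needs no extra DNS
lookup, only the helo string and the reverse-DNS name already in hand.
Comparing just two labels is naive for domains under suffixes like
co.uk, so treat it as a hint, not a verdict.

```python
def labels_match(helo, rdns, n=2):
    """True if the last n dot-separated labels agree, ignoring case."""
    a = helo.lower().rstrip(".").split(".")
    b = rdns.lower().rstrip(".").split(".")
    if len(a) < n or len(b) < n:
        return False
    return a[-n:] == b[-n:]

# Using pairs from the list above:
labels_match("lycos-newsmail2.com", "ply-222-16.lycos-newsmail2.com")   # match
labels_match("facebook.com", "out006.sctm.tfbnw.net")                   # no match
```

Note that the facebook.com and humbolt.nl.linux.org examples fail this
test even though the mail is legitimate, which is another reason to
score for it and not reject.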

Suffice it to say, adherence to what "SHOULD" be in helo is sadly lacking
even in some major corporate players.

Joseph Brennan
Lead Email Systems Engineer
Columbia University Information Technology
