[Mimedefang] Greylisting++

Chris Myers chris at by-design.net
Tue Jul 10 13:18:23 EDT 2007


>> David F. Skoll wrote:
>> So our (post-DATA) greylister takes into account the 4-tuple
>> (sender-e-mail, recipient-e-mail, sender-ip, message-subject) for
>> greylisting purposes.  Works really well!  (Alas, not patentable
>> because there is prior mention of this technique elsewhere...)
>
> John Rudd wrote:
> I've often wondered about including the message-id in the tuple, as well 
> as trying to track the same message-id coming from different senders, and 
> whether or not that's a usable spam-sign.

I've thought about various things to include in the greylisting database, 
including:

1) The standard bits: sender, recipient, relay subnet
2) Message-ID
3) Subject
4) Body (actually, a hash of the body)

Since I do post-DATA greylisting, calculating a hash of the body doesn't 
really add much load.  MIMEDefang has already decoded the MIME message at 
that point, and that decoding seems like much more work than running the C 
implementation of MD5/SHA over the body.
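
As a rough illustration of the idea, here is a minimal sketch of a greylisting
key built from the fields listed above plus a body hash.  It is written in
Python rather than as a MIMEDefang Perl filter.  The names relay_subnet() and
greylist_key(), the /24 IPv4 subnet width, and the choice of SHA-256 are all
assumptions made for the example, not anything MIMEDefang provides.

```python
import hashlib
import ipaddress


def relay_subnet(ip: str, prefix: int = 24) -> str:
    """Collapse an IPv4 relay address to its enclosing subnet (assumed /24)."""
    return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))


def greylist_key(sender: str, recipient: str, relay_ip: str,
                 subject: str, body: bytes) -> str:
    """Build one hex key from the tuple plus a hash of the decoded body."""
    body_hash = hashlib.sha256(body).hexdigest()
    fields = [sender, recipient, relay_subnet(relay_ip),
              subject.strip(), body_hash]
    # Join with NUL so that field boundaries cannot be confused.
    return hashlib.sha256("\0".join(fields).encode("utf-8")).hexdigest()
```

Two delivery attempts of the same message from relays in the same /24 produce
the same key, while any change to the body produces a different one.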

I've been concerned about including the Message-ID, though.  I've always 
wondered whether some of the big e-mail companies might generate a 
Message-ID per transmission attempt rather than per message, which would 
completely bugger a Message-ID-based greylisting scheme.  Has anyone 
actually studied this?
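
To make the concern concrete, here is a small sketch of an in-memory
greylister (Python, for illustration only).  The Greylist class, its check()
method, and the 300-second delay are hypothetical, and it says nothing about
how any real provider generates Message-IDs.

```python
import time
from typing import Dict, Optional, Tuple


class Greylist:
    """Toy greylister: tempfail a key until min_delay seconds have passed."""

    def __init__(self, min_delay: float = 300.0) -> None:
        self.min_delay = min_delay
        self.first_seen: Dict[Tuple[str, ...], float] = {}

    def check(self, key: Tuple[str, ...], now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        # Record the first time this exact key was seen, if it is new.
        seen = self.first_seen.setdefault(key, now)
        return "accept" if now - seen >= self.min_delay else "tempfail"
```

With a key of (sender, recipient, subnet), a retry after the delay is
accepted.  If the key also contains a Message-ID that the sender regenerates
on every attempt, each retry looks like a brand-new key and is tempfailed
again, so the message never gets through.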


I would expect the subject and body to be identical on every delivery 
attempt of the same message, so those seem "safe".  I suppose there's the 
possibility that e-mail hosting providers might add advertising on a 
per-delivery-attempt basis, but I haven't seen any such thing yet.

BTW, David, would it be fairly easy to add a new filter_headers() call that 
happens after the headers are decoded, but before the body is decoded?  As I 
recall, there's even a milter callback documented for that.  It seems like 
that would be an ideal time to do post-DATA greylisting if you want to 
reduce the workload.

Chris Myers
Networks By Design



