[Mimedefang] Using DCC in SpamAssassin which is called by MimeDefang

Kelson Vibber kelson at speed.net
Thu Jun 17 18:56:36 EDT 2004


At 02:34 PM 6/16/2004, wrolf.courtney at donovandata.com wrote:
>I asked a similar question recently: who has had what experience with
>DCC/Razor/Pyzor, presumably via MIMEDefang and SpamAssassin?

All three will work with MD/SA.

Razor is probably the simplest, since SA is already running in Perl and can 
call the Razor Perl modules directly.  It also has the advantage that SA 
has different rules to handle various Razor results - if Razor gives a 
message a 50-100% probability of being spam, SA will score it higher than 
if Razor gives it a 10-50% chance.  The main drawback to Razor is that it 
presently has the lowest hit rate of the three, although this should change 
soon since the next version of the client will add one of the more 
effective hashes being used by the SpamNet client (Razor's commercial 
sibling).  One trick I've found: I usually have to run "make install" 
twice, or it doesn't set up all the links in /usr/(local/)bin.

Pyzor hits more spam than Razor, but has two drawbacks: first, it runs in 
Python, and firing up a Python instance for each check is slower than just 
calling a Perl module in an already running Perl.  Second, the client 
doesn't do much in the way of error recovery when it encounters a message 
it doesn't recognize.  This isn't much of a problem when called from SA - 
it just counts as if Pyzor didn't find it - but can be frustrating when you 
try to report a mailbox full of confirmed spam and it dies because the 
third message claims to use the "plain" content transfer encoding.  Be sure 
to check the Readme's section on file permissions.  I've actually seen the 
pyzor client get installed non-executable.

DCC has the highest hit rate, but that's partly because its stated goal is 
not to identify spam, but to identify bulk mail.  By definition that 
includes wanted newsletters, mailing lists, etc, although few people 
actually report mail according to that standard.  Because of this, I've 
lowered the SA score for DCC_CHECK from 2.9 to 1.  I remember having a bit 
more trouble getting it running than either Razor or Pyzor, but it's been 
long enough that I don't remember exactly what I had to do.
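
As a rough illustration of the score change described above, a one-line 
override in the SpamAssassin configuration is enough.  Which file holds it 
is an assumption here: local.cf on a stock SA install, while MIMEDefang 
setups often read sa-mimedefang.cf instead, depending on how it was built.

```
# Goes in the SA config file your setup reads (file name is an assumption).
# DCC flags bulk mail in general, not only spam, so give the rule
# less weight than its 2.9 default.
score DCC_CHECK 1.0
```

The value 1 is the one mentioned above; it is a judgment call, and a site 
whose users receive few legitimate bulk mailings might leave it higher.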

Several people posted some comparisons a few months ago.  I think this was 
on the SA list.  There is certainly overlap among the three databases 
(about 60% of spam we see that trips one of them trips at least two), but 
there's enough difference that it could be worth running two or even all three.

In any case, I would recommend using the razor_timeout, pyzor_timeout, and 
dcc_timeout options in your SA config so that network slowdowns and server 
outages don't add too much time to your mail processing.
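
A minimal sketch of those three options in an SA config file follows.  The 
option names are the ones given above; the values are purely illustrative 
assumptions, not tested recommendations, and should be tuned to how long 
your mail processing can afford to wait on a network check.

```
# Seconds to wait for each network check before giving up on it.
# Values are illustrative assumptions; adjust for your own network.
razor_timeout 5
pyzor_timeout 5
dcc_timeout   5
```

When a check times out, the message is simply scored without that test, 
so a shorter timeout trades some hit rate for more predictable delivery 
time.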

Kelson Vibber
SpeedGate Communications <www.speed.net> 



