[Mimedefang] Using DCC in SpamAssassin which is called by MimeDefang
Kelson Vibber
kelson at speed.net
Thu Jun 17 18:56:36 EDT 2004
At 02:34 PM 6/16/2004, wrolf.courtney at donovandata.com wrote:
>I asked a similar question recently: who has had what experience with
>DCC/Razor/Pyzor, presumably via MIMEDefang and SpamAssassin?
All three will work with MD/SA.
Razor is probably the simplest, since SA is already running in Perl and can
call the Razor Perl modules directly. It also has the advantage that SA
has different rules to handle various Razor results - if Razor gives a
message a 50-100% probability of being spam, SA will score it higher than
if Razor gives it a 10-50% chance. The main drawback to Razor is that it
presently has the lowest hit rate of the three, although this should change
soon since the next version of the client will add one of the more
effective hashes being used by the SpamNet client (Razor's commercial
sibling). One trick I've found: I usually have to run "make install"
twice, or it doesn't set up all the links in /usr/(local/)bin.
Pyzor hits more spam than Razor, but has two drawbacks: first, it runs in
Python, and firing up a Python instance for each hit is slower than just
calling a Perl module in an already running Perl. Second, the client
doesn't do much in the way of error recovery when it encounters a message
it doesn't recognize. This isn't much of a problem when called from SA -
it just counts as if Pyzor didn't find it - but can be frustrating when you
try to report a mailbox full of confirmed spam and it dies because the
third message claims to use the "plain" content transfer encoding. Be sure
to check the Readme's section on file permissions. I've actually seen the
pyzor client get installed non-executable.
DCC has the highest hit rate, but that's partly because its stated goal is
not to identify spam, but to identify bulk mail. By definition that
includes wanted newsletters, mailing lists, etc., although few people
actually report mail according to that standard. Because of this, I've
lowered the SA score for DCC_CHECK from 2.9 to 1. I remember having a bit
more trouble getting it running than either Razor or Pyzor, but it's been
long enough that I don't remember exactly what I had to do.
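
For illustration, the DCC score change described above might look roughly like this in the SpamAssassin configuration. This is only a sketch: which file holds it is an assumption that depends on the install (MIMEDefang setups often read a dedicated file such as sa-mimedefang.cf rather than local.cf), so check which one your filter actually loads.

```
# Lower the weight of a DCC hit, since DCC flags bulk mail in general,
# not only spam. 1.0 is the value mentioned in the post above.
score DCC_CHECK 1.0
```

After changing it, running "spamassassin --lint" against the same config is a cheap way to catch a typo in the rule name.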
Several people posted some comparisons a few months ago. I think this was
on the SA list. There is certainly overlap among the three databases
(about 60% of spam we see that trips one of them trips at least two), but
there's enough difference that it could be worth running two or even all three.
In any case, I would recommend using the razor_timeout, pyzor_timeout, and
dcc_timeout options in your SA config so that network slowdowns and server
outages don't add too much time to your mail processing.
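
A minimal sketch of those timeout settings follows. The option names are the ones given above; the values are in seconds, and the 5-second figure is purely an illustrative assumption, not a tested recommendation. Tune it to how long you are willing to let a message wait on each network check.

```
# Give up on each network check after this many seconds so a slow or
# unreachable server does not stall mail processing.
# The value 5 is an example only; pick what suits your site.
razor_timeout 5
pyzor_timeout 5
dcc_timeout   5
```

A check that times out simply counts as no hit, so a shorter timeout trades a little detection for more predictable processing time.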
Kelson Vibber
SpeedGate Communications <www.speed.net>