[Mimedefang] redundancy in mimedefang

Dave Ellenberger dave at nofuture.ch
Mon Oct 6 07:30:12 EDT 2003


On Mon, 6 Oct 2003 12:04:50 +0200, Cor Bosman wrote
> > On my defang machine sa learns about 5'000 spams and more than 10'000 
> > hams a day. I see no need to share the database on busy defang servers. 
> > The suggested bayes database size of max 10MB is reached with about 
> > 5'000-10'000 ham/spams.
> 
> Do you feel the bayes filters add significantly to the SA hitrate? 
> Im running SA as a test on a server with about 100.000 emails a day and
> im already seeing a very high correctness rate, without bayes. 

Yes, it certainly does. The biggest problem is newsletters, which trigger 
many of SA's mailer, HTML and MIME checks and so produce false positives. 
Bayes is great at correcting this. On my server ~90% of the learned ham 
is written in German and ~95% of the spam in English, which also makes 
bayes very effective.
This leads to the following:
- To detect spam, the DNS RBL checks (I have many additional sites in my 
setup) and razor/pyzor do most of the work.
- To detect ham, bayes is the only thing that really works.
Sometimes good mail is received over bad relays. Without bayes (I score 
00 with -10 points, 01 with -8 points) these mails would have produced 
false positives. Spam currently scores 12-65 points (average ~38 or so). 
Only very few score 6-12 (on my system the subject is modified for scores 
of 6-12).
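
For illustration, the bayes scoring mentioned above might look like this in 
a SpamAssassin local.cf. This is only a sketch: it assumes "00" and "01" 
refer to the stock BAYES_00 and BAYES_01 rules, and the point values are 
the ones quoted in this message, not recommended defaults.

```
# Assumed rule names; values taken from the message above
score BAYES_00 -10.0
score BAYES_01 -8.0
```

Negative scores this large let a strong ham verdict from bayes outweigh a 
few RBL or newsletter-style hits, which is the effect described above.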

> You use autolearn i assume?

Yep, I have for about a month now. So far so good.

On another system the incoming mailserver is also the outgoing 
mailserver. So mail from POP-before-SMTP authenticated relays can be used 
to run sa-learn --ham --file, which makes SA learn from outgoing mail, 
and spam checks can be skipped for these relays.
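
A rough shell sketch of that idea follows. The sa-learn flags are the ones 
named above; everything else (the RELAY_LIST file with one authenticated IP 
per line, the SA_LEARN override, the function name) is a hypothetical 
illustration and not part of MIMEDefang or SpamAssassin. How the filter 
gets the relay IP and the message file is left out.

```shell
#!/bin/sh
# Sketch only: RELAY_LIST, SA_LEARN and learn_if_authenticated are
# hypothetical names chosen for this example.

# Assumed format: one POP-before-SMTP authenticated IP address per line.
RELAY_LIST="${RELAY_LIST:-/etc/mail/popauth-relays.txt}"
# Command used for learning; overridable so the logic can be dry-run.
SA_LEARN="${SA_LEARN:-sa-learn}"

# learn_if_authenticated RELAY_IP MESSAGE_FILE
# Returns 0 if the relay is authenticated (message was handed to sa-learn
# as ham, caller may skip spam checks), 1 otherwise (scan as usual).
learn_if_authenticated() {
    relay_ip="$1"
    msg_file="$2"
    # grep -x: whole-line match, -F: fixed string, -q: no output
    if [ -r "$RELAY_LIST" ] && grep -qxF -- "$relay_ip" "$RELAY_LIST"; then
        "$SA_LEARN" --ham --file "$msg_file" \
            || echo "sa-learn failed for $msg_file" >&2
        return 0
    fi
    return 1
}

# Example call (commented out so sourcing this file has no side effects):
# learn_if_authenticated 192.0.2.10 /var/spool/example/msg && echo "skip scan"
```

Setting SA_LEARN=echo gives a dry run that only prints the arguments 
sa-learn would have received.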

-Dave


