[Mimedefang] Training SA when mail is not stored locally?

Wed Feb 4 16:36:26 EST 2004

Hello Michael,
>  
> Is it OK to turn on bayes_auto_learn without first training 
> SA manually?
> My thinking is that the server might learn the wrong thing out of the
> gate which would be bad since it is difficult to see what is being
> tagged. Is that a misguided notion?   
> 
> If SA learns what is spam/ham based on the spam/nonspam threshold from
> messages it's read on its own (no intervention from me), don't we
> encounter a chicken and egg problem?
> 
> How do you keep the server from learning the wrong thing?  
> 
You can safely turn it on.  Bayes won't start scoring messages until it
has learned a minimum number of spam and ham messages, counted separately.
Until then, SA decides what to learn by scoring each message with the
regular SA rules and comparing that score against the auto-learn
thresholds you set.  Once both the spam and the ham corpus reach the
minimum size, Bayes starts working.

For example, you set bayes_auto_learn_threshold_spam to 12 and
bayes_auto_learn_threshold_nonspam to -2.

A message comes in and SA gives it a 15.  It then goes into the spam corpus
automatically because of its score.  Likewise, a message scoring below -2
goes into the ham corpus, and anything in between is not learned at all.
This continues until you get a healthy corpus (200 of each, I think).  Then
SA starts evaluating each message using Bayes as well, and the Bayes result
adds to or subtracts from the overall score.
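
If it helps to see the flow spelled out, here is a rough Python sketch of
the idea.  It is not SpamAssassin's actual code: the class and function
names are made up for illustration, the thresholds are the ones from the
example above, the 200-message minimum is the figure I quoted from memory,
and how a score exactly equal to a threshold is treated is an assumption.

```python
from dataclasses import dataclass

# Values taken from the example in this message; adjust to your own config.
SPAM_THRESHOLD = 12.0      # bayes_auto_learn_threshold_spam
NONSPAM_THRESHOLD = -2.0   # bayes_auto_learn_threshold_nonspam
MIN_CORPUS = 200           # assumed minimum per corpus before Bayes is used


@dataclass
class BayesState:
    """Hypothetical stand-in for the Bayes database message counters."""
    nspam: int = 0
    nham: int = 0

    def auto_learn(self, rule_score: float) -> str:
        """Decide what to learn from the score given by the regular rules."""
        if rule_score >= SPAM_THRESHOLD:
            self.nspam += 1
            return "spam"
        if rule_score < NONSPAM_THRESHOLD:
            self.nham += 1
            return "ham"
        # Scores between the two thresholds are too uncertain to learn from.
        return "skipped"

    def bayes_active(self) -> bool:
        """Bayes only contributes once both corpora are large enough."""
        return self.nspam >= MIN_CORPUS and self.nham >= MIN_CORPUS
```

The point of the gap between the two thresholds is that only messages the
regular rules are very sure about get learned, which is what keeps the
server from learning the wrong thing out of the gate.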

OK?

Cheers,

Stefano