[Mimedefang] Training SA when mail is not stored locally?

Joe Arnstein mimedefang at claireandjoe.com
Thu Feb 5 11:44:42 EST 2004


How would learning be affected on a machine that receives ONLY spam?
For example, our secondary server receives a steady flow of garbage all
day and night, and only gets good stuff when the primary one goes down.
If all it ever gets is garbage, how will it know what is legitimate when
it sees it?  Won't its learning curve be skewed such that it knows good
spam and bad spam?  :)

Joe



-----Original Message-----
From: mimedefang-bounces at lists.roaringpenguin.com
[mailto:mimedefang-bounces at lists.roaringpenguin.com] On Behalf Of
Stefano McGhee
Sent: Wednesday, February 04, 2004 4:36 PM
To: mimedefang at lists.roaringpenguin.com
Subject: RE: [Mimedefang] Training SA when mail is not stored locally?

Hello Michael,
>  
> Is it OK to turn on bayes_auto_learn without first training 
> SA manually?
> My thinking is that the server might learn the wrong thing out of the
> gate which would be bad since it is difficult to see what is being
> tagged. Is that a misguided notion?   
> 
> If SA learns what is spam/ham based on the spam/nonspam threshold from
> messages it's read on its own (no intervention from me), don't we
> encounter a chicken and egg problem?
> 
> How do you keep the server from learning the wrong thing?  
> 
You can safely turn it on.  Bayes won't start working until it
reaches the minimum corpus for spam and ham separately.  It figures out
what's spam and ham by evaluating the messages against the regular SA
rules and using the auto-learn thresholds you set.  Once you hit the
minimum for both the spam and ham corpus, it will start working.

For example, you set bayes_auto_learn_threshold_spam to 12 and
bayes_auto_learn_threshold_nonspam to -2.

A message comes in and SA gives it a 15.  It then goes into the spam
corpus automatically because of its score.  A message scoring below -2
goes into the ham corpus the same way.  This continues until you get a
healthy corpus (200 I think).  Then SA starts evaluating messages using
Bayes as well, and the Bayes result adds to or subtracts from the
overall score.
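
To make the flow above concrete, here is a rough toy model in Python.
It is only a sketch and not SpamAssassin code: the ToyBayes class, its
method names and the exact boundary handling (>= for spam, < for ham)
are assumptions made up for illustration.  The thresholds are the 12
and -2 from the example, and the minimum corpus of 200 is the figure
recalled above, so check it against your own setup.

```python
from dataclasses import dataclass

# Values from the example above; MIN_CORPUS is the "200" recalled there
# and is an assumption, not a value read from any real configuration.
SPAM_THRESHOLD = 12.0
NONSPAM_THRESHOLD = -2.0
MIN_CORPUS = 200


@dataclass
class ToyBayes:
    """Hypothetical stand-in for the Bayes database (not SpamAssassin code)."""
    nspam: int = 0
    nham: int = 0

    def auto_learn(self, rule_score: float) -> str:
        """Decide what to learn using only the regular rule score."""
        if rule_score >= SPAM_THRESHOLD:
            self.nspam += 1
            return "spam"
        if rule_score < NONSPAM_THRESHOLD:
            self.nham += 1
            return "ham"
        # Scores between the two thresholds are too uncertain to learn from.
        return "skipped"

    def is_active(self) -> bool:
        """Bayes only contributes once BOTH corpora reach the minimum."""
        return self.nspam >= MIN_CORPUS and self.nham >= MIN_CORPUS


db = ToyBayes()
print(db.auto_learn(15.0))   # spam
print(db.auto_learn(3.0))    # skipped
print(db.auto_learn(-2.5))   # ham

# Feed it nothing but high-scoring spam: the spam corpus fills up,
# but the ham corpus does not, so Bayes stays switched off.
for _ in range(500):
    db.auto_learn(15.0)
print(db.is_active())        # False
```

In this toy model the messages that land between the two thresholds
are never learned at all, which is what keeps borderline mail from
teaching the filter the wrong thing.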

OK?

Cheers,

Stefano

_______________________________________________
Visit http://www.mimedefang.org and http://www.canit.ca
MIMEDefang mailing list
MIMEDefang at lists.roaringpenguin.com
http://lists.roaringpenguin.com/mailman/listinfo/mimedefang


