[Mimedefang] spamassassin setup. (slightly OT)

Fri Aug 1 14:47:01 EDT 2003

Sydney Weidman wrote:
> I remember reading in the SA docs that the learning was ineffective and
> could possibly produce bad results if it was done on small numbers of
> mail messages. I also seem to remember that (by default?) running
> sa-learn automatically unlearns the previously collected data.

Not sure which doc you were reading, but I think you've misinterpreted
something.

The SA Bayes setup needs at least 200 messages each of spam and ham
learned before it has any effect on the score.  Below that point, it
doesn't really have enough data/tokens to give any real benefit.
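
If you want to check where you stand against that 200/200 threshold,
here is a rough Python sketch.  It assumes you have saved the output of
"sa-learn --dump magic" and that the nspam/nham counters appear in the
third column of the "non-token data" lines; that layout is my
assumption, so check it against your own output.  The sample text and
the function names (parse_counts, bayes_is_active) are made up for
illustration and are not part of SA.

```python
import re

# Threshold from the paragraph above: 200 spam and 200 ham.
MIN_TRAINED = 200

def parse_counts(dump_text):
    """Pull the nspam / nham counters out of dump-style text.

    Assumes each counter line looks like:
        0.000    0    <count>    0  non-token data: nspam
    """
    pattern = re.compile(
        r"^\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data: (nspam|nham)\s*$"
    )
    counts = {}
    for line in dump_text.splitlines():
        match = pattern.search(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts

def bayes_is_active(counts, minimum=MIN_TRAINED):
    """True only when both spam and ham counts reach the minimum."""
    return (counts.get("nspam", 0) >= minimum
            and counts.get("nham", 0) >= minimum)

# Hypothetical sample, not real output from any system.
sample = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0        250          0  non-token data: nspam
0.000          0        120          0  non-token data: nham
"""

counts = parse_counts(sample)
print(counts, bayes_is_active(counts))
```

With the sample above there is enough spam but not enough ham, so the
check comes back false: both counts have to reach the minimum.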

Learning one or two messages at a time, or several hundred at once,
makes little difference to the result.  It mainly affects speed and
(potentially) how long the database is tied up and unavailable to SA
for scanning new incoming messages.

Unless you explicitly run sa-learn --forget, messages learned 5 minutes
ago are not "unlearned".  However, there *is* an expiry system that
periodically expires Bayes tokens that haven't been seen in a while, or
which were first seen some set amount of time/number of
messages/[something] ago.  (I'm not certain what the measurement is
there.)  That expiry can be controlled to a limited degree; see the SA
docs.

-kgd
-- 
<erno> hm. I've lost a machine.. literally _lost_. it responds to
ping, it works completely, I just can't figure out where in my
apartment it is.