[Mimedefang] SpamAssassin 2.43

listuser at neo.pittstate.edu listuser at neo.pittstate.edu
Sun Nov 24 11:34:01 EST 2002


On Sat, 23 Nov 2002 jmiller at purifieddata.net wrote:

> I'm not sure what version you're comparing it with, but the GA scoring run
> over the spam/non-spam corpus changed significantly from the 2.3x's to
> 2.4x's, and significantly affected the scores output.
> 
> They now distribute the file Mail-SpamAssassin-2.xx/rules/STATISTICS.txt
> with each release. It has stats info on how much spam/non-spam from
> their corpus of mail fell into various score categories.
> 
> I asked the spamassassin people about this, and they said that the scores
> can be expected to change with every release. There's absolutely no
> guarantee that a "10" means the same thing from release to release.

This is a problem I brought up on the sa-talk list (see: consistency
between releases).  To me this is SA's worst problem.  I first noticed
it when I went from 2.40 to 2.42.  A message that scored 13.9 in 2.40
dropped to 8.4 in 2.42.  I tried it in 2.41 and it scored 15.1.  I never
tried it in 2.43, but I imagine it followed the downhill trend.  A 5 in
2.43 isn't the same as a 5 in any other release.  If SA is expected to be
a useful tool from the users' standpoint, the SA folks need to pick one
number, like 5, to identify as 95% likely to be spam and then make their
scores match that value every time.  I can't ask my users to change their
MUA filters or LDA filters every time I update SA.  They should still run
the GA each time before a release, but it should adjust the scores with 5
(or whatever number) as its target.  Consider a user who sets up his
filters at 5 to catch most spam with the 2.41 release, then has to lower
the threshold to 4 for 2.43.  Perhaps 2.50 adjusts the scores up so that 6
is the average spam score.  Now his FP rate at 4 is unreasonably high.
I'd end up losing that SA user because I can't offer a consistent
service.  I hope the SA guys address this soon.
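
A rough sketch of the calibration described above, and not anything
SpamAssassin itself is known to do.  It assumes you have a hand-labelled
corpus of (raw score, is-spam) pairs for a given release.  It finds the
lowest raw score at which at least 95% of the flagged messages are spam,
then rescales linearly so that score lands on 5.  The function names, the
linear rescaling and the toy corpus are all hypothetical.

```python
def calibration_cutoff(scored, target_precision=0.95):
    """Return the lowest raw score c such that at least target_precision
    of the messages scoring >= c are spam, or None if no cutoff qualifies.

    `scored` is a list of (raw_score, is_spam) pairs from a labelled corpus.
    """
    best = None
    # Try every distinct score as a candidate cutoff, highest first.
    for cutoff in sorted({score for score, _ in scored}, reverse=True):
        flagged = [is_spam for score, is_spam in scored if score >= cutoff]
        if sum(flagged) / len(flagged) >= target_precision:
            best = cutoff
    return best


def normalize(raw_score, cutoff, target=5.0):
    """Linearly rescale a raw score so that `cutoff` maps onto `target`.
    Assumes cutoff > 0."""
    return raw_score * target / cutoff


# Toy corpus, invented purely for illustration (not real SA output).
toy_corpus = [
    (12.0, True), (9.0, True), (7.0, True),
    (4.0, True), (3.0, False), (1.0, False),
]

cutoff = calibration_cutoff(toy_corpus)
print(cutoff, normalize(8.0, cutoff))
```

Run once per release against the same corpus, this would let a user's
filter stay at 5 while only the per-release cutoff changes.  A linear
rescale is the simplest choice, and it only pins down the one point at 5.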

I like your idea of adjusting the users' scores.  I might try that.  It
would be nice if I didn't have to, though.

Justin




