[Mimedefang] MIMEDefang/Bogofilter

David F. Skoll dfs at roaringpenguin.com
Tue Feb 4 23:13:00 EST 2003


On Tue, 4 Feb 2003, Michael Sofka wrote:

> 	Simple linear weighting methods out-perform ``fancy'' methods
> 	such as Genetic Algorithms, Neural Nets, Kohonen nets, etc.
> 	(An MIT undergrad found he could crunch better SA weights
> 	much faster using a LINPACK routine in place of SA's GA.)

Really?  I'd love to see a reference for that.  How would you define
an objective function for the optimization problem?  Or maybe I'm
missing something?
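
For illustration only, here is one way such an objective function could
be written down. This is a guess at the general idea, not the method the
quoted message refers to: treat each message as a row of 0/1 rule hits,
label it 1 for spam and 0 for ham, and minimise the sum of squared
differences between the weighted score and the label. The names
squared_error and fit_weights and the tiny data set are hypothetical,
and plain gradient descent stands in for a real linear-algebra routine.

```python
# Hypothetical sketch: fit linear rule weights by least squares.
# Each row of `hits` records which rules fired (1) or not (0) for one
# message; `labels` is 1 for spam and 0 for ham. The data is made up.

def squared_error(weights, hits, labels):
    """Objective: sum over messages of (weighted score - label)^2."""
    total = 0.0
    for row, label in zip(hits, labels):
        score = sum(w * x for w, x in zip(weights, row))
        total += (score - label) ** 2
    return total

def fit_weights(hits, labels, rate=0.05, steps=2000):
    """Minimise squared_error with plain gradient descent."""
    weights = [0.0] * len(hits[0])
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for row, label in zip(hits, labels):
            err = sum(w * x for w, x in zip(weights, row)) - label
            for j, x in enumerate(row):
                grad[j] += 2.0 * err * x
        weights = [w - rate * g for w, g in zip(weights, grad)]
    return weights

hits = [
    [1, 1, 0],  # spam
    [1, 0, 0],  # spam
    [0, 1, 1],  # ham
    [0, 0, 1],  # ham
]
labels = [1, 1, 0, 0]
weights = fit_weights(hits, labels)
```

On this toy data only the first rule separates spam from ham, so the
fitted weights put nearly all the weight on it. Whether squared error is
the right thing to minimise for a spam filter, where false positives
cost more than false negatives, is a separate question.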

> 	No corpus of spam is large enough for training/tuning
> 	detectors.  There is always another word, phrase, or way
> 	of conveying an idea which will evade a detector.

That's true.  I believe a combination of SpamAssassin rules (so that
you immediately have something that works pretty well out of the box)
plus some statistical "learning" is about the best pure filtering you
can do.  SA 2.50 includes some statistical methods, I believe.

Also, there are some characteristics of spam that have nothing to do
with message content, but rather transmission methods, as I mentioned
in http://lists.roaringpenguin.com/pipermail/mimedefang/2003-January/004081.html

--
David.


