[Mimedefang] Image blocking idea
David F. Skoll
dfs at roaringpenguin.com
Fri Apr 21 20:30:04 EDT 2006
Martin Blapp wrote:
> I already log possible text (I count alphanumeric chars in the OCR output)
I think it would be interesting to add a new text/plain part to the e-mail
consisting of the OCR'd text, and feed that into Bayes. Even if OCR gets
some words wrong, I bet the same mis-spelled tokens would quickly rise
to the top of the "spammy" token list.
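
A minimal sketch of the idea above, under stated assumptions: it is written in
Python with the standard email package purely for illustration (MIMEDefang
filters themselves are Perl, and this is not MIMEDefang's API). The helper name
append_ocr_part and the filename "ocr-output.txt" are hypothetical, and the OCR
step is assumed to have already produced a string. The scanner would then
tokenize the extra part like any other text in the message.

```python
from email.message import EmailMessage


def append_ocr_part(msg: EmailMessage, ocr_text: str) -> EmailMessage:
    """Attach OCR output as an extra text/plain part (hypothetical helper)."""
    text = ocr_text.strip()
    if not text:
        return msg  # nothing recognised, leave the message untouched
    # add_attachment() converts the message to multipart/mixed if needed,
    # so the original body and image parts are preserved.
    msg.add_attachment(text, subtype="plain", disposition="inline",
                       filename="ocr-output.txt")
    return msg


if __name__ == "__main__":
    msg = EmailMessage()
    msg["Subject"] = "example"
    msg.set_content("see attached image")
    # Stand-in bytes for an image attachment; not a valid GIF.
    msg.add_attachment(b"GIF89a", maintype="image", subtype="gif",
                       filename="stock.gif")
    append_ocr_part(msg, "hot stock tip")
    for part in msg.iter_parts():
        print(part.get_content_type())
```

Whether the added part should stay in the delivered message or be used only
for scanning is a design choice this sketch does not settle.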
We did some tests along these lines and, as a side benefit, discovered
some SARE stock-scam tests firing on the OCR output.