[Mimedefang] Image validator/OCR SA plugin

David F. Skoll dfs at roaringpenguin.com
Wed Apr 19 20:26:26 EDT 2006


Nels Lindquist wrote:

> As far as spammers obfuscating their images, couldn't that be worked 
> around by tying OCR into the bayesian system?

I think the original idea was to obfuscate the images so people could
read the text, but OCR tools wouldn't be able to.

> Then obfuscation wouldn't matter--whatever munging is done to a
> particular image would produce the same OCR strings, before and
> after bayes training.  You wouldn't need to know particular strings
> to match beforehand in that case.

True, but you'd need to see enough of them to train your Bayes engine.
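
To make the training point concrete, here is a minimal sketch of what
"tying OCR into Bayes" could look like. It is not SpamAssassin's Bayes
implementation. It assumes some OCR step has already turned each image
into a text string, and the class name TokenBayes and its methods are
made up for illustration. Garbled OCR output such as "ch3ap" is just
another token, so nobody has to list strings beforehand, but the
classifier is neutral until it has seen enough labelled examples.

```python
import math
import re
from collections import Counter


class TokenBayes:
    """Tiny two-class naive Bayes over tokens (hypothetical, illustrative only)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # OCR output is treated as opaque tokens; munged words are kept as-is.
        return re.findall(r"[a-z0-9$]+", text.lower())

    def train(self, ocr_text, label):
        self.counts[label].update(self.tokenize(ocr_text))
        self.messages[label] += 1

    def spam_probability(self, ocr_text):
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        total_messages = sum(self.messages.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Laplace-smoothed class prior.
            log_p = math.log((self.messages[label] + 1) / (total_messages + 2))
            total_tokens = sum(self.counts[label].values())
            for token in self.tokenize(ocr_text):
                # Laplace-smoothed token likelihood; unseen tokens get a small value.
                log_p += math.log(
                    (self.counts[label][token] + 1) / (total_tokens + len(vocab) + 1)
                )
            log_score[label] = log_p
        # Normalise the two log scores into a probability of spam.
        top = max(log_score.values())
        spam = math.exp(log_score["spam"] - top)
        ham = math.exp(log_score["ham"] - top)
        return spam / (spam + ham)
```

An untrained instance returns 0.5 for any input. Only after several
calls to train() with the same munged strings does the score move,
which is the "you'd need to see enough of them" problem.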

> That would force image spammers to produce a unique obfuscated
> graphic for every single message, which seems like an expensive
> proposition.

Sadly, serious spammers have virtually unlimited computing resources.
There are armies of thousands of zombie machines out there waiting to
do their masters' bidding...

Adding random noise that fools OCR tools but leaves the images legible
for humans probably isn't that computationally expensive.
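
As a rough illustration of the cost argument only, the sketch below
flips a small random fraction of pixels in a bitmap. The function
names add_noise and digest, the list-of-rows bitmap layout and the 2%
flip rate are all assumptions made for the example. It does not show
that such noise actually defeats any particular OCR tool. It only
shows that a distinct variant per message costs one pass of work
proportional to the image size.

```python
import hashlib
import random


def add_noise(pixels, flip_fraction, seed):
    """Return a copy of a 0/1 bitmap with a random fraction of pixels flipped."""
    rng = random.Random(seed)
    noisy = [row[:] for row in pixels]  # leave the original untouched
    height, width = len(noisy), len(noisy[0])
    for _ in range(int(height * width * flip_fraction)):
        y, x = rng.randrange(height), rng.randrange(width)
        noisy[y][x] ^= 1  # flip one pixel
    return noisy


def digest(pixels):
    """SHA-256 of the bitmap contents, to compare variants byte for byte."""
    return hashlib.sha256(bytes(v for row in pixels for v in row)).hexdigest()
```

Calling add_noise() with a different seed per message is meant to give
a different digest each time while changing only a few pixels, so
exact-match checksums of a known spam image would not catch the copies.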

The only way to defeat image spam would be for Microsoft to modify
Outlook not to display HTML or images, and for Thunderbird et al. to
follow suit.  Anyone care to bet on the odds of that happening? :-(

Regards,

David.


