[Mimedefang] HTML Exploits

Fri May 4 21:48:39 EDT 2007

Daniel Aquino wrote:

> unicode or ascii... the process of reading them should be abstracted
> so that the higher level code has one set of data to read... if a html
> browser can read the js why cant we ?

The question becomes: Do you want to implement a full-scale Web
browser on your scanning machine?  Do you want to spend the CPU
cycles?  And more to the point, web browsers are complex beasts, so
any server-based reimplementation is very likely to have its own
security flaws.  (Server-side implementations are actually much
harder.  If a client executes an infinite loop in JavaScript, it's
not that big a deal: only that one user is affected.  But if a
server hits such a loop while trying to render a page, mail
scanning stalls for everyone behind it.)

Honestly, if you're worried about HTML, I recommend filtering all
text/html parts through "lynx -dump" and changing the MIME type to
text/plain.

If you can put up with the deafening roars of your outraged users,
it's a great solution. :-)
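
For illustration only, here is a rough sketch of the "lynx -dump" idea.
A real MIMEDefang filter is written in Perl; this uses Python purely to
show the shape of it.  The function names, the 10-second timeout and
the exact lynx flags (-force_html, -stdin) are assumptions for this
sketch, not something MIMEDefang provides.  The timeout is there
because of the runaway-renderer concern above, even though lynx does
not execute JavaScript.

```python
import subprocess
from email import message_from_bytes, policy

# Assumed lynx invocation: read HTML from stdin, print the rendered text.
LYNX_CMD = ["lynx", "-dump", "-force_html", "-stdin"]


def html_to_text(html, cmd=LYNX_CMD, timeout=10):
    """Pipe HTML through an external renderer and return its text output.

    Raises subprocess.TimeoutExpired if the renderer runs too long and
    subprocess.CalledProcessError if it exits with a non-zero status.
    """
    result = subprocess.run(
        cmd,
        input=html.encode("utf-8"),
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def flatten_html_parts(raw_message, cmd=LYNX_CMD):
    """Replace every text/html part of a message with a text/plain rendering."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            text = html_to_text(part.get_content(), cmd=cmd)
            # set_content() clears the old body and Content-* headers,
            # so the part comes out as text/plain.
            part.set_content(text)
    return msg.as_bytes()
```

A filter would call flatten_html_parts() on the raw message bytes and
pass the result along.  What to do on a timeout or a lynx failure
(reject, quarantine, or deliver untouched) is a policy decision this
sketch leaves open.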

(Btw, with reference to your original question: I do not recommend
Anomy::HTMLCleaner.  It's very buggy.)

Regards,

David.