[Mimedefang] Suggestions on an HTML sanitize program.
Michael D. Sofka
sofkam at rpi.edu
Thu Apr 30 14:14:33 EDT 2009
Kevin A. McGrail wrote:
> Michael,
>
> Thanks to Joseph Brennan, I use this code at the end of sub filter() to
> achieve what I believe you want:
Thank you. I guess I wasn't clear enough that the application is not
Mimedefang. However:
> $badtag = $output =~ s/<(iframe|script|object)\b/<no-$1 /igs;
Would fix 90% of the problem. It still leaves other sources of scripts,
such as the "onload" attribute on an image. It will also miss
scripts hidden by character encodings. In the interests of having
something that is quick and simple, however, I may do exactly the above.
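To make the gap concrete, here is a minimal sketch of what that substitution does and does not catch. The helper name neutralize_tags and the three sample inputs are made up for illustration; only the regex itself comes from the quoted code.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the quoted substitution. Returns the
# rewritten HTML and the number of tags that were renamed.
sub neutralize_tags {
    my ($html) = @_;
    # s/// returns the substitution count, or an empty string if none matched.
    my $count = ($html =~ s/<(iframe|script|object)\b/<no-$1 /igs);
    return ($html, $count || 0);
}

# Illustrative inputs (assumptions, not taken from real mail):
# 1. a plain script tag, which the regex renames
my ($out1, $n1) = neutralize_tags('<p>hi</p><SCRIPT>alert(1)</SCRIPT>');
# 2. a script carried in an event-handler attribute, which it misses
my ($out2, $n2) = neutralize_tags('<img src="x.png" onload="alert(1)">');
# 3. a javascript: URL with an entity-encoded "s", which it also misses
my ($out3, $n3) = neutralize_tags('<a href="java&#115;cript:alert(1)">x</a>');

print "$n1 change(s): $out1\n";
print "$n2 change(s): $out2\n";
print "$n3 change(s): $out3\n";
```

Only the first input is altered. The other two pass through byte for byte, which is the remaining part of the problem described above.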
On the other hand, once I'm ready to add that line of code, I may as
well type, for example:
my $stripped_html = detoxify($html, disallow => [qw(dynamic)]);
and get that extra 9.9%, assuming the overhead of parsing the HTML
isn't too high, and HTML::Detoxify (or HTML::Defang (or
HTML::StripScripts)) really does what is claimed, and is updated as new
exploits are discovered.
In the PHP world, a new way to slip a script past the standard libraries
seems to be discovered each week. But at least the patches keep
coming. (Not to pick on PHP: the problem is in HTML and the difficulty
of actually detecting when a script is present. PHP does patch the
problems as they are discovered.)
Mike
--
Michael D. Sofka sofkam at rpi.edu
C&MT Sr. Systems Programmer, Email, TeX, Epistemology
Rensselaer Polytechnic Institute, Troy, NY. http://www.rpi.edu/~sofkam/