[Mimedefang] HTML Mail / Active content filter

Florian Lohoff f at zz.de
Mon Apr 10 05:32:46 EDT 2023

I'd like to drop or replace HTML attachments/mails which contain active
components such as inline JavaScript or references to external
JavaScript, for example:

	<script language="javascript"></script>

	<script type="text/javascript" src="http://a.b.c.d"></script>

Basically this means going through all text/html etc. parts. I am unsure
whether I really need to parse the HTML with HTML::Parse or the like to
find these, or whether simple regex matching would be sufficient.
Currently I am dropping such mails via SpamAssassin with custom regex
rules.
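
To make the "real parser instead of regex" idea concrete, here is a
minimal sketch of what I have in mind. It is written in Python purely
for illustration (an actual MIMEDefang filter would be Perl), and the
names ActiveContentFinder, find_active_content, ACTIVE_TAGS and
URL_ATTRS are made up for this example. The lists of tags and
attributes are my own assumption and certainly not complete.

```python
from html.parser import HTMLParser

# Hypothetical, incomplete lists chosen for illustration only.
ACTIVE_TAGS = {"script", "iframe", "object", "embed", "applet"}
URL_ATTRS = {"href", "src", "action", "formaction"}


class ActiveContentFinder(HTMLParser):
    """Collects a short description of each active component seen."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # The parser hands over tag and attribute names in lower case and
        # attribute values with character references already decoded.
        if tag in ACTIVE_TAGS:
            self.findings.append(f"<{tag}> element")
        for name, value in attrs:
            value = value or ""
            if name.startswith("on"):
                # Inline event handlers such as onload= or onerror=.
                self.findings.append(f"{name} handler on <{tag}>")
            elif name in URL_ATTRS:
                # Drop whitespace that may be used to hide the scheme.
                compact = "".join(value.split()).lower()
                if compact.startswith("javascript:"):
                    self.findings.append(f"javascript: URL in {name} on <{tag}>")


def find_active_content(html_text):
    """Return a list of findings; an empty list means nothing was flagged."""
    finder = ActiveContentFinder()
    finder.feed(html_text)
    finder.close()
    return finder.findings


if __name__ == "__main__":
    sample = '<script type="text/javascript" src="http://a.b.c.d"></script>'
    print(find_active_content(sample))
```

The point of the sketch is that a parser sees upper-case tags, event
handler attributes and entity-encoded "javascript:" URLs in one
normalised form, where a regex needs a separate pattern for each
variant. How a given parser copes with deliberately broken markup is a
separate question that the sketch does not answer.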

Does anyone have an example for this, or experience with which Perl HTML
module is the most stable?

And while I am at it: I have also tried my luck with PDFs that contain
active content, parsing them with CAM::PDF (or PDF::API2) in order to
drop them. If anyone has suggestions here, that would also be nice.
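
As a cruder fallback to full PDF parsing, one could scan the raw bytes
for the dictionary keys that usually announce active content. This is
not what CAM::PDF or PDF::API2 do; it is a plain byte scan, sketched in
Python for illustration. The function names and the key list
(SUSPICIOUS_NAMES) are my own assumptions. The scan cannot see keys
inside compressed object streams or encrypted files, and it may flag
bytes that merely look like a name inside a stream.

```python
import re

# Keys commonly associated with scripts or automatic actions in PDFs.
# Hypothetical selection for this sketch, not an exhaustive list.
SUSPICIOUS_NAMES = {b"JS", b"JavaScript", b"OpenAction", b"AA", b"Launch"}

# A PDF name is "/" followed by characters that are neither whitespace
# nor delimiters.
NAME_RE = re.compile(rb"/([^\x00\t\n\x0c\r ()<>\[\]{}/%]+)")

# Inside a name, "#xx" is a hex escape, e.g. /J#61vaScript is /JavaScript.
HEX_ESCAPE_RE = re.compile(rb"#([0-9A-Fa-f]{2})")


def decode_pdf_name(raw):
    """Resolve #xx hex escapes so obfuscated names compare equal."""
    return HEX_ESCAPE_RE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


def suspicious_pdf_names(data):
    """Return the sorted list of suspicious names found in the raw bytes."""
    found = set()
    for match in NAME_RE.finditer(data):
        name = decode_pdf_name(match.group(1))
        if name in SUSPICIOUS_NAMES:
            found.add(name.decode("ascii"))
    return sorted(found)


if __name__ == "__main__":
    sample = b"<< /Type /Catalog /OpenAction << /S /JavaScript /JS (app.alert(1)) >> >>"
    print(suspicious_pdf_names(sample))
```

Something like this could serve as a cheap first pass before handing
the file to a real PDF parser, but it is no replacement for one.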

Florian Lohoff                                                     f at zz.de
  Any sufficiently advanced technology is indistinguishable from magic.
