[Mimedefang] Suggestions on an HTML sanitize program.

Kevin A. McGrail kmcgrail at pccc.com
Wed Apr 29 21:37:42 EDT 2009


Michael,

Thanks to Joseph Brennan, I use this code at the end of sub filter() to 
achieve what I believe you want:

    #Disable bad HTML code -- Based on work by Columbia University / Joseph Brennan
    #Modified by KAM 2004-04-16
    #Modified by KAM 2004-04-21 to add slurp of entire message and one regexp check + size check
    #Modified by KAM 2004-06-02 to add a check for defined bodyhandle and path to prevent issues.
    #Modified by KAM 2004-08-09 to add $io to defined variables thanks to Tony Nelson
    if ($type eq "text/html") {
      my($currentline, $output, $badtag, $delimiter_backup, $sizelimit, $bh, $path, $io);

      $badtag = 0;
      $output = "";
      $sizelimit = 1048576; # 1MB: max size (in bytes) of an email you want to check
      $delimiter_backup = $/;

      $bh = $entity->bodyhandle();
      if (defined($bh)) {
        $path = $bh->path();
      }
      if (defined($path)) {
        if (-s $path <= $sizelimit) {
          if ($io = $entity->open("r")) {
            undef $/; # undef the separator to slurp it in.
            $output = $io->getline;
            $io->close;
            $badtag = $output =~ s/<(iframe|script|object)\b/<no-$1 /igs;

            if ($badtag) {
              if ($io = $entity->open("w")) {
                $io->print($output);
                $io->close;
              }
              md_graphdefang_log('modify',"$badtag Iframe/Object/Script tag(s) deactivated by MIMEDefang");
              action_change_header("X-Warning", "$badtag Iframe/Object/Script tag(s) deactivated by MIMEDefang");
              action_rebuild();
            }
            $/ = $delimiter_backup;
          }
        }
      }
    }
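
To try the substitution outside of MIMEDefang, here is a minimal standalone sketch. The helper name deactivate_tags and the sample HTML are made up for illustration; only the regexp is taken from the filter above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MIMEDefang): applies the same
# substitution as the filter to a string and returns the rewritten
# HTML together with the number of tags that were changed.
sub deactivate_tags {
    my ($html) = @_;
    # s///g returns the number of substitutions made, or an empty
    # string when nothing matched, hence the "|| 0" below.
    my $count = ($html =~ s/<(iframe|script|object)\b/<no-$1 /gi);
    return ($html, $count || 0);
}

# Sample input, invented for this example.
my $sample = '<p>Hi</p><SCRIPT src="x.js"></SCRIPT><iframe src="y"></iframe>';
my ($clean, $n) = deactivate_tags($sample);

print "$n tag(s) deactivated\n";
print "$clean\n";
```

Note that only opening tags are rewritten; closing tags such as </SCRIPT> are left alone because the pattern requires the tag name to follow the "<" directly.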

I also use this rule in SA:

#WE USE MIMEDEFANG TO DISABLE ANY IFRAME, OBJECT OR SCRIPT TAGS IN EMAILS
header          KAM_IFRAME      X-Warning =~ /Iframe\/Object\/Script tag\(s\) deactivated by MIMEDefang/
describe        KAM_IFRAME      Email contained Iframe, Object or Script tags
score           KAM_IFRAME      1.0
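
A quick way to check that the rule's pattern matches the header text the filter writes is to test the two against each other in plain Perl. This is only a sketch: the count of 2 is an arbitrary example value, and it checks the header value only, not the header name.

```perl
use strict;
use warnings;

# Pattern copied from the KAM_IFRAME rule above.
my $rule = qr/Iframe\/Object\/Script tag\(s\) deactivated by MIMEDefang/;

# Header value as the filter would build it for two rewritten tags
# (the count is an arbitrary example).
my $badtag = 2;
my $header = "$badtag Iframe/Object/Script tag(s) deactivated by MIMEDefang";

print $header =~ $rule ? "rule would match\n" : "rule would NOT match\n";
```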

Actually, it's good you asked this because I changed it to X-IframeWarning 
in the code.

Regards,
KAM

----- Original Message ----- 
From: "Michael D. Sofka" <sofkam at rpi.edu>
To: <mimedefang at lists.roaringpenguin.com>
Sent: Wednesday, April 29, 2009 2:10 PM
Subject: [Mimedefang] Suggestions on an HTML sanitize program.


> Greetings,
>
> Not directly a Mimedefang issue, but users of mimedefang are likely to 
> have looked into this problem.
>
> I have a need to sanitize HTML primarily to remove and scripts.  Since the 
> last time I looked at Perl modules the field of candidates has expanded. 
> I'm looking for suggestions.  The three I'm considering are:
>
> HTML::Defang
> HTML::Detoxifier
> HTML::StripScripts
>
> I am leaning towards HTML::Defang.  But, HTML::Detoxifier has a simple 
> interface, and does some HTML cleanup as well. All three appear to be good 
> choices.  My primary need is to ensure scripts are removed from the input 
> (to the degree that is possible). The application is already busy, so low 
> memory overhead, and processing speed are important.
>
> I'm interested in any feedback, or suggestions for other Perl modules.
>
> Mike
> -- 
> Michael D. Sofka               sofkam at rpi.edu
> C&MT Sr. Systems Programmer,   Email, TeX, Epistemology
> Rensselaer Polytechnic Institute, Troy, NY.  http://www.rpi.edu/~sofkam/
> 



