[Mimedefang] blocked file types in text file

John Rudd john at rudd.cc
Mon Sep 19 15:46:54 EDT 2005



Here's my code.  It's based on the four-field file that MailScanner uses 
(so that I wouldn't have to redo all of our extensions).  That file is 
in the form:

allow|deny(tab)regex(tab)log-text(tab)user-text

(user-text is a user-friendly explanation for why this attachment is 
blocked; log-text is a shorter (and thus log-friendly) version of the 
same)

Here's my code:

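# Start with an empty list ("= undef" would store one undef element).
@myfilenames = ();

open (FILENAMERULES, "</etc/mail/filename.rules.conf")
    or warn "Cannot open /etc/mail/filename.rules.conf: $!";
while (defined ($line = <FILENAMERULES>)) {
    chomp $line;
    $line =~ s/\#.*//;     # strip comments
    $line =~ s/^\s+//;     # strip leading whitespace
    $line =~ s/\s+$//;     # strip trailing whitespace
    if ($line ne "") {
       push(@myfilenames, $line);
       }
    }
close (FILENAMERULES);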

# This procedure returns true for entities with bad filenames.
sub filter_bad_filename ($) {
    my ($entity) = @_;
    my ($bad_exts, $re, $perm, $regex, $logtxt, $usertxt);

    foreach $re (@myfilenames) {
       # The four fields are separated by tabs.
       ($perm, $regex, $logtxt, $usertxt) = split(/\t+/, $re);
       if (re_match($entity, $regex)) {
          if ($perm eq "allow") {
             return (0);
             }
          if ($perm eq "deny") {
             return (1);
             }
          } # if re_match
       } # foreach

    return 0;
    }

This is a simple one that doesn't actually use the user-text or 
log-text.  I also have a version I'm testing that instead of just 
returning 1 or 0, returns (perm,logtxt,usertxt), so that the usertxt 
can be incorporated into the response, and logtxt can be incorporated 
into the logs.
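
A rough sketch of what that second version might look like follows. To keep it self-contained it matches against a plain filename string rather than a MIME entity, so match_filename_rule and the two sample rules are made up for illustration; inside a real filter you would call re_match($entity, $regex) instead of the bare pattern match.

```perl
use strict;
use warnings;

# Hypothetical sample rules in the four-field, tab-separated format.
my @rules = (
    join("\t", 'allow', '\.jpg$', 'jpg ok',      'JPEG images are fine.'),
    join("\t", 'deny',  '\.exe$', 'exe blocked', 'Executable attachments are not accepted.'),
);

# Returns (perm, logtxt, usertxt) for the first rule whose regex matches
# the filename, or an empty list when no rule matches.
sub match_filename_rule {
    my ($filename, @rules) = @_;
    foreach my $rule (@rules) {
        my ($perm, $regex, $logtxt, $usertxt) = split(/\t+/, $rule);
        next unless defined $regex && $filename =~ /$regex/i;
        return ($perm, $logtxt, $usertxt);
    }
    return;
}

my ($perm, $logtxt, $usertxt) = match_filename_rule('setup.exe', @rules);
if (defined $perm && $perm eq 'deny') {
    print "log: $logtxt\n";     # short text for the logs
    print "user: $usertxt\n";   # friendlier text for the bounce or warning
}
```

The caller then decides what to do with a "deny" result, and an empty return means no rule had an opinion about the filename.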

I also tried to smash them together into 2 strings, like this:

$allowregex = '(\.gif$)|(\.jpg$)(and more expressions)';
$denyregex = '(\.exe$)|(\.com$)|(\.ma[dgf]$)(and more expressions)';

But I couldn't figure out a way to get perl to tell me _which_ part of 
the regular expression matched.  You can get it to tell you which part 
of the _target_ string matched (command.com matched against ".com", for 
example) if you use $&, which is REALLY slow, but it won't tell you 
that the part of the regex that matched was (\.com$), which is what 
you'd need as an index into a hash that contains logtxt and usertxt.  
So I would still have to iterate through the rules to find out which 
one was tripped before I could return the appropriate logtxt and 
usertxt, and in the end that wouldn't be any faster than what's above.
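
One possible way around this, offered as a sketch rather than something I have tested in a filter: Perl's @- array holds the start offset of every capture group after a successful match, and the entry is undef for groups that did not take part. If each rule sits in its own top-level group, the first defined entry tells you which rule fired. The rule table, the texts in it, and which_rule are made up for illustration, and the sketch assumes that no rule contains capturing parentheses of its own (otherwise the group numbers shift).

```perl
use strict;
use warnings;

# Hypothetical rule table: [regex source, logtxt, usertxt].
# Capture group N of the combined regex corresponds to rule N-1.
my @deny_rules = (
    [ '\.exe$',     'exe file',         'Files ending in .exe are blocked.' ],
    [ '\.com$',     'com file',         'Files ending in .com are blocked.' ],
    [ '\.ma[dgf]$', 'mad/mag/maf file', 'Files ending in .mad, .mag or .maf are blocked.' ],
);

# Wrap each rule in its own capture group and join them with "|".
my $combined = join('|', map { "($_->[0])" } @deny_rules);
my $deny_re  = qr/$combined/;

# Returns the rule (array ref) whose group matched, or undef if none did.
sub which_rule {
    my ($filename) = @_;
    return unless $filename =~ $deny_re;
    # $#- is the index of the last group that matched.
    for my $i (1 .. $#-) {
        return $deny_rules[$i - 1] if defined $-[$i];
    }
    return;
}

my $rule = which_rule('command.com');
print "$rule->[1]\n" if $rule;   # prints "com file"
```

There is still a loop, but it only walks group indices after a single regex match instead of running one match per rule.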



