[Mimedefang] blocked file types in text file

John Rudd john at rudd.cc
Mon Sep 19 17:11:23 EDT 2005


On Sep 19, 2005, at 13:03, <Matthew.van.Eerde at hbinc.com> wrote:

> John Rudd wrote:
>> $allowregex = '(\.gif$)|(\.jpg$)|(and more expressions)';
>> $denyregex = '(\.exe$)|(\.com$)|(\.ma[dgf]$)|(and more expressions)';
>>
>> But I couldn't figure out a way to get perl to tell me _which_ part of
>> the regular expression was matched. You can get it to tell you which
>> part of the _target_ string was matched (command.com matched against
>> ".com", say) if you use $&, which is REALLY slow, but it won't tell
>> you that the part of the regex that matched was (\.com$), so you can't
>> use that as an index into a hash that contains logtxt and usertxt. I
>> would still have to iterate through to find out which rule was
>> tripped, so that I can then return the appropriate logtxt and usertxt.
>> In the end, that wouldn't be any faster than what's above.
>
> Do this - note only one set of parentheses per variable
> $allowregex = '(\.gif$|\.jpg$|and more expressions)';
> $denyregex = '(\.exe$|\.com$|\.ma[dgf]$|and more expressions)';

You don't need to do that.  What you're about to say works even if you 
use individual sets of parens.

> Then the part that matched is in $1
>
> If the file was something.mag and matched \.ma[dgf]$, $1 eq ".mag"

Right, that's the same as what $& is doing: it's telling you what part 
of the target string ("file.mag") was matched.  It is not telling you 
which part of the regular expression was matched.
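
To make the distinction concrete, here is a minimal sketch (the variable
names are just for illustration).  With one group around the whole
alternation the matched text lands in $1; with one group per alternative
it lands in whichever group matched, and $+ picks it up.  Either way
you get a piece of the filename, not a piece of the pattern:

```perl
use strict;
use warnings;

my $name = 'file.mag';

# Patterns are kept in single-quoted strings, as in the thread.
my $single_re   = '(\.exe$|\.com$|\.ma[dgf]$)';
my $separate_re = '(\.exe$)|(\.com$)|(\.ma[dgf]$)';

# One group around everything: the only capture is the matched text.
my ($single) = $name =~ /$single_re/;

# One group per alternative: $1 and $2 are undef for this filename,
# but $+ holds the text of the highest-numbered group that matched.
my $separate;
if ($name =~ /$separate_re/) {
    $separate = $+;
}

print "single group:    $single\n";
print "separate groups: $separate\n";
```

Both lines print ".mag", which is text from the target string and so is
no use as a key into a hash indexed by "\.ma[dgf]$".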

And since what I have to start with is the regular expressions, what I 
have to use as a hash key is "\.ma[dgf]$".  So, after performing:

"file.mag" =~ '(\.exe$|\.com$|\.ma[dgf]$)';

there either has to be a variable which says "\.ma[dgf]$", or I have to 
iterate through my indexes to see which key matches the filename.  If 
I'm going to iterate through the keys anyway, might as well do it the 
way I'm already doing it.
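
One possible way around this, offered as an untested-in-production
sketch rather than anything settled in this thread: if every rule gets
its own set of parens, perl's @- array records which groups took part
in the match, and $#- is the number of the last one that did.  That
number can be mapped back to the rule.  The @deny_rules list and the
matched_rule() helper are hypothetical names, and the trick assumes the
individual rules contain no capturing parens of their own:

```perl
use strict;
use warnings;

# Hypothetical rule list; the order defines the group numbers.
my @deny_rules = ('\.exe$', '\.com$', '\.ma[dgf]$');

# Wrap each rule in its own group: group N belongs to $deny_rules[N-1].
my $denyregex = join '|', map { "($_)" } @deny_rules;

# Return the rule (the pattern text) that matched $name, or nothing.
sub matched_rule {
    my ($name) = @_;
    return unless $name =~ /$denyregex/;

    # $#- is the number of the last group that took part in the match.
    my $group = $#-;
    return $deny_rules[$group - 1];
}

print matched_rule('file.mag'), "\n";
print matched_rule('command.com'), "\n";
```

The value returned is the pattern text itself, so it could be used
directly as the key into the logtxt/usertxt hash.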

Though, I suppose I could use the 2 combined regexes for:

1) if matches neither, then quickly return ("allow", "-", "-")
    [since there's no error generated for an allow, the allow
    entries tend to all be "-" for logtxt and usertxt]

2) if it matches $allowregex, then return ("allow", "-", "-")

3) if it matches $denyregex, then iterate through the individual keys.

That's probably going to be faster in general.  I'll have to think about
that.
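
A rough sketch of those three steps, under some assumptions: the rule
tables, the log/user texts and the check_filename() name are all made
up for illustration, and allow is checked before deny because that is
the order the steps are listed in.  Steps 1 and 2 collapse into two
quick tests against the combined regexes; only a name that trips
$denyregex pays for the loop over the individual keys:

```perl
use strict;
use warnings;

# Hypothetical tables; the texts are placeholders.
my @allow_rules = ('\.gif$', '\.jpg$');
my %deny_rules  = (
    '\.exe$'     => ['blocked .exe attachment', 'Executable attachments are not accepted.'],
    '\.com$'     => ['blocked .com attachment', 'Executable attachments are not accepted.'],
    '\.ma[dgf]$' => ['blocked Access attachment', 'Access files are not accepted.'],
);

# Combined regexes, used only for the quick yes/no tests.
my $allowregex = join '|', @allow_rules;
my $denyregex  = join '|', keys %deny_rules;

# Returns (action, logtxt, usertxt) for a filename.
sub check_filename {
    my ($name) = @_;

    # Step 2: explicitly allowed.
    return ('allow', '-', '-') if $name =~ /$allowregex/;

    # Step 1: matches neither combined regex.
    return ('allow', '-', '-') unless $name =~ /$denyregex/;

    # Step 3: some deny rule tripped, so find out which one.
    for my $rule (sort keys %deny_rules) {
        return ('deny', @{ $deny_rules{$rule} }) if $name =~ /$rule/;
    }
    return ('allow', '-', '-');    # not reached if the tables agree
}

print join(', ', check_filename('command.com')), "\n";
print join(', ', check_filename('photo.gif')), "\n";
```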



