[Mimedefang] getting it right about how the filter works...
David F. Skoll
dfs at roaringpenguin.com
Wed Nov 28 09:39:40 EST 2001
On Wed, 28 Nov 2001, Tony Nugent wrote:
> First question: at what point is it "safe" (or unsafe) to access the
> variables declared at the top of mimedefang.pl?
The mimedefang-filter(5) man page has a section starting with the
sentence:
"In addition, the following global variables are available:"
The variables listed in that section are valid in filter_begin, filter,
and filter_end.
Any other global variables may or may not be; I make no guarantees. Read
the source to know for sure. :-)
> I'd like to be able to easily sort them chronologically by name.
> Is there any reason why a sequential timestamp mechanism could not
> be used to generate these names, instead of random numbers?
That could be done, I suppose. I could base it on the time. Please
vote for this change off-list, and if the consensus is that it's a good
idea, I'll make the change.
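For anyone weighing the vote, here is a rough sketch of what a time-based name could look like. It is not what MIMEDefang does today; the "qdir" prefix, the sortable_name() helper and the numeric suffix are all made up for illustration. The point is only that a zero-padded timestamp sorts the same way by name as by time.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: build a directory name from an epoch time plus a
# small counter, so two names created in the same second stay distinct.
# The "qdir" prefix is an assumption, not an existing MIMEDefang convention.
sub sortable_name {
    my ($epoch, $counter) = @_;
    # gmtime keeps the result independent of the local time zone.
    my $stamp = strftime("%Y%m%d-%H%M%S", gmtime($epoch));
    return sprintf("qdir-%s-%03d", $stamp, $counter);
}

my $earlier = sortable_name(0,     1);
my $later   = sortable_name(86400, 1);

# A plain string sort now matches chronological order.
my @sorted = sort ($later, $earlier);
print "$_\n" for @sorted;
```

A real implementation would still have to cope with several filter processes picking a name in the same second, which the counter here only hints at.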
> I'd also like to be able to use the name of the quarantine
> directory as a reference string for the "incident", making them
> easy to find if they need to be retrieved. I haven't spotted the
> what the global variable that identifies this generated name may
> be - and I assume that it is only available in filter_end() or in filter() in
> passes _after_ an action_quarantine() has been called.
Right; the name gets generated the first time action_quarantine is called.
I can change this too.
> I am under the impression that filter_begin() is called once for each
> message filtered by mimedefang.pl. This is where to handle things
> that remain "consistent" for a message.
Right.
> So this is where the general "global" properties of the email
> being processed (body size, recipient, sender, relay host etc) can
> be examined and acted upon right away. Here you could perhaps
> call action_discard() if message_contains_virus() returned true -
> and I assume that the message is discarded right away without
> further filter() processing.
Actually, even if you call action_discard() in filter_begin(), filter() is
still called for each part. You might (for example) want to discard a
message, but still quarantine some parts.
> Also, filter_begin() appears to be a good place to set global
> variables based on tests done only once on its overall message
> properties. These variables can then be used as flags for
> determining actions taken in filter() or filter_end().
Be careful... see below...
> I expected so, or so my theory went. But there was some unexpected
> weirdness that happened when I took this approach...
> I was experimenting with gathering all the Received: headers into
> a format that I wanted to re-parse and do things with later.
> However, I ended up with much more than I expected - other
> headers from other messages that happened to be going through the
> mail server at about the same time.
In multiplexor mode, the Perl process runs in a loop. You MUST explicitly
reset ALL of your global variables in filter_begin. Otherwise, results
for messages will accumulate, as you observed.
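To make the accumulation concrete, here is a small stand-alone simulation; it does not use MIMEDefang at all. The names collect_headers(), filter_begin_no_reset() and filter_begin_with_reset() and the sample header strings are invented for the demo. It just shows a global array growing across two "messages" handled by the same process unless it is emptied first.

```perl
use strict;
use warnings;

# A global that survives from one message to the next, because the same
# Perl process handles both.
our @ReceivedHeaders;

# Hypothetical stand-in for whatever code gathers headers for one message.
sub collect_headers {
    my (@headers) = @_;
    push @ReceivedHeaders, @headers;
}

# No reset: headers from earlier messages are still in the array.
sub filter_begin_no_reset {
    collect_headers(@_);
    return scalar @ReceivedHeaders;
}

# Reset first, then collect: only this message's headers are present.
sub filter_begin_with_reset {
    @ReceivedHeaders = ();
    collect_headers(@_);
    return scalar @ReceivedHeaders;
}

my @msg1 = ('from a.example by mx.example', 'from b.example by a.example');
my @msg2 = ('from c.example by mx.example');

my (@leaky, @clean);

@ReceivedHeaders = ();
push @leaky, filter_begin_no_reset(@$_) for (\@msg1, \@msg2);

@ReceivedHeaders = ();
push @clean, filter_begin_with_reset(@$_) for (\@msg1, \@msg2);

print "without reset: @leaky\n";   # header counts seen per message
print "with reset:    @clean\n";
```

Without the reset, the second message sees three headers (two of them belong to the first message); with the reset it sees only its own one.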
> Until, that is, I started setting a global $DISCARDME in
> filter_begin() that was acted on later. That global was set for
> one test message I put through it, then for a couple of other
> messages that went through the server around the same time... all
> were discarded (luckily after being quarantined). Oops, not the
> way to do it :)
You MUST reset $DISCARDME to zero every time. Something like this:
    sub filter_begin {
        # Init global variables
        $DISCARDME = 0;
        # ...
        if (some_test_passes()) {
            $DISCARDME = 1;
        }
        # ...
    }
> Has this quirk got anything to do with how the multiplexor works,
> keeping processes continuously running, perhaps multiple instances
> sharing the same environment? If so, how is it possible to cope
> with this?
Just reset ALL your variables to safe or empty values first thing
in filter_begin().
> What is really going on here? Are multiple instances of the
> filter really sharing the same local environment?
Here's the pseudo-code for multiplexor-mode:
    foreach incoming_mail_message {
        internal_reset_global_vars();
        filter_begin();
        foreach part of message {
            filter(part);
        }
        filter_end();
        take_action();
    }
Here, "internal_reset_global_vars" resets all the internal MIMEDefang
variables. filter_begin is responsible for initializing any global
variables you want to use. filter_end can do cleanup if necessary.
take_action() examines global variables to see what action_* methods
were called, and communicates back to Sendmail.
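The same flow can be sketched as a runnable toy. Everything here is a stub: %Actions stands in for the internal MIMEDefang state, action_discard() only records that it was called, take_action() returns a string instead of talking to Sendmail, and the ".exe" filename test is an arbitrary example. It is meant to show the order of calls, not the real implementation.

```perl
use strict;
use warnings;

our %Actions;     # stand-in for MIMEDefang's internal action state
our $PartCount;   # a user global; filter_begin must initialize it

sub internal_reset_global_vars { %Actions = (); }

# Stub: the real action_discard does more; this one just records the request.
sub action_discard { $Actions{discard} = 1; }

sub filter_begin { $PartCount = 0; }

sub filter {
    my ($part) = @_;
    $PartCount++;
    # Arbitrary example test on a part's file name.
    action_discard() if $part =~ /\.exe$/i;
}

sub filter_end { }   # nothing to clean up in this toy

# Look at what was recorded and decide; a string replaces the reply to Sendmail.
sub take_action { return $Actions{discard} ? 'discard' : 'accept'; }

# One pass of the loop body for a single message.
sub process_message {
    my (@parts) = @_;
    internal_reset_global_vars();
    filter_begin();
    filter($_) for @parts;
    filter_end();
    return take_action() . " ($PartCount parts)";
}

# Two messages handled back to back by the same process.
my @results = (
    process_message('body.txt', 'setup.exe'),
    process_message('body.txt'),
);
print "$_\n" for @results;
```

The second message is accepted because both the internal state and the user global were reset before it was examined.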
> Would this be a better place to, for example, do things like
> action_discard() - especially if you want to do all the filter()
> checks, set things like a global $discard variable or whatever,
> then finally check for it and act accordingly at the end.
That's one possibility.
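A minimal sketch of that approach, again with stubs so it runs on its own: action_discard() here only sets $DiscardRequested, run_message() is a made-up driver standing in for the multiplexor loop, and the MIME type being tested is an arbitrary example. filter() only raises the flag, and filter_end() is the single place that acts on it.

```perl
use strict;
use warnings;

our $DISCARDME;          # user flag, reset for every message
our $DiscardRequested;   # records whether the stub below was called

# Stub standing in for the real action_discard.
sub action_discard { $DiscardRequested = 1; }

sub filter_begin {
    $DISCARDME        = 0;
    $DiscardRequested = 0;   # MIMEDefang resets its own state; the stub needs this
}

sub filter {
    my ($type) = @_;
    # Only note the decision here; do not act yet.
    $DISCARDME = 1 if $type eq 'application/x-msdownload';
}

sub filter_end {
    # All parts have been seen, so act on the flag once.
    action_discard() if $DISCARDME;
}

# Hypothetical driver for one message, given the MIME types of its parts.
sub run_message {
    my (@types) = @_;
    filter_begin();
    filter($_) for @types;
    filter_end();
    return $DiscardRequested;
}

my $first  = run_message('text/plain', 'application/x-msdownload');
my $second = run_message('text/plain');
print "first: $first, second: $second\n";
```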
> BTW David, you mentioned that you had a problem to solve that
> involved removing attachments, replacing them with a reference to a
> URL, then putting the attachments somewhere for web access at that
> URL. How far did you get with this?
Not done yet; should be in 2.2.
Regards,
David.