[Mimedefang] Re: Per-recipient customization (David Eisner)

Wed Mar 29 11:04:01 EST 2006

>Background: I have MIMEDefang running (with multiplexor and embedded
>Perl interpreter) on a Linux box that serves, by way of a mailertable,
>as an MX for another server.  This second server runs Netware,  hosts
>the actual mailboxes, and provides pop service.

   Since you mention NetWare, is it safe to assume that the mail
environment is GroupWise? Modern GroupWise serves POP3 (as well as IMAP,
and SSL flavors of both, but POP3 is only at the GWIA), and as it
happens, I have a very similar arrangement at one place where I set up
the mail system (SLES 9 with sendmail v8.13 and MD 2.52 fronting
GroupWise v7).

>Whether or not you think it's a good idea, I need to give users the
>option of discarding messages with SpamAssassin scores above a certain
>threshold.  Such messages should be discarded at the server running
>MIMEDefang, before they reach the user's MUA.  Right now we tag them
>with the X-Spam-Score header in the usual way.

   If your users understand the pros and cons (and preferably have
signed a written statement to that effect), I don't see an issue. I'll
note in passing that, if your mail system is GroupWise, then user Rules
that discard based on the "X-Spam-Score" header are possible, and
would take effect upon initial delivery of the E-Mail to their mailbox.
Thus an appropriate Rule would discard (Trash) the E-Mail before the
user ever saw it.

>One approach I'm considering is a configuration file on the MX that
>would be a table of  envelope recipient addresses (or recipient regex's)
>and score thresholds.  If a recipient matches an address (or regex, if I
>go that route) in the config file, the threshold is compared to the
>reported score to determine whether to discard the message (for that
>user -- see below) or just tag it as we do now.
>
>What would be a good way to approach this?  The first thing that comes
>to mind is putting the file in /etc/mail/ (say
>/etc/mail/mimedefang_user_prefs.cf), read it in filter_initialize(), and
>then use it in filter_end() somewhere after spam_assassin_check().
>
>Another issue: how to handle one message with multiple envelope
>recipients, not all of whom have the same policy. I need to RTFM and
>think about this a bit more, but would it be relatively easy to strip
>some recipients in filter_end(), and leave others, before the message is
>forwarded through the mailertable mechanism?

  I wrote code in my MIMEDefang filter to do per-recipient rejection
based on the SA score. I'm not a Perl programmer by trade, and doubtless
my code will generate some giggles from anyone who sees just how
brute-force it is, but it does seem to work well.

  The way it is presently coded, a purpose-written function checks a
static array of E-Mail addresses. One such check is made for each
Recipient listed in the SMTP envelope (@Recipients, not the "To:"
header). If the SA score for the E-Mail is over the limit defined for
the Recipient, then the Recipient is removed from the @Recipients array.
If, at the end of the process, there are no Recipients left, the entire
E-Mail is DISCARDed.

   Here is my code in filter_end, just after the call to SA:

--- Cut here ---
[...]
# Enter SPAM limit checking code only if
#   SPAM score is at least as high as the LOWEST
#   limit in our list
if ( $hits >= $limitspamlow )
	{
	# Cycle thru list of addresses for which a
	#   specific SPAM score limit has been defined
	#   and delete the recipient from the E-Mail
	#   when the score exceeds the defined limit
	#   for that address
	for ( $repindex=0 ; $repindex < @Recipients ; $repindex++ )
		{
		# Initialize SPAM limit variable to ensure that every
		#   loop uses the proper number
		$spamlimit=0;

		$spamlimit = _check_limit($Recipients[$repindex]);
		# If we found a limit, act on it
		if ( $spamlimit )
			{
			# If SPAM score meets or exceeds limit,
			#   remove that recipient from the
			#   message, log, and decrement
			#   counter tracking total # of recipients
			if ( $hits >= $spamlimit )
				{
				md_syslog('alert', "[$MsgID] filter_end: deleted $Recipients[$repindex] - SPAM score $hits exceeded limit $spamlimit");
				delete_recipient($Recipients[$repindex]);
				$numrcpts--;
				}
			else
				{
				# md_syslog('alert', "[$MsgID] filter_end: permitted an E-Mail to $Recipients[$repindex] because the SPAM score of $hits did not exceed $spamlimit");
				}
			}
		else
			{
			md_syslog('info', "[$MsgID] filter_end: no SPAM score limit for $Recipients[$repindex]");
			}

		} # End of FOR loop
	}
else
	{
	# md_syslog('info', "[$MsgID] filter_end: SPAM score $hits below lowest limit of $limitspamlow");
	}

# Check to see if anyone is left to receive this
#   E-Mail; if not, discard it
if ( $numrcpts < 1 )
	{
	# Vary log message based on original
	#  number of recipients
	if ( @Recipients > 1 )
		{
		md_syslog('warning', "[$MsgID] filter_end: discard from $Sender to $totalrcpts recipients - SPAM score $hits exceeds all limits");
		}
	else
		{
		md_syslog('warning', "[$MsgID] filter_end: discard from $Sender to $Recipients[0] - SPAM score $hits over limit $spamlimit");
		}
	# Discard silently; do not also call action_bounce(), which
	#   requests a conflicting disposition for the same message
	return action_discard();
	}
else
	{
	if ( $numrcpts < $totalrcpts )
		{
		md_syslog('info', "[$MsgID] filter_end: permit from $Sender with SPAM score $hits to $numrcpts of $totalrcpts addresses");
		}
	}
[....]
--- Cut here ---
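
   To try the keep-or-drop decision outside MIMEDefang, the logic above
can be separated from the MIMEDefang calls. This is only a sketch:
split_recipients and its arguments are hypothetical names (nothing
MIMEDefang provides), and the limit lookup is passed in as a code
reference so that it does not depend on how the limits are stored.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MIMEDefang): decide, for one message,
# which envelope recipients keep the mail and which are dropped.
#   $hits   - SpamAssassin score for the message
#   $rcpts  - array ref of envelope recipients
#   $lookup - code ref returning the limit for an address, or 0 for none
sub split_recipients {
    my ($hits, $rcpts, $lookup) = @_;
    my (@keep, @drop);
    for my $rcpt (@$rcpts) {
        my $limit = $lookup->($rcpt);
        # A recipient is dropped only if a limit exists and is reached
        if ($limit && $hits >= $limit) {
            push @drop, $rcpt;
        }
        else {
            push @keep, $rcpt;
        }
    }
    return (\@keep, \@drop);
}
```

   In filter_end one would then call delete_recipient() for each
dropped address and action_discard() only when the keep list is empty.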

   And here is the _check_limit function:

--- Cut here ---
[...]
#***********************************************************************
# %PROCEDURE: _check_limit
# %ARGUMENTS:
#  checkaddr -- the address to check against the list of SPAM limits
# %RETURNS:
#  An integer indicating the SPAM score limit for the address; or
#       zero (0) if the address has no score associated with it
# %DESCRIPTION:
#  Called by filter_end, but only for external hosts
#***********************************************************************
sub _check_limit ($)
        {
        my($checkaddr) = @_;

        # Local indexing variable
        my($indexaddr);

        # Return value
        my($maxspam);

        # Initialize return variable
        $maxspam=0;

        # Make sure the address we're checking is in all lower-case
        $checkaddr = lc($checkaddr);
        # Remove any angle-brackets from address
        $checkaddr =~ s/^<//;
        $checkaddr =~ s/>$//;

        # IDEA: Following code can be used to make
        #   this check sequence independent of Domain Name
        # Split the Recipient into Address and Domain Name
        # @checkarray=split(/\@/, $checkaddr);
        # Extract just the Address
        # $checkaddr=$checkarray[0];

        # md_syslog('alert', "[$MsgID] _check_limit checked $checkaddr for a SPAM score limit");

        # Is the address in list of addresses for which an individual
        #   SPAM limit is defined?
        for ( $indexaddr=0 ; $indexaddr < @limitspamusers ; $indexaddr++ )
                {
                if ( $checkaddr eq $limitspamusers[$indexaddr] )
                        {
                        $maxspam=$limitspam[$indexaddr];
                        md_syslog('info', "[$MsgID] _check_limit: SPAM score limit $maxspam for $checkaddr");
                        last;
                        }
                else
                        {
                        # md_syslog('info', "[$MsgID] check_limit: determined that $checkaddr did not match $limitspamusers[$indexaddr]");
                        }
                }
        return($maxspam);
        }
### End of _check_limit

[....]
--- Cut here ---

   And here is the really brute-force part - I use TWO parallel arrays
because I'm not Perl-savvy enough to do this in a single
multi-dimensional structure. So the arrays on which my code depends are:

--- Cut here ---
[...]
###############################
# Declare an array of recipients and an array of corresponding
#   integers - the numbers are SPAM scores at or above which the
#   E-Mail should be discarded
###############################
@limitspamusers=qw(
webmaster@somedomain.tld
userx@somedomain.tld
sales@somedomain.tld
root@somedomain.tld );

@limitspam=qw( 3 5 10 2 );

# This variable controls the lower boundary of SPAM scores before the
#   per-address checking code is triggered to see if addresses should
#   be removed from an E-Mail; if the SPAM score is under this limit
#   the _check_limit code is not run; this limit should match the LOWEST
#   number in the @limitspam array
$limitspamlow=2;

[...]
--- Cut here ---
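
   The two parallel arrays can be collapsed into a single hash keyed by
address, which also removes the lookup loop. The following is a sketch
rather than tested filter code: %limitspam as a hash and the name
check_limit_hash are assumptions made for illustration, and the
addresses are the same placeholders as above. The lowest limit is
computed from the table, so it cannot drift out of sync with it.

```perl
use strict;
use warnings;
use List::Util qw(min);

# One hash replaces the two parallel arrays: address => score limit.
# Keys must be lower-case and without angle brackets.
my %limitspam = (
    'webmaster@somedomain.tld' => 3,
    'userx@somedomain.tld'     => 5,
    'sales@somedomain.tld'     => 10,
    'root@somedomain.tld'      => 2,
);

# Lowest limit in the table, derived instead of maintained by hand
my $limitspamlow = min(values %limitspam);

# Hash-based version of the lookup: returns the limit for the address,
# or 0 if the address has no limit defined
sub check_limit_hash {
    my ($checkaddr) = @_;
    $checkaddr = lc $checkaddr;    # normalize case
    $checkaddr =~ s/^<//;          # strip leading angle bracket
    $checkaddr =~ s/>$//;          # strip trailing angle bracket
    return exists $limitspam{$checkaddr} ? $limitspam{$checkaddr} : 0;
}
```

   As in the array version, a return value of 0 means "no limit", so a
limit of 0 itself cannot be expressed.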

   Hopefully, this shows you enough of how to do it. Or you could, as
David Skoll suggests, get CanIt-PRO, which is probably a lot
better-written (and almost certainly easier to manage for a large number
of Recipients) than my code is.
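
   For the configuration-file idea in the question, the table could be
read from a file rather than hard-coded. A minimal sketch, assuming a
made-up format of one "address limit" pair per line with '#' comments;
parse_limits is a hypothetical name, and neither it nor the file format
is anything MIMEDefang defines.

```perl
use strict;
use warnings;

# Hypothetical parser: reads "address limit" pairs, one per line, from
# an open filehandle. Blank lines and lines starting with '#' are
# skipped. Returns a hash ref of lower-cased address => limit.
sub parse_limits {
    my ($fh) = @_;
    my %limits;
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        next if $line eq '' || $line =~ /^#/;
        my ($addr, $limit) = split ' ', $line;
        # Ignore malformed lines rather than guessing at their meaning
        next unless defined $limit && $limit =~ /^\d+(?:\.\d+)?$/;
        $limits{lc $addr} = $limit;
    }
    return \%limits;
}
```

   One could open /etc/mail/mimedefang_user_prefs.cf (the path proposed
in the question) in filter_initialize(), pass the handle to this
function, and keep the returned hash in a global for filter_end to use.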

Dirk

P.S. If there's a minimum of laughter at my Perl coding skills, I'll
probably make my code sample into a Wiki page. And I won't interpret
anyone showing me how to do this in a multi-dimensional array as
laughing. :-)