[Mimedefang] Problems with per-user SQL prefs with SpamAssassin

Hugh Messenger hugh.messenger at gmail.com
Wed Mar 18 15:49:06 EDT 2009


This may be a SpamAssassin issue, but I thought I'd start here.  Been
banging my head on this one for two days.

SpamAssassin 3.2.5
MimeDefang 2.65

I had a working setup with MD using load_scoreonly_sql() to load
per-user prefs from a backend MySQL database.  But at some point in
the last couple of weeks (while I wasn't looking), without me changing
anything (at least as far as anything directly related to MD/SA is
concerned - I have done a few yum updates of "other stuff"), it seems
to have stopped working.  MD/SA is still working as far as all the
other rules and plugins are concerned.

Here's a sample session from my logs.  I've added some extra dbg()
output lines in SA's SQL.pm to check the query it is executing, and to
ensure that it is picking up prefs correctly:

============================ BEGIN log samples
Mar 18 13:38:29 pmx3 mimedefang.pl[12909]: ++++++++++++About to
load_scoreonly_sql(hugh)

[NOTE - the first of the extra dbg() entries I added.  I've run this
query by hand, and it returns exactly what it should]
Mar 18 13:38:29 pmx3 mimedefang.pl[12909]: config: Conf::SQL:
executing SQL: select preference, value from userpref where username =
'hugh' or username = '@GLOBAL' order by username asc

Mar 18 13:38:29 pmx3 mimedefang.pl[12909]: config: retrieving prefs
for hugh from SQL server

[NOTE - in this log msg I added, I'm replacing newlines with ;'s in
the prefs list, to make it printable in the log]
Mar 18 13:38:29 pmx3 mimedefang.pl[12909]: config: got prefs text for
hugh: whitelist_from <support at alaweb.com>;whitelist_from
<opttag at optout.optout>;blacklist_from
<root at pmx55.alaweb.com>;blacklist_from
<sender at example.net>;whitelist_from
<hugh.messenger at gmail.com>;whitelist_from <root at ns1.alaweb.com>;

Mar 18 13:38:29 pmx3 mimedefang.pl[12909]:
MDLOG,n2IIcOW2013063,mail_in,0,,<root at ns1.alaweb.com>,<hugh at alaweb.com>,test106
============================ END log samples

As you can see from the last line, the score is 0.  And in the msg I receive:

X-Spam-Status: No, hits=0.0 required=5.0
	tests=none
	version=3.2.5

Yet <root at ns1.alaweb.com> is clearly being seen as a whitelist_from in
load_scoreonly_sql, so should generate a USER_IN_WHITELIST.  The same
thing happens with any recipient, for any prefs.  All other SA rules
fire ... but not my SQL prefs.

Here are the relevant parts of my mimedefang-filter.  One interesting
thing to note is that my optout processing is working, which relies on
...

$sa_object->{conf}->{whitelist_from}->{'<optout at optout.optout>'}

... being set (I add this on the backend from our user provisioning
pages).  So I *know* SA is picking up my prefs!!!
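
For anyone who wants to poke at that check outside of MIMEDefang, here is a minimal sketch of it in isolation. The $conf hash is a hypothetical stand-in for $sa_object->{conf} (the real one is populated by load_scoreonly_sql(), and the values stored there are not shown in this post, so the 1s below are placeholders). user_opted_out() is a name made up for this example, and it assumes the " at " in the archived address stands for "@".

```perl
use strict;
use warnings;

# Hypothetical stand-in for $sa_object->{conf}.  In the real filter this
# hash is filled in by load_scoreonly_sql(); the values here are placeholders.
my $conf = {
    whitelist_from => {
        '<optout@optout.optout>' => 1,
        '<support@alaweb.com>'   => 1,
    },
};

# Return 1 if the marker address is present as a whitelist_from key, else 0.
# Checking each level first avoids autovivifying whitelist_from when absent.
sub user_opted_out {
    my ($conf, $marker) = @_;
    return 0 unless ref($conf) eq 'HASH';
    return 0 unless ref($conf->{whitelist_from}) eq 'HASH';
    return exists $conf->{whitelist_from}{$marker} ? 1 : 0;
}

print user_opted_out($conf, '<optout@optout.optout>')
    ? "opted out\n"
    : "filtering\n";
```

The filter itself uses defined() rather than exists(); the two agree as long as the stored value is never undef.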

============================ BEGIN mimedefang-filter excerpts
#***********************************************************************
# AlaWeb stuff
#***********************************************************************

# NOTES:
#
# /etc/init.d/mimedefang
# Need to enable MX_RECIPIENT_CHECK
# Set MX_MINIMUM and MX_MAXIMUM (trying 4 and 20)
# Enable MX_EMBED_PERL
#
# /etc/mail/sa-mimedefang.cf
# Need to set the MySQL details for the userprefs
# user_scores_dsn DBI:mysql:spamassassin:localhost
# user_scores_sql_username spamassassin
# user_scores_sql_password xxxxxxxxxxx

# HighReq is the top cutoff for the grey range, i.e. above this we reject
$HighReq = 8;

# Anything hitting above DiscardReq we discard (no NDR)
$DiscardReq = 16;

# DO NOT UNCOMMENT THIS LINE ON LIVE BOXES!!!!!!!!!
# REALLY!!!!!!!!  DON'T DO IT!!!!!!!!
# If set, removes all real recips and sends EVERYTHING to this addr
#$DropTestAddr = 'spamtest at alaweb.com';

# DO NOT COMMENT OR CHANGE THIS LINE ON LIVE BOXES!!!!!
# If 0 or undefined, will not actually reject spam.
$DropHighSpam = 1;

# ONLY UNCOMMENT THIS LINE IF YOU WANT A BUNCH OF EXTRA SYSLOG FROM SA!!!!
# arg is a comma separated list of SA facilities
$DebugSA = 'config'; # enables SA logging to syslog

# Set to 0 to disable SQL prefs checking
$UseSQL = 1;

# We will only have to change this line if we change the mailertable hostname
# for our MailSite box.
$DoSQLHost = 'elk1.alaweb.com';

# A warning msg we attach as an inline attachment to greys
$AlaWebWarning = "AlaWeb's spam detection software has identified this email as possible spam.\n";

$SALocalTestsOnly = 0;

[... lots of standard filter stuff snipped ...]

my $sa_object;
my %cleanconfig = ();
sub filter_initialize {
    if ($Features{"SpamAssassin"}) {
        # $$$ hugh - grab the SA object ...
        $sa_object = spam_assassin_init() if defined(spam_assassin_init());

        # ... and compile the rules
        $sa_object->compile_now();

        # Grab a copy of the clean config to restore to when switching users
        if (!%cleanconfig) { $sa_object->copy_config(undef,\%cleanconfig); }

        # enable SA syslog if required
        if ($DebugSA) {
                use Mail::SpamAssassin::Logger;
                Mail::SpamAssassin::Logger::add(method => 'syslog',
                    socket => 'unix', facility => 'mail');
                Mail::SpamAssassin::Logger::add_facilities($DebugSA);
        }
    }
}

[... standard filter stuff snipped ...]

sub filter_end {
    my($entity) = @_;

    # If you want quarantine reports, uncomment next line
    send_quarantine_notifications();

    # IMPORTANT NOTE:  YOU MUST CALL send_quarantine_notifications() AFTER
    # ANY PARTS HAVE BEEN QUARANTINED.  SO IF YOU MODIFY THIS FILTER TO
    # QUARANTINE SPAM, REWORK THE LOGIC TO CALL send_quarantine_notifications()
    # AT THE END!!!

    # No sense doing any extra work
    return if message_rejected();

    # Spam checks if SpamAssassin is installed
    my($myhits) = 0;
    if ($Features{"SpamAssassin"}) {
        if (-s "./INPUTMSG" < 100*1024) {
            # Only scan messages smaller than 100kB.  Larger messages
            # are extremely unlikely to be spam, and SpamAssassin is
            # dreadfully slow on very large messages.

            # $$$ hugh - first thing we need to do is reload the
            # cleanconfig we saved in filter_initialize(), to clear
            # out the last user's prefs (we're running with embedded perl,
            # so all that crap is persistent).
            $sa_object->copy_config(\%cleanconfig,undef);

            # $$$ hugh - we only want to consult sql for userprefs if the
            # recipient is hosted on our own mailbox server.  Easiest way
            # to check for this is by looking at the RecipientMailers hash.
            # RecipientMailers contains the mailer/host/addr triple from
            # /etc/mail/mailertable for each recip addr.  As we're running
            # as stream_by_recipient, we only ever have one recip, which
            # will be in $Recipients[0].  $DoSQLHost is defined at the top
            # of this script.
            if ($UseSQL && ${RecipientMailers{$Recipients[0]}}[1] =~ /\[$DoSQLHost\]/) {
                # $$$ hugh - boil the recip address down to a plain username
                my $recip = $Recipients[0];
                $recip =~ s/@.*//;
                $recip =~ s/<//;

                md_syslog('info',"++++++++++++About to load_scoreonly_sql($recip)");
                # $$$ hugh - load this user's prefs from MySQL
                if (!($sa_object->load_scoreonly_sql($recip))) {
                    md_syslog('info',"++++++++++++load_scoreonly_sql failed for: $recip");
                }

                # $$$ hugh - cheap way of allowing user optout, backend will
                # add a whitelist_from entry for <optout at optout.optout> in
                # userprefs table for this user if they optout.  So we don't
                # have to do a separate db lookup, just grab it from SA's
                # whitelist_from config, which will now have the userprefs
                # from the load_scoreonly_sql() call above.
                if (defined($sa_object->{conf}->{whitelist_from}->{'<optout at optout.optout>'})) {
                    # $$$ hugh - user has opted out, so log it, slap a status
                    # header on, and we're done.
                    md_syslog('info',"user is opted out: $recip");
                    md_graphdefang_log('mail_in',0,$RelayHost);
                    action_change_header("X-Spam-Status", "user has opted-out of spam filtering");
                    return;
                }
            }

            my($hits, $req, $names, $report) = spam_assassin_check();
            $myhits = $hits;

            # $$$ hugh - build an X-Spam-Status status line like SA would.
            # http://www.pccc.com/downloads/MIMEDefang/mimedefang-filter-KAM
            # X-Spam-Status: No, hits=1.4 required=7.0
            #   tests=AWL,FROM_HAS_MIXED_NUMS,HTML_40_50,MAILTO_TO_SPAM_ADDR,MISSING_OUTLOOK_NAME
            #   version=2.55
            action_change_header("X-Spam-Status",
                &build_status_line($hits, $req, $names, $report));

            # $$$ hugh - did SA score it as spam?
            if ($hits >= $req) {

                # $$$ hugh - if hits are more than our configured high value,
                # then we reject it.
                if ($hits > $HighReq) {
                    md_graphdefang_log('spam', $hits, $RelayAddr);

                    # $$$ hugh - DropHighSpam is a test setting, should
                    # always be enabled (top of this file) on live boxes
                    if ($DropHighSpam) {
                        if ($hits > $DiscardReq) {
                            action_discard();
                        }
                        else {
                            action_bounce('Message Rejected.');
                        }
                        return;
                    }
                }

                # $$$ hugh - if we got this far, then it's 'grey', so we
                # tag the Subject
                # @TODO - add in option checking to see if they want it
                # tagged or not

                # Create *** string for tag ... so if threshold (req) is
                # 5 and hits was 7.2, give it 3 *'s.
                $score = "*" x int(($hits+1) - $req);

                action_change_header("X-Spam-Score", "$hits ($score) $names");
                action_change_header('Subject', "[SPAM$score] $Subject");
                md_graphdefang_log('probable_spam', $hits, $RelayAddr);

                action_add_part($entity, "text/plain", "-suggest",
                    $AlaWebWarning, "warning.txt", "inline");

                # $$$ hugh - ONLY USED IN TESTING!!!!!
                # removes all recips, and sends everything to spamtest at alaweb.com
                if ($DropTestAddr) {
                    action_add_header("X-Orig-Rcpts", join(", ", @Recipients));
                    foreach $recip (@Recipients) { delete_recipient($recip); }
                    add_recipient($DropTestAddr);
                }

                # $$$ hugh - done!
                return;

            } else {
                # Delete any existing X-Spam-Score header?
                action_delete_header("X-Spam-Score");
            }
        }
    }

    md_graphdefang_log('mail_in',$myhits);

    # Deal with malformed MIME.
    # Some viruses produce malformed MIME messages that are misinterpreted
    # by mail clients.  They also might slip under the radar of MIMEDefang.
    # If you are worried about this, you should canonicalize all
    # e-mail by uncommenting the action_rebuild() line.  This will
    # force _all_ messages to be reconstructed as valid MIME.  It will
    # increase the load on your server, and might break messages produced
    # by marginal software.  Your call.

    # action_rebuild();

    # $$$ hugh - ONLY USED FOR TESTING!!!!
    if ($DropTestAddr) {
        action_add_header("X-Orig-Rcpts", join(", ", @Recipients));
        foreach $recip (@Recipients) { delete_recipient($recip); }
        add_recipient($DropTestAddr);
    }
}
============================ END mimedefang-filter excerpts
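
To rule out the small helper steps, two of them can be pulled out of the excerpt and run on their own. This is only a sketch: recipient_to_username() and spam_stars() are names invented for this example (the filter does both inline), and the sample address assumes the " at " in the archived text stands for "@".

```perl
use strict;
use warnings;

# Reduce an envelope recipient such as '<hugh@alaweb.com>' to 'hugh',
# using the same two substitutions as the filter.
sub recipient_to_username {
    my ($recip) = @_;
    $recip =~ s/@.*//;    # drop the domain and the trailing '>'
    $recip =~ s/<//;      # drop the leading '<'
    return $recip;
}

# Build the run of '*' used in the Subject tag: one star for reaching the
# threshold, plus one for each whole point above it.
sub spam_stars {
    my ($hits, $req) = @_;
    my $n = int(($hits + 1) - $req);
    return $n > 0 ? '*' x $n : '';
}

printf "%s %s\n", recipient_to_username('<hugh@alaweb.com>'), spam_stars(7.2, 5);
```

With the values from the filter's own comment (req 5, hits 7.2) this gives three stars, and the username matches the "hugh" seen in the log samples.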

So I'm utterly stumped.  Any ideas, suggestions or wild theories
greatly appreciated.

   -- hugh


