[Mimedefang] Partial Word Subject Line Blocking

M Jerome Garrett jgarrett at techsolutions.cc
Fri Jul 28 23:43:34 EDT 2006


I have an issue that I have been trying to resolve.  Currently I block
certain words and certain phrases in subject lines.  For instance, I block
save! and I block tired.of (the period standing for a space).
What I am trying to do now is block parts of words.  For instance, I get a
lot of emails with subject lines that say “Re: suronRjOLEX”.  I can block
that word, but the next day I’ll get an email with the subject suromRjOLEX
(notice that there is an “m” instead of an “n”).  I would like to block an
email if the characters “RjOLEX” appear anywhere in the subject, even when
there is something before them.

Just to clarify: if I wanted all instances of the word “science” to be
blocked, I would want “conscience” to be blocked as well.
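
To illustrate the kind of partial match I mean, here is a minimal sketch.
It is not part of my filter: the sub name subject_contains_fragment and the
hard-coded fragment list are hypothetical, and in practice the fragments
would presumably come from the database instead.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first fragment found anywhere inside
# the subject (case-insensitive), or nothing if no fragment matches.
sub subject_contains_fragment {
    my ($subject, @fragments) = @_;
    my $lc_subject = lc $subject;
    foreach my $fragment (@fragments) {
        # An empty fragment would match every subject, so skip it
        next unless length $fragment;
        # index() is a plain substring search, so no regex escaping is needed
        return $fragment if index($lc_subject, lc $fragment) >= 0;
    }
    return;
}

# Hypothetical fragment list, standing in for records in subjects.db
my @fragments = ('rjolex', 'science');

foreach my $subject ('Re: suronRjOLEX', 'Re: suromRjOLEX',
                     'A clear conscience', 'Lunch on Friday') {
    my $hit = subject_contains_fragment($subject, @fragments);
    printf "%-20s => %s\n", $subject,
        defined $hit ? "blocked ($hit)" : "allowed";
}
```

The point of the sketch is only that the match is not anchored to word
boundaries, so “suronRjOLEX” and “suromRjOLEX” are both caught by the single
fragment “rjolex”.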

Currently the check for certain words in the subject is done by a
MIMEDefang routine that looks them up in a database I have set up.  The
routine follows below.

If I were better at Perl I could probably take care of this myself.

Can anybody offer some help?

$DBFilenameSUBS = "/etc/mail/subjects.db";
sub lookup_subject() {
    # convert incoming subject to lower-case
    my $lc_subject = lc($Subject);
    my $subject_result = 0;

    my %GDB;
    if (tie(%GDB,'DB_File', $DBFilenameSUBS, O_RDONLY)) {
        # remove white space from the middle so that
        # "free s t    u f f here" becomes "free s t u f f here"
        $lc_subject =~ s/(\s)\s+/$1/g;
        # next 2 lines collapse "free  s t u f f  here" into "free stuff here"
        $lc_subject =~ s!((^|\s)\S\s(\S(\s|$)){2,})!
            my $lc_subject_x=$1;$lc_subject_x=~s/\s//g;sprintf "%s","$lc_subject_x ";!ego;
        $lc_subject =~ s/^\s+//;  # Trim leading whitespace
        $lc_subject =~ s/\s+$//;  # Trim trailing whitespace
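        # Also strip the whitespace after the prefix, so that "re: foo"
        # does not become ".foo" once whitespace is turned into periods
        $lc_subject =~ s/^re:\s*//;  # Trim leading "re:"
        $lc_subject =~ s/^fw:\s*//;  # Trim leading "fw:"
        $lc_subject =~ s/^fwd:\s*//; # Trim leading "fwd:"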
        $lc_subject =~ s/\s+/./g; # Collapse whitespace into periods

        # Scan database for a complete match (only)
        if ($GDB{$lc_subject}) {
            $subject_result = 1;
            md_graphdefang_log("Subject_Line",
                "Subject-line found in subjects.db");
        } else {
            # See if any one word in the subject appears as a record
            my @subject_array = split (/\./, $lc_subject);
            foreach my $subject_word (@subject_array)
            {
                if ($GDB{$subject_word}) {
                    $subject_result = 1;
                    md_graphdefang_log("Subject_Word",
                        "Subject-word \"$subject_word\" found in subjects.db");
                    last;
                }
            }
        }
        if (!$subject_result)
        {
            # here we reverse the logic... see if any record in the database
            # is found as a substring in the subject.  if a record contains
            # "free.stuff" and the subject says "get your free stuff here",
            # then flag it as a hit.
            my $subject_record;
            foreach $subject_record (keys %GDB)
            {
                if ($lc_subject =~ m/(^|\.)\Q$subject_record\E($|\.)/)
                {
                    $subject_result = 1;
                    md_graphdefang_log("Subject_Substring",
                        "Subject-substring \"$subject_record\" found in subject line");
                    last;
                }
            }
        }
        untie %GDB;
    } else {
        md_syslog('warning', "subject: Cannot open file $DBFilenameSUBS");
    }
    return $subject_result;
}

M. Jerome Garrett
Technology Solutions, President
M.S. Telecommunications Management





