[Mimedefang] Re: MIMEDefang Digest, Vol 9, Issue 37

Jeff Rife mimedefang at nabs.net
Mon Jun 21 14:05:19 EDT 2004


>     So I have this in my sa-mimedefang.cf file:
> 
> required_hits           6.0
> auto_report_threshold   20
> use_bayes               1
> bayes_auto_learn        1
> bayes_path              /var/spool/MD-Quarantine/bayes/bayes
> bayes_auto_expire                       0
> bayes_auto_learn_threshold_nonspam      0.5
> bayes_auto_learn_threshold_spam         5.5
> bayes_expiry_max_db_size                100000
> bayes_file_mode                         0644
> bayes_ignore_header                     X-Spam-Status:
> bayes_ignore_header                     X-Spam-Score:
> bayes_journal_max_size                  10240
> bayes_journal_max_size                  5120000
> bayes_learn_to_journal                  1
> bayes_min_ham_num                       50
> bayes_min_spam_num                      50
> 
> 
>     ...and yet, I don't see bayes do diddly squat.  I just deleted all of 
> the bayes files, and fed it new spam and ham content.  All the files 
> in /var/spool/MD-Quarantine/bayes/ are owned by defang.defang.  What 
> else am I missing, or have not configured properly?

First, try looking in ~defang/.spamassassin/ for the bayes_* files.  I 
wouldn't be surprised if you find them there.

This is because the "userstate_dir" is set by default to ~/.spamassassin 
in Mail::SpamAssassin, and there is no configuration file setting that 
allows you to override this.
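
A quick way to check both locations is a small Perl script along these
lines.  This is only a sketch: find_bayes_files is a made-up helper name,
the "defang" user name and the /var/spool/MD-Quarantine/bayes directory
are taken from the quoted message, and you need enough permissions to
read both directories.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the bayes_* files found directly inside each
# candidate directory.  Keys are directories, values are array refs of paths.
# Directories that are missing or unreadable are silently skipped.
sub find_bayes_files {
    my @dirs = @_;
    my %found;
    for my $dir (@dirs) {
        next unless -d $dir;
        opendir(my $dh, $dir) or next;
        my @files = sort grep { /^bayes_/ && -f "$dir/$_" } readdir($dh);
        closedir($dh);
        $found{$dir} = [ map { "$dir/$_" } @files ] if @files;
    }
    return %found;
}

# Home directory of the "defang" user (assumed name); undef if no such user.
my $home = (getpwnam('defang'))[7];

# Directory part of the bayes_path from the quoted configuration.
my @dirs = ('/var/spool/MD-Quarantine/bayes');
push @dirs, "$home/.spamassassin" if defined $home;

my %found = find_bayes_files(@dirs);
for my $dir (sort keys %found) {
    print "$dir:\n";
    print "  $_\n" for @{ $found{$dir} };
}
print "no bayes_* files found\n" unless %found;
```

If files show up under ~defang/.spamassassin, or in both places, you are
probably seeing the same behaviour described in this message.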

After hammering on this for a while, I found out the following:

1. If I renamed sa-mimedefang.cf to something else ("site-prefs", in my
   case) and told MIMEDefang about it (by changing the spam_assassin_init
   call in /etc/mail/mimedefang-filter to include the filename), I got
   bayes files in ~defang/.spamassassin.  Bayes was active because the
   use_bayes default is 1, but the bayes_path setting just didn't seem
   to be read correctly.

2. With the spam_assassin_init line referencing a non-existent file,
   /etc/mail/spamassassin/sa-mimedefang.cf gets read by the SA startup
   as a "site rule" file because of the directory it is in and the .cf
   extension.  This resulted in bayes* files in *both*
   ~defang/.spamassassin and my chosen path (/var/spool/SA-MIMEDefang/).

3. Even without mucking about with filenames, I would sometimes get
   bayes* files in both places.

This same issue also causes problems with the auto-whitelist file.

My solution (and it's a real hack) was to add the "userstate_dir" option 
to the options hash passed to the Mail::SpamAssassin constructor in 
mimedefang.pl:

$SASpamTester = Mail::SpamAssassin->new({
    local_tests_only   => $SALocalTestsOnly,
    dont_copy_prefs    => 1,
    userprefs_filename => $config,
    # Added: keep the bayes* and auto-whitelist files out of ~/.spamassassin
    userstate_dir      => "/var/spool/SA-MIMEDefang",
});
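
Since a missing or unwritable state directory would quietly defeat the
point of the change, one option is to build the constructor options in
a small helper that checks the directory first.  This is a sketch only:
sa_constructor_options is a hypothetical name and is not part of
mimedefang.pl, and it normalizes the local-tests flag to 1 or 0 where
the original passes $SALocalTestsOnly through unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of mimedefang.pl): build the option hash
# for Mail::SpamAssassin->new, refusing a state directory that does not
# exist or that the current user cannot write to.
sub sa_constructor_options {
    my ($config, $state_dir, $local_tests_only) = @_;
    die "userstate_dir '$state_dir' is not a directory\n" unless -d $state_dir;
    die "userstate_dir '$state_dir' is not writable\n"    unless -w $state_dir;
    return {
        local_tests_only   => $local_tests_only ? 1 : 0,
        dont_copy_prefs    => 1,
        userprefs_filename => $config,
        userstate_dir      => $state_dir,
    };
}

# Intended use inside mimedefang.pl (needs Mail::SpamAssassin loaded):
#   $SASpamTester = Mail::SpamAssassin->new(
#       sa_constructor_options($config, "/var/spool/SA-MIMEDefang",
#                              $SALocalTestsOnly));
```

The check only means something if it runs as the same user the filter
runs as; run as root, the -w test will pass regardless.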

I now have the bayes* files and the auto-whitelist file in that 
directory, and they are being used.  My only remaining problem is that 
"bayes_learn_to_journal" seems to be ignored.  There aren't any speed 
issues, so I don't really care to pin this down right now.  These 
changes have also fixed auto-learning, which didn't seem to be working 
before, either.

I implemented these changes on my test system, which does get some 
high-value SPAM but had not been adding any of it to the auto-learn 
database (the number of learned SPAMs was always zero, since I hadn't 
run sa-learn).  I rolled out a new server yesterday with these mods (for 
a domain that gets a *lot* of SPAM), and there are 152 auto-learned 
SPAMs in the database.
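
This message does not say how the count was read.  One common way is
"sa-learn --dump magic", run so that it reads the same database
MIMEDefang uses.  The sketch below pulls the counters out of that
output.  parse_dump_magic is a hypothetical helper, and the column
layout it expects (four numeric columns with the value in the third,
followed by "non-token data: <name>") is an assumption about the
output format, so check it against what your SpamAssassin version
prints.

```perl
use strict;
use warnings;

# Hypothetical helper: parse the text printed by "sa-learn --dump magic"
# into a hash ref mapping counter name (e.g. "nspam", "nham") to its value.
# Lines that do not match the assumed layout are ignored.
sub parse_dump_magic {
    my ($text) = @_;
    my %magic;
    for my $line (split /\n/, $text) {
        # Assumed layout: prob, count, VALUE, atime, "non-token data: NAME"
        if ($line =~ /^\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data:\s*(.+?)\s*$/) {
            $magic{$2} = $1;
        }
    }
    return \%magic;
}

# Example use (needs sa-learn on the PATH):
#   my $magic = parse_dump_magic(scalar `sa-learn --dump magic`);
#   print "learned spam: $magic->{nspam}\n" if defined $magic->{nspam};
```

Note that nspam counts every learned spam message, whether it came from
auto-learning or from sa-learn; the two are only the same if sa-learn
has never been run by hand.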


--
Jeff Rife        | "As usual, a knife-wielding maniac 
SPAM bait:       |  has shown us the way." 
AskDOJ at usdoj.gov |  
uce at ftc.gov      |         -- Bart Simpson 



