[Mimedefang] Caching the results of a SpamAssassin scan

Thu Apr 3 10:39:01 EST 2003

Quoting Joseph Brennan <brennan at columbia.edu>:

> If you have any detail it would be interesting.

Well, I didn't gather specific empirical evidence.  All I did was keep "top" 
open while I flooded the server with about 150 messages at a time, and then 
watched the CPU and memory utilization of the mimedefang.pl processes.  With 
SpamAssassin enabled, the processes would sometimes spike to 30% utilization and 
keep running for probably 10 seconds (maybe less, I'm guessing).  I then 
disabled the SpamAssassin scan, and the CPU usage for the same flood didn't go 
above about 7% and the processes exited very quickly.

> I assumed this was the case, and have excluded some categories
> of mail from going to Spamassassin.
> 
>     $doSA = 0;
>     # Skip SpamAssassin sometimes...
>     #   ...like for mail from localhost (don't bounce our own bounces)
>     undef($doSA) if ($RelayAddr eq "127.0.0.1");

I'm confused about this.  I do something similar (i.e. I do not want to scan 
messages that come from trusted relays) but I determine this inside 
filter_relay.  According to the mimedefang-filter man page, $RelayAddr is not 
available in filter_end, where the SA scan takes place.  I have been checking 
for trusted relays in filter_relay and then writing a 0 byte file 
called "skip_spam_check" in the working directory, then checking for this 
file's existence in filter_end.  I'm doing this because of the warning in the 
mimedefang-filter man page about maintaining state between subroutines.  If $RelayAddr 
is available in filter_end (I guess I could have checked first) then I'm doing 
all of that for nothing. :)
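
For what it's worth, here is a rough sketch of the marker-file idea described above, written as standalone Perl rather than as a real filter.  The sub names (mark_if_trusted, should_scan), the %TRUSTED_RELAY list and the temp directory are made up for illustration.  In an actual filter the two halves would sit in filter_relay and filter_end, and the directory would be whatever working directory the filter is given, so treat this as an illustration of the technique and not as working MIMEDefang code.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical list of relays whose mail should not be spam-scanned.
my %TRUSTED_RELAY = map { $_ => 1 } ('127.0.0.1');

# Name of the zero-byte marker file, as described in the message.
my $MARKER = 'skip_spam_check';

# First half: runs where the relay address is known.  Leaves a
# zero-byte marker in the working directory for trusted relays.
sub mark_if_trusted {
    my ($workdir, $relay_addr) = @_;
    return 0 unless $TRUSTED_RELAY{$relay_addr};
    my $path = File::Spec->catfile($workdir, $MARKER);
    open(my $fh, '>', $path) or die "cannot create $path: $!";
    close($fh);
    return 1;
}

# Second half: runs later, possibly in another process, so it relies
# only on the file system and not on any global variable.
sub should_scan {
    my ($workdir) = @_;
    my $path = File::Spec->catfile($workdir, $MARKER);
    return (-e $path) ? 0 : 1;
}

# Demo: one directory per message, standing in for the real one.
for my $relay ('127.0.0.1', '192.0.2.10') {
    my $dir = tempdir(CLEANUP => 1);
    mark_if_trusted($dir, $relay);
    print "$relay: ", (should_scan($dir) ? "scan" : "skip"), "\n";
}
```

The point of going through a file is that nothing has to survive in memory between the two calls.  If $RelayAddr does turn out to be visible in filter_end, the marker file is unnecessary and a plain comparison there would do.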

> > Even though these were all separate SMTP sessions, each message has
> > the exact same message-ID.
> 
> But how much spam is like that?  Was this just the result of your
> test message having that message-ID in it?  It doesn't seem normal
> although some spamware might do it.

Well, let me explain how I generated my test case.  Our current mail server is 
still running alongside the new one, which I'm testing.  I created 20 bogus 
accounts on the current mail server, and 20 bogus accounts on the new mail 
server.  I set all of the accounts on the old server to forward to the 
corresponding ones on the new server.  Then I created a mailing list on the old 
server, and put the addresses of the bogus accounts on the old server in the 
list.  I then flooded that mailing list.

The old mail server saw that all of the recipients were local, so it delivered 
the mail to each one.  Then it processed the forwarding information for each 
bogus user separately, and as a result it opened a separate connection to the 
new mail server to forward the message along, even though each recipient was 
receiving the exact same message.

Yes, I know this is very contrived and not likely to occur too often in actual 
use.  But it got me thinking that perhaps, as you said, some spamware might try 
this in an effort to get around possible limits on the number of 
recipients per message.  Rather than connect to a server, issue a MAIL FROM and 
then 100 RCPT TO's in succession, the spamware might run in a loop and send 
a separate copy of the message to each recipient.

It's probably not too likely, but as I said, I was tired last night and 
probably not thinking straight. :)  

___________________________________________
Michael Sims
Project Analyst - Information Technology
Crye-Leike Realtors
Office: (901)758-5648  Pager: (901)769-3722
___________________________________________