[Mimedefang] SpamAssassin via mimedefang is slow

Jeff Rife mimedefang at nabs.net
Sat Nov 8 22:22:50 EST 2008


On 8 Nov 2008 at 23:52, Michiel Brandenburg wrote:

> > OK, so it turns into an O(n) algorithm, where you need to retrieve each 
> > hash you have already computed, then compare the hash of the current 
> > message against that.  After that, you add it to the database for 
> > future messages.
> Well, not really: you can do XORs in the database itself, meaning you can 
> tell the database to return the first record closest to the hash you are 
> looking for, with a maximum difference of, say, 10 bits. This is one query 
> resulting in 1 or 0 hits.

This is an O(n) operation requiring a full table scan, because there is 
no index you can use.  The only way the database can return the 
"closest" hash is to compute the XOR of the new hash with *every* hash 
in the database.

Admittedly, with only a few thousand records, that shouldn't take very 
long, but it would also only help with a very few messages.
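
To make the cost concrete, here is a minimal sketch in Python of what a "closest hash within 10 bits" lookup has to do. The function names and the in-memory list are hypothetical, not anything from the setup described above; a database would do the equivalent work row by row.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes (popcount of the XOR)."""
    return bin(a ^ b).count("1")


def closest_hash(stored_hashes, new_hash, max_bits=10):
    """Return the stored hash nearest to new_hash, or None if none is within max_bits.

    Every stored hash is visited exactly once, so the cost grows linearly
    with the number of stored hashes: this is the full table scan.
    """
    best = None
    best_distance = max_bits + 1
    for candidate in stored_hashes:
        distance = hamming_distance(candidate, new_hash)
        if distance < best_distance:
            best, best_distance = candidate, distance
    return best
```

An ordinary B-tree index orders values numerically, which says nothing about bit distance, so it cannot shortcut this loop.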

> See above: only one record (or no record) is actually returned. But this 
> method is way faster than running SA and the rest of the heavy 
> scanners. In my work cluster, one mimedefang per server is usually busy 
> handling about 1.5 messages per second on average (while the other 19 are 
> just idling or not even spawned). My heavy network-enabled scanners 
> take about 3-5 seconds per message; if I did not have this type of 
> "prescanner" I would need a lot more power to handle the same amount.

I'm averaging less than 2 seconds per message for the virus scan plus 
the SA scan combined, so there is no way this would help me 
significantly.

You need to look into methods that keep you from getting to the data 
phase at all.  Only 25% of outside connections to my servers get to the 
data phase, and half of those are from known good sender/recipient/IP 
address tuples.  I scan those anyway because my load is light, but you 
could skip them (or just run SA on a small percentage).

If you *must* accept the data before you do anything, you can still 
skip SA scanning by rejecting/tempfailing at that point.  If you are 
running spam scans on more than about 30% of your connections, that's 
way too much.
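
As a rough sketch of the "skip them or scan a small percentage" idea, assuming you already keep a set of known-good sender/recipient/IP tuples (the function name, the set, and the 5% sample rate are all hypothetical):

```python
import random


def should_run_spam_scan(sender, recipient, ip, known_good,
                         sample_rate=0.05, rng=random.random):
    """Decide whether a message that reached the data phase gets a full spam scan.

    known_good is a set of (sender, recipient, ip) tuples that have
    delivered good mail before. sample_rate is an assumed value.
    """
    if (sender, recipient, ip) in known_good:
        # Known-good tuple: scan only a small random sample as a sanity check.
        return rng() < sample_rate
    # Anything else that got this far gets the full scan.
    return True
```

The `rng` parameter exists only so the sampling can be made deterministic when testing.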

Since you are using so few records in the hash table, what you are 
probably stopping with this hash is the spam runs that send the same 
message over and over.  Those just don't get through for me.  As an 
example, here's a bunch that never came back after the greylist 
tempfail (rcpt_to omitted to protect privacy):

ip_address	mail_from
87.97.202.94	<a-akatz at abraminterstate.com>
81.132.212.227	<a.a.m.kodan at abraminterstate.com>
195.22.231.99	<a.antonio at abrahami.com>
94.75.25.181	<a.barberi at abraminterstate.com>
88.156.163.113	<a.davies31 at abrahami.com>
87.206.46.159	<a.donnell at abrahami.com>
80.93.176.70	<a1200388 at abraminterstate.com>
85.98.93.51	<a1bestgirl at abraminterstate.com>
85.98.93.51	<a2tanker at abraminterstate.com>
81.4.135.186	<a2z888 at abraminterstate.com>
67.240.219.197	<a4size at abraminterstate.com>
88.248.167.151	<a54d54 at abraminterstate.com>
85.104.244.112	<a8858 at abraminterstate.com>
87.97.202.94	<aaafivestar at abraminterstate.com>
81.213.161.144	<aaatel at abraminterstate.com>
87.97.202.94	<aablood at abraminterstate.com>
67.240.219.197	<aaczweb at abraminterstate.com>
85.98.93.51	<aaflash at abrahami.com>
89.74.78.225	<aaminov at abraminterstate.com>
87.97.202.94	<aanfm at abraminterstate.com>
92.112.99.169	<aaron.d.vickers at abraminterstate.com>
88.246.216.22	<aaron.toscano at abraminterstate.com>
85.104.244.112	<aaron216 at abraminterstate.com>
88.156.163.113	<alexia at abrahami.com>
85.98.93.51	<a_17_7 at abraminterstate.com>
81.4.135.186	<a_big_flirt at abraminterstate.com>
82.198.191.41	<a_blanken at abraminterstate.com>
88.156.163.113	<a_mirkin at abrahami.com>
81.132.212.227	<contact at abrahinge.com>
94.75.25.181	<fico at abramet.org>
88.156.163.113	<home at abrahami.com>
148.235.108.146	<info at abrahdasht.com>
195.22.231.99	<kelly at abrahami.com>
88.246.216.22	<ld at abrahamholmes.com>
88.248.167.151	<marks at abrahamandassociates.com>
67.240.219.197	<reeves at abrahamandassociates.com>
195.22.231.99	<scanersimonesimone at abramet.org>
91.139.29.133	<shields at abrahamandassociates.com>
148.235.108.146	<sims at abrahamandassociates.com>
81.4.135.186	<strong at abrahamandassociates.com>
88.226.220.44	<swelch at abrahamwatkins.com>
67.240.219.197	<vater at abrametal.com>
67.240.219.197	<vinson at abrahamandassociates.com>
87.206.46.159	<wail.mohammedsaid at abrajoman.com>
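
For anyone unfamiliar with the greylist tempfail that stopped the senders listed above, here is a minimal sketch of the general technique. It is not my actual configuration: the class name, the in-memory dictionary, and the 300-second delay are assumptions for illustration.

```python
class Greylist:
    """Tempfail the first attempt from an unseen (ip, mail_from, rcpt_to) tuple.

    A legitimate MTA retries later and is accepted; most spam runs never
    come back. min_delay (seconds) is an assumed value.
    """

    def __init__(self, min_delay=300):
        self.min_delay = min_delay
        self.first_seen = {}  # (ip, mail_from, rcpt_to) -> time of first attempt

    def check(self, ip, mail_from, rcpt_to, now):
        key = (ip, mail_from, rcpt_to)
        # Record the first attempt; later attempts keep the original timestamp.
        first = self.first_seen.setdefault(key, now)
        if now - first < self.min_delay:
            return "tempfail"
        return "accept"
```

A real deployment would also expire old entries and persist the table, which this sketch leaves out.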


--
Jeff Rife |  
          | http://www.nabs.net/Cartoons/RhymesWithOrange/CatBed.jpg 




