[Mimedefang] SpamAssassin via mimedefang is slow
apex at xepa.nl
Fri Nov 7 18:53:24 EST 2008
Daniel Bourque wrote:
> hope that helps out someone else who has the same issue.
Well, I solved it another way, and not because the standard MD way could
not handle it. I found that parallel spamassassin processes would, for
some reason, deadlock when running on the same machine, i.e. all children
are in the busy state but not doing anything. This is probably MySQL
related (they all use bayes via a database), although I have not verified
this. So mimedefang could lock up for this reason. These lockups seem to
be related to spam runs targeting one of the mail servers in our
redundant set. That machine would just lock up while the other machines
sat around doing nothing at all. So I changed some things around.
I split sendmail, mimedefang, its heavy filters and the mailstore.
With the heavy filters split off (spamassassin in this case; I am testing
a way to split clamav, but that is not production ready), a spam run on
one server in my cluster still loads the mimedefang on that server, but
all servers use their spamassassin children to handle the load.
I send my mail via the perl spamclient to a load balancer that spreads it
over all the members in the cluster. I did this because problems occur
when running too many spamd children on one machine (you could run 5
spamd children comfortably on each of 5 virtual machines on one
bare-metal host, but not 25 spamd children directly on the same bare
metal; it would deadlock horribly. Your mileage may vary).
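For anyone who wants to see what "send the mail to spamd through a load
balancer" looks like on the wire, here is a minimal sketch in Python
(not the perl client I actually use). It speaks the plain spamd CHECK
command over TCP. The host name spamd-lb.example.net is a made-up
placeholder for the load balancer address, and 783 is only the usual
spamd default port; both are assumptions, so adjust them to your setup.

```python
import socket


def build_check_request(message: bytes) -> bytes:
    """Build a spamd CHECK request for one raw mail message."""
    header = b"CHECK SPAMC/1.5\r\nContent-length: %d\r\n\r\n" % len(message)
    return header + message


def parse_check_response(response: bytes):
    """Return (is_spam, score, threshold) from a spamd CHECK reply."""
    lines = response.decode("ascii", errors="replace").split("\r\n")
    status = lines[0].split(" ", 2)
    # Expected status line: "SPAMD/1.1 0 EX_OK"
    if len(status) < 3 or not status[0].startswith("SPAMD/") or status[1] != "0":
        raise ValueError("unexpected spamd status line: %r" % lines[0])
    for line in lines[1:]:
        # Expected header: "Spam: True ; 15.0 / 5.0"
        if line.lower().startswith("spam:"):
            verdict, scores = line[5:].split(";", 1)
            score, threshold = scores.split("/", 1)
            is_spam = verdict.strip().lower() in ("true", "yes")
            return is_spam, float(score), float(threshold)
    raise ValueError("no Spam: header in spamd response")


def check_message(message: bytes,
                  host: str = "spamd-lb.example.net",  # hypothetical balancer
                  port: int = 783,                      # assumed default port
                  timeout: float = 30.0):
    """Send one message to spamd (via the balancer) and return the verdict."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_check_request(message))
        sock.shutdown(socket.SHUT_WR)  # tell spamd we are done sending
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return parse_check_response(b"".join(chunks))
```

Since every message is its own short TCP connection, the balancer can
hand each one to whichever cluster member has a free spamd child.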
Anyhow, in my situation scan times are between 3-6 seconds per message
with all network scans enabled (all clients use a compiled set of static
rules, which helps a lot). Mail me offlist and I can help you set up
something similar. I run my own DNS servers to help spamassassin.
Conclusion: in the right situation spamassassin can scan with network
tests enabled within 3-6 seconds, even with 30 mimedefang children
running on one server (with 30 spamd clients on the cluster; btw I have
a secondary cluster that will kick in when all mail servers are under
load).
Another tip: take a look at Digest::Nilsimsa (in my implementation I can
detect 60% of the spam at the data phase without resorting to heavy
scanners like spamassassin, and temp fail it).
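To give an idea of how Nilsimsa digests can be used for this, here is a
small Python sketch of the comparison step only (computing the digest
itself is what Digest::Nilsimsa does). A Nilsimsa digest is 256 bits, 64
hex characters, and two digests are usually compared by counting the
bits they share: 128 means identical, around 0 means unrelated. The
threshold of 100 and the idea of keeping a list of digests of known spam
are assumptions for illustration, not a description of my own setup.

```python
def nilsimsa_score(digest_a: str, digest_b: str) -> int:
    """Compare two hex Nilsimsa digests; result ranges from -128 to 128."""
    a = bytes.fromhex(digest_a)
    b = bytes.fromhex(digest_b)
    if len(a) != 32 or len(b) != 32:
        raise ValueError("a Nilsimsa digest must be 64 hex characters")
    # Count differing bits over all 256 bit positions.
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return 128 - differing


def looks_like_known_spam(digest: str, known_spam_digests, threshold: int = 100) -> bool:
    """True if the digest is close to any previously seen spam digest.

    The threshold is a hypothetical value; tune it on your own mail.
    """
    return any(nilsimsa_score(digest, known) >= threshold
               for known in known_spam_digests)
```

A filter could temp fail the message when looks_like_known_spam()
returns true and only pass everything else on to the heavy scanners.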