[Mimedefang] Off Topic - Mail Message and MySQL

Phil Eschallier phil at BuxTech.Com
Wed Jun 18 13:03:01 EDT 2003


David;

I believe that I could have expressed my thoughts better.  Actually, I was
thinking, "what if the DB server is down or you hit your connect limit?"
-- you'd have a lag for the timeout period in your filter.  Then what's the
disposition of that data?  A flood of spam or virus traffic could easily
push connections up to a high level.  I don't believe that the connect,
insert, then disconnect operation would be that expensive if all is running
well.

In contrast, the file system should always be there (I'll probably receive a
shot or two for that statement).  Then in post processing, if the DB is down
or there are other connection issues, the job could just retry.  You could
also control the number of DB connections by the number of threads running
inserts in your post processing.
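
A minimal sketch of that spool-then-load idea, in Python rather than in a
real filter.  Everything named here is an assumption for illustration: the
spool directory, the JSON record format, and the `insert` callable that
stands in for whatever DB insert you actually use.

```python
import json
import os
import tempfile


def spool_record(spool_dir, record):
    """Write one record to the spool directory atomically; return its path."""
    os.makedirs(spool_dir, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=spool_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        json.dump(record, fh)
    # Rename only after the file is complete, so the loader never sees
    # a half-written record (rename is atomic within one filesystem).
    final_path = tmp_path[:-len(".tmp")] + ".json"
    os.replace(tmp_path, final_path)
    return final_path


def drain_spool(spool_dir, insert):
    """Try to load every spooled record; keep the files if an insert fails."""
    loaded = 0
    for name in sorted(os.listdir(spool_dir)):
        if not name.endswith(".json"):
            continue  # skip records that are still being written
        path = os.path.join(spool_dir, name)
        with open(path) as fh:
            record = json.load(fh)
        try:
            insert(record)  # hypothetical DB insert supplied by the caller
        except Exception:
            break  # DB looks unavailable; leave the rest for the next run
        os.remove(path)
        loaded += 1
    return loaded
```

The filter would only ever call spool_record(); drain_spool() would run
from a separate job, so a DB outage delays the load instead of stalling the
filter, and the number of such jobs caps the number of DB connections.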

However, I'm applying my specific concerns about data and the filter, and
combining them with experience from other similar efforts ... and I'm certain
that you have specific design reasons for what you've done.  It is also
certain that list readers will have their own specific needs.  Everyone
will solve the problem in a way that best meets their needs; I was only
voicing concerns.

... Phil



 
 

-----Original Message-----
From: mimedefang-admin at lists.roaringpenguin.com
[mailto:mimedefang-admin at lists.roaringpenguin.com] On Behalf Of David F.
Skoll
Sent: Wednesday, June 18, 2003 12:15 PM
To: mimedefang at lists.roaringpenguin.com
Subject: RE: [Mimedefang] Off Topic - Mail Message and MySQL

On Wed, 18 Jun 2003, Phil Eschallier wrote:

> I don't have hard numbers [yet], but we struggle to keep MD + SA running in
> environments processing more than 1 million e-mail per day.

The culprit is almost surely SA.  Our tests show the DB overhead is
minimal compared to SA scanning.  The general rule for filter-writing
is: Avoid calling SpamAssassin!  If 10% of your mail comes from a big
client or partner, and you can skip SA on mail from there, do it.

For example, in some situations, we avoid calling SA altogether (but
still do a number of DB queries).  Filter times are typically on the
order of 20-50ms (AMD Duron, 900MHz.)  As soon as SA gets involved, we
see filter times in the hundreds or even thousands of milliseconds.
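
A rough sketch of the "skip SA for known senders" rule, written in Python
for illustration only (a real MIMEDefang filter is Perl).  The domain list
and the function name are hypothetical, not part of any existing filter.

```python
# Hypothetical list of partner domains whose mail bypasses the spam scan.
TRUSTED_DOMAINS = {"partner.example"}


def needs_spam_scan(sender, trusted=TRUSTED_DOMAINS):
    """Return False when the envelope sender's domain is on the trusted list."""
    # Take the text after the last "@" and drop any angle brackets.
    domain = sender.rsplit("@", 1)[-1].strip("<>").lower()
    return domain not in trusted
```

The envelope sender is easy to forge, so in practice a check like this
would more likely key on the relay's IP address than on the domain alone.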

(Of course, if your DB schema is poorly-tuned, it can kill you.  If you
have to do a sequential scan on a table with a million rows, well...)

Hmm.  I should do some tests with and without the DB for real numbers...

--
David.
_______________________________________________
MIMEDefang mailing list
MIMEDefang at lists.roaringpenguin.com
http://lists.roaringpenguin.com/mailman/listinfo/mimedefang





