[Mimedefang] milter timing out.

Mon Feb 2 14:42:41 EST 2004

> 
> Jan 31 20:51:03 norm sm-mta[27815]: i112jv3O027815: Milter (mimedefang):
> timeout before data read
> Jan 31 20:51:03 norm sm-mta[27815]: i112jv3O027815: Milter (mimedefang):
> to error state

We have had the exact same problem for about a month. And, I agree that stopping sendmail and mimedefang is the only way to recover. We have more-or-less "fixed" the problem by mangling both the sendmail and mimedefang configurations. I am not saying that this is the best idea in the world, or that it is the correct fix, but it has at least caused the problem to go away.

Here is what we have done (values are for our MTAs which handle about 25K to 50K emails per day per MTA):
  1) sendmail:
     a) Set the ConnectionRateThrottle option to a low value (4) so that sendmail will not accept more than 4 connections per second. Connections are not refused; sendmail just delays responding to them. (With MyDoom, we have seen connection rate bursts > 100 new connections/sec/MTA.)
     b) Set the DelayLA option to a low value (6) so that sendmail starts delaying SMTP responses if LA > 6.
     c) Set the QueueLA option to a middle value (9) so that sendmail queues but does not deliver mail if LA > 9.
     d) Set the RefuseLA option to a higher middle value (12) so that sendmail rejects new connections if LA > 12.
     e) Set the MaxDaemonChildren option to 60 to limit the total number of daemons that can run at once.
     f) Set milter timeouts as follows: T=C:15M;S:2M;R:2M;E:15M
  2) mimedefang:
     a) Set MX_MINIMUM=5 to keep at least 5 slaves running.
     b) Set MX_BACKLOG=60 (value of MaxDaemonChildren) to allow for enough connections for each daemon.
     c) Set MX_MAXIMUM=60 (value of MaxDaemonChildren) to allow for enough slaves for each daemon. (We seemed to run out of slaves about a minute before the first timeout we received.)
     d) Set MX_BUSY=600 to limit the maximum time (in seconds) a slave runs a single task (milter timeouts C: and E: were set to 150% of this value, i.e. 900 seconds = 15M).
     e) Set MX_MIN_SLAVE_DELAY=1 to keep the startup of new slaves from killing LA. 
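
The relationships between these values (C: and E: at 150% of MX_BUSY, MX_BACKLOG and MX_MAXIMUM matched to MaxDaemonChildren, load-average thresholds in increasing order) can be checked with a few lines of Python. This is only a rough sketch: the numbers come from the list above, but the helper names `milter_timeout_minutes` and `check` are made up for illustration and are not part of sendmail or mimedefang.

```python
# Values copied from the list above. The dictionaries and helper
# functions are illustrative only, not part of sendmail or mimedefang.
sendmail = {
    "ConnectionRateThrottle": 4,
    "DelayLA": 6,
    "QueueLA": 9,
    "RefuseLA": 12,
    "MaxDaemonChildren": 60,
}
mimedefang = {
    "MX_MINIMUM": 5,
    "MX_BACKLOG": 60,
    "MX_MAXIMUM": 60,
    "MX_BUSY": 600,  # seconds
    "MX_MIN_SLAVE_DELAY": 1,
}


def milter_timeout_minutes(mx_busy_seconds, factor=1.5):
    """Return factor * MX_BUSY, converted to whole minutes."""
    return int(mx_busy_seconds * factor) // 60


def check(sendmail, mimedefang):
    """Return a list of inconsistencies between the two sets of values."""
    problems = []
    if not sendmail["DelayLA"] < sendmail["QueueLA"] < sendmail["RefuseLA"]:
        problems.append("expected DelayLA < QueueLA < RefuseLA")
    for key in ("MX_BACKLOG", "MX_MAXIMUM"):
        if mimedefang[key] < sendmail["MaxDaemonChildren"]:
            problems.append(f"{key} is smaller than MaxDaemonChildren")
    if mimedefang["MX_MINIMUM"] > mimedefang["MX_MAXIMUM"]:
        problems.append("MX_MINIMUM is larger than MX_MAXIMUM")
    return problems


minutes = milter_timeout_minutes(mimedefang["MX_BUSY"])
print(f"T=C:{minutes}M;S:2M;R:2M;E:{minutes}M")
print(check(sendmail, mimedefang) or "values are consistent")
```

With MX_BUSY=600 this prints the same T= string given in 1f), so changing MX_BUSY later shows what C: and E: would need to become to keep the same 150% margin.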

When I get a chance to breathe, I want to try to implement multiplexor queuing and see if we can work with fewer slaves than we currently have configured... But the way things are going now, it will be a while before I get a chance to experiment.

Hope this helps!

Jon Kibler

P.S. Will someone please let me know if what I have done is in any way stupid? All that I can say for sure is that it works -- at least for now! If someone has a better idea... I would like to know.

P.P.S. BTW, I would also like to know which milter timeout value affects the 'timeout before data read' error. Was going to post that question, but never got around to it...

--
Jon R. Kibler
Chief Technical Officer
A.S.E.T., Inc.
Charleston, SC  USA
(843) 849-8214
