[Mimedefang] Busy Timeout Slaves

William Parr wparr at uvic.ca
Wed Mar 31 14:26:37 EDT 2010


Hello,

  Have an issue with mimedefang slave "busy timeouts" a few times a week on all of my edge/outbound servers. The vast majority of times, the number of slaves in use is 10 to 20 at any time. When the problem arises, the slaves jump up to the max of 200 and 5 minutes later the busy slaves get terminitated and all mail becomes deferred. A restart of sendmail fixes the problem.

 I have added debug statements in our mimedefang-filter to try and determine where the problem. Every time this issue occurs, it seems to hang in the same spot for all email during that instance, but the next time the issue occurs, it will hang in a different spot ie: The example below seems to indicate it is hanging in  filter_begin, but I have seen it hang in filter-recipients, and filter and sometimes after  "milter=mimedefang, action=body, continue" and before  filter_begin

Utilizing a RAM disk for mimedefang. No system load, cpu or memory issues that would explain the "busy timeouts".

Any help you can give me in solving this issue would be most appreciated. Thanks in advance

Interesting enough a AIX v5.2 server doing the same function doesn't have this problem

Details: 

1,000,000+ emails processed a day

/usr/local/bin/mimedefang-multiplexor -m 30 -x 70 -q 100 -w 1 -b 300 -c 30 -s /var/mimedefang/mimedefangm.sock -l -T -L 60 -M 200000\

Edge/Outbound Servers

Redhat                        4 - 32 bit  
Mimedefang                v2.63
Spamassassin            v3.2.4
Sophie                         v23.05
Sophos                        v3.4 engine v4.5 IDE
Perl                              v5.8.5

Redhat                        5 - 64 bit
Mimedefang                v2.67
Spamassassin            v 3.2.5
Sophie                         v23.05
Sophos                        v3.4 engine v4.5 IDE
Perl                              v 5.10.1

RAM disk:
Filesystem            Size  Used Avail Use% Mounted on
none                  512M  156M  357M  31% /var/spool/MIMEDefang

CPU Average 30% and under
No swapping or paging occurs at all



Sample email transaction:

2010-03-25T01:12:09-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: Milter (mimedefang): init success to negotiate
2010-03-25T01:12:09-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: Milter: connect to filters
2010-03-25T01:12:09-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: milter=mimedefang, action=connect, continue
2010-03-25T01:12:10-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: milter=mimedefang, action=helo, continue
2010-03-25T01:12:11-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: milter=mimedefang, action=mail, continue
2010-03-25T01:12:11-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info mimedefang.pl[12961]: o2P8C9ji002495: Enter: filter_recipient
2010-03-25T01:12:11-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info mimedefang.pl[12961]: o2P8C9ji002495: RCPT, RELAY=[xxxxxxx] [xxxxxxx], FROM=<xxxxxxxx>, TO=<xxxxxxxxxx>
2010-03-25T01:12:11-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info mimedefang.pl[12961]: o2P8C9ji002495: Entering: UVcheck_against_smtp_server_db
2010-03-25T01:12:11-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info mimedefang.pl[12961]: o2P8C9ji002495: Leaving: UVcheck_against_smtp_server_db
2010-03-25T01:12:11-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info mimedefang.pl[12961]: o2P8C9ji002495: Enter: relayIsLocal
2010-03-25T01:12:11-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info mimedefang.pl[12961]: o2P8C9ji002495: Leaving: relayIsLocal
2010-03-25T01:12:11-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info mimedefang.pl[12961]: o2P8C9ji002495: Enter: UVrelayIsAuth
2010-03-25T01:12:11-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info mimedefang.pl[12961]: o2P8C9ji002495: Enter: relayIsLocal
2010-03-25T01:12:11-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info mimedefang.pl[12961]: o2P8C9ji002495: Leaving: relayIsLocal
2010-03-25T01:12:11-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info mimedefang.pl[12961]: o2P8C9ji002495: Enter: UVrelayIsAuth
2010-03-25T01:12:11-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.notice mimedefang.pl[12961]: o2P8C9ji002495: Scantime: filter_recipient(ok) 0.24 mailertable 0.00 UVrelay_is_blacklisted 0.24 spamhaterdb 0
2010-03-25T01:12:11-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info mimedefang.pl[12961]: o2P8C9ji002495: Leaving: filter_recipient
2010-03-25T01:12:11-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: milter=mimedefang, action=rcpt, continue
2010-03-25T01:12:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: from=<xxxxxxxx>, size=1974, class=0, nrcpts=1, msgid=<201003250812.o2P8C9ji002495 at xxxx.xxx.xxx>, proto=SMTP, daemon=MTA, relay=[xxxxxxxxx]
2010-03-25T01:12:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: milter=mimedefang, action=header, continue
2010-03-25T01:12:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: milter=mimedefang, action=header, continue
2010-03-25T01:12:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: milter=mimedefang, action=header, continue
2010-03-25T01:12:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: milter=mimedefang, action=header, continue
2010-03-25T01:12:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: milter=mimedefang, action=header, continue
2010-03-25T01:12:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: milter=mimedefang, action=header, continue
2010-03-25T01:12:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: milter=mimedefang, action=header, continue
2010-03-25T01:12:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: milter=mimedefang, action=header, continue
2010-03-25T01:12:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: milter=mimedefang, action=header, continue
2010-03-25T01:12:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: milter=mimedefang, action=eoh, continue
2010-03-25T01:12:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: milter=mimedefang, action=body, continue
2010-03-25T01:12:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info mimedefang.pl[30356]: o2P8C9ji002495: Enter: filter_begin
2010-03-25T01:17:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info mimedefang-multiplexor[300]: Killing busy slave 76 (pid 30356): Busy timeout
2010-03-25T01:17:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info mimedefang-multiplexor[300]: stats 1269505032.784 KillSlave slave=76 nslaves=199 nbusy=198 reason="Busy timeout"
2010-03-25T01:17:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info mimedefang-multiplexor[300]: stats 1269505032.784 EndFilter slave=76 nslaves=199 nbusy=198 numRequests=139 y=198 numRequests=139
2010-03-25T01:17:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.err mimedefang[325]: Error from multiplexor: ERR Filter timed out - system may be overloaded (consider increasing busy timeout)
2010-03-25T01:17:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.err sm-mta[2495]: o2P8C9ji002495: Milter read(mimedefang): timeout before data read
2010-03-25T01:17:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: Milter (mimedefang): to error state
2010-03-25T01:17:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: Milter: data, reject=451 4.3.2 Please try again later
2010-03-25T01:17:12-07:00 local at xxxx.xxx.xxx/xxxx.xxx.xxx mail.info sm-mta[2495]: o2P8C9ji002495: to=<xxxxxxxx>, delay=00:05:01, pri=31974, stat=Please try again later
-- 
William Parr                   




More information about the MIMEDefang mailing list