[Mimedefang] mimedefang hangs, goes into CPU loop under high load

Tue Feb 25 07:16:02 EST 2003

Hello All,

I've been doing some stress testing, using mimedefang 2.28, Sendmail
8.12.6, RedHat 7.3. If I bombard my test server with email, I quickly
get to a situation where mimedefang hangs, blocking inbound mail. From
the logs, the problem starts with a thread creation error:

Feb 25 11:13:28 rhserver3 mimedefang[13267]: MIMEDefang-2.28: thread_create() failed: 11, try again

Running out of threads would be OK if the error were handled more
gracefully and the system recovered. I appreciate that the thread
creation is done in libmilter rather than mimedefang, but I thought I
would see if anyone else has seen these problems and had any ideas.
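
To illustrate the kind of graceful handling I mean, here is a minimal
Python sketch. It is not libmilter's code, and the names
start_with_backoff and spawn are made up for the example. It assumes a
failed thread creation is a temporary condition: it retries a few times
with a growing delay, then gives up and returns None so the caller can
defer or reject the work rather than spin.

```python
import threading
import time


def start_with_backoff(target, attempts=5, first_delay=0.01, spawn=None):
    """Try to start a worker thread, backing off when creation fails.

    Returns whatever spawn() returns (the started thread by default),
    or None if every attempt failed. Hypothetical helper, for
    illustration only.
    """
    if spawn is None:
        def spawn(fn):
            # A fresh Thread object is created for every attempt.
            t = threading.Thread(target=fn, daemon=True)
            t.start()  # raises RuntimeError if no thread can be created
            return t

    delay = first_delay
    for attempt in range(attempts):
        try:
            return spawn(target)
        except RuntimeError:
            if attempt == attempts - 1:
                break  # out of attempts, do not sleep again
            time.sleep(delay)
            delay *= 2  # exponential backoff between attempts
    return None
```

A caller would check for None and answer the connection with a
temporary failure, leaving the accept loop free to serve later
connections once threads become available again.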

Output from strace around this failure is:

11:13:28 select(6, [5], NULL, [5], {5, 0}) = 1 (in [5], left {4, 850000})
11:13:28 accept(5, {sin_family=AF_UNIX, path=@}, [2]) = 505
11:13:28 setsockopt(505, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
11:13:28 rt_sigprocmask(SIG_SETMASK, NULL, [HUP INT TERM RTMIN], 8) = 0
11:13:28 write(4, "\240p\3@\0\0\0\0\0\0\0\0\0\323\4\10\300#P\247\3@\0\200"..., 148) = 148
11:13:28 rt_sigprocmask(SIG_SETMASK, NULL, [HUP INT TERM RTMIN], 8) = 0
11:13:28 rt_sigsuspend([HUP INT TERM] <unfinished ...>
11:13:28 --- SIGRTMIN (Real-time signal 0) ---
11:13:28 <... rt_sigsuspend resumed> )  = -1 EINTR (Interrupted system call)
11:13:28 sigreturn()                    = ? (mask now [HUP INT TERM RTMIN])
11:13:28 time([1046171608])             = 1046171608
11:13:28 open("/etc/localtime", O_RDONLY) = 506
11:13:28 fstat64(506, {st_mode=S_IFREG|0644, st_size=1323, ...}) = 0
11:13:28 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40017000
11:13:28 read(506, "TZif\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\7\0\0\0\7\0"..., 4096) = 1323
11:13:28 close(506)                     = 0
11:13:28 munmap(0x40017000, 4096)       = 0
11:13:28 getpid()                       = 13267
11:13:28 rt_sigaction(SIGPIPE, {0x4002ef00, [], 0x4000000}, {SIG_IGN}, 8) = 0
11:13:28 socket(PF_UNIX, SOCK_DGRAM, 0) = 506
11:13:28 fcntl64(506, F_SETFD, FD_CLOEXEC) = 0
11:13:28 connect(506, {sin_family=AF_UNIX, path="/dev/log"}, 16) = 0
11:13:28 send(506, "<19>Feb 25 11:13:28 mimedefang[1"..., 93, 0) = 93
11:13:29 rt_sigaction(SIGPIPE, {SIG_IGN}, NULL, 8) = 0
11:13:29 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
11:13:30 select(0, NULL, NULL, NULL, {0, 0}) = 0 (Timeout)
11:13:30 select(0, NULL, NULL, NULL, {0, 0}) = 0 (Timeout)
11:13:30 select(0, NULL, NULL, NULL, {0, 0}) = 0 (Timeout)
11:13:30 select(0, NULL, NULL, NULL, {0, 0}) = 0 (Timeout)
11:13:30 select(0, NULL, NULL, NULL, {0, 0}) = 0 (Timeout)

From this point on, mimedefang goes into a tight CPU loop, calling
select() with a zero timeout over and over, and will not process any
further requests from sendmail.

Looking at the number of sendmail processes outstanding, it looks like
there is a 256 thread limit. I appreciate that processing 256
concurrent messages is extreme, but it is important to me that the
servers do not fail under high load, and I don't want to throttle back
sendmail connections or the like to limit the problem.

One other question: does anyone have any ideas for deferring milter
processing when under high load, without refusing to accept inbound
messages? The milters run during the SMTP transaction with the sending
mail server, so queuing the mail internally when the load increases
still runs the milters, which (in my case) are a significant cause of
the load!

I would appreciate any ideas.

Many Thanks
Steve