[Mimedefang] Details on 2.22 heavy-load crash scenario

Fri Oct 18 12:44:01 EDT 2002

On Fri, 18 Oct 2002, Evan Cofsky wrote:

> Do you have any more details?  It would usually happen for me when the
> load average was around 10-20, which is common on our mail gateway,

Yup; that's when it would happen.

The problem is caused by the following sequence of events.  You need
to be looking at the source code for MIMEDefang 2.22 and have
a good understanding of C to follow the rest of the discussion:

- If the filter times out, then handleSlaveReceivedAnswer starts
an event to write the message "ERR Filter timed out" back to
mimedefang.  Note that it passes NULL as the "completion callback
function" on both lines 1080 and 1088.

- If the system is so heavily loaded that the message does not make it
back to mimedefang before Settings.clientTimeout seconds elapse, then
handle_writable (event_tcp.c, line 264) ends up being called with
the EVENT_FLAG_TIMEOUT bit set.  On line 277 of that function, we
try to call the completion callback function, which is NULL, so we
crash.  Note that normally, this short reply from the multiplexor back
to mimedefang does not time out, and the only other parts of
handle_writable which try to call the callback function first check if
it's NULL (lines 289 and 303).

- If you absolutely cannot upgrade MIMEDefang, you can reduce the
likelihood of the crash by increasing Settings.clientTimeout.
The default is 10 seconds; you can increase it with the "-c" option
to mimedefang-multiplexor.  A setting of 120 should make the crash
situation extremely unlikely.
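
For anyone who wants to see the failure mode in isolation, here is a
minimal sketch of the pattern described above: a timeout branch that
checks the completion callback for NULL before calling it.  This is
not the MIMEDefang source and not necessarily how the problem was
fixed there.  Only the name EVENT_FLAG_TIMEOUT comes from the
discussion; its value, the struct, and all function names below are
hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical value; only the flag's name appears in the discussion. */
#define EVENT_FLAG_TIMEOUT 0x4u

/* Hypothetical completion-callback type: user data plus a timed-out flag. */
typedef void (*completion_fn)(void *data, int timed_out);

/* Hypothetical stand-in for the per-write state kept by the event code. */
struct write_state {
    completion_fn on_done;  /* may be NULL, as in the "ERR Filter timed out" path */
    void *data;
};

/* Call the callback only if one was supplied.  Returns 1 if it was called,
 * 0 if there was none.  An unguarded st->on_done(...) here is the crash. */
static int notify_done(const struct write_state *st, int timed_out)
{
    assert(st != NULL);
    if (st->on_done == NULL) {
        return 0;
    }
    st->on_done(st->data, timed_out);
    return 1;
}

/* Sketch of the timeout branch of a writable handler.  The normal write
 * path is omitted and reported as -1. */
static int handle_writable_sketch(const struct write_state *st, unsigned flags)
{
    if (flags & EVENT_FLAG_TIMEOUT) {
        return notify_done(st, 1);
    }
    return -1;
}

/* Example callback: counts invocations through its data pointer. */
static void count_calls(void *data, int timed_out)
{
    (void)timed_out;
    ++*(int *)data;
}
```

With a NULL callback and the timeout bit set, handle_writable_sketch
returns 0 instead of jumping through a null pointer; with count_calls
installed, it calls the callback once and returns 1.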

Regards,

David.