[Mimedefang] Network issues causing broken pipe errors (and subsequent tempfails)?

Fri Feb 27 18:04:31 EST 2004

Last night I saw a MIMEDefang error in my mail logs that I had never
noticed before:

### TRACKING MESSAGE: i1R1dKT7023699
Feb 26 23:02:39 mx sendmail[23699]: i1R1dKT7023699:
from=<user1 at example.com>, size=14033627, class=0, nrcpts=2,
msgid=<7023F05B3EADD51184060008C75CA21EBED2FC at example>, proto=ESMTP,
daemon=MTA, relay=example.com [x.x.x.x]
Feb 26 23:02:39 mx sendmail[23699]: i1R1dKT7023699: Milter (mimedefang):
write(L) returned -1, expected 5: Broken pipe
Feb 26 23:02:39 mx sendmail[23699]: i1R1dKT7023699: Milter (mimedefang): to
error state
Feb 26 23:02:39 mx sendmail[23699]: i1R1dKT7023699: Milter: data, reject=451
4.7.1 Please try again later
Feb 26 23:02:39 mx sendmail[23699]: i1R1dKT7023699: to=<user2 at example.com>,
delay=03:22:54, pri=14093627, stat=Please try again later
Feb 26 23:02:39 mx sendmail[23699]: i1R1dKT7023699: to=<user3 at example.com>,
delay=03:22:54, pri=14093627, stat=Please try again later

After some more research I discovered that the broken pipe errors occur
somewhat regularly, and usually correspond with such sendmail errors as
"timeout waiting for input from servername during message collect".  What
made this one stand out is that it caused MD to tempfail the message.  (In
fact, the only reason I noticed it is that I have a script running that
alerts me when my mail exchanger tempfails a message for any reason.)  Today
the relay tried to redeliver the message and the same error occurred.  The
message is quite large (around 14 MB), but I have successfully received
messages that were up to 20000000 bytes in size (my server's limit) without
issue.
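
For anyone who wants to look for the same pattern in their own logs, here
is a minimal sketch in Python of the kind of check described above. It is
not the alert script mentioned in this post; the function name and flag
names are hypothetical, and it assumes each syslog entry sits on a single
line (the excerpt above was wrapped by the mail client).

```python
import re
from collections import defaultdict

# Matches "sendmail[pid]: <queue id>: <rest of entry>".
# Assumption: one syslog entry per line, not wrapped as in the excerpt.
LINE_RE = re.compile(r"sendmail\[\d+\]: (?P<qid>\w+): (?P<rest>.*)")


def find_milter_tempfails(lines):
    """Return {queue_id: flags} for messages that logged both a milter
    broken pipe and a milter 451 tempfail."""
    flags = defaultdict(set)
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        qid, rest = m.group("qid"), m.group("rest")
        if "Milter" in rest and "Broken pipe" in rest:
            flags[qid].add("broken_pipe")
        if "Milter" in rest and "reject=451" in rest:
            flags[qid].add("tempfail")
        if "timeout waiting for input" in rest:
            flags[qid].add("collect_timeout")
    wanted = {"broken_pipe", "tempfail"}
    return {q: f for q, f in flags.items() if wanted <= f}


# Two entries from the excerpt above, joined back onto single lines.
SAMPLE = [
    "Feb 26 23:02:39 mx sendmail[23699]: i1R1dKT7023699: Milter (mimedefang): "
    "write(L) returned -1, expected 5: Broken pipe",
    "Feb 26 23:02:39 mx sendmail[23699]: i1R1dKT7023699: Milter: data, "
    "reject=451 4.7.1 Please try again later",
]

if __name__ == "__main__":
    for qid, found in find_milter_tempfails(SAMPLE).items():
        print(qid, sorted(found))
```

Grouping by queue ID keeps the broken pipe and the resulting 451 together,
which makes it easier to tell them apart from plain collect timeouts that
never reach the milter's tempfail path.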

While trying to troubleshoot the problem, I temporarily placed a check for
this particular relay in filter_relay() and had MD return "accept and no
more filtering" to sendmail, just in case a problem with my filter was
causing this broken pipe error.  After doing that I noticed the relay once
again tried to redeliver, but this time it failed with the "timeout
waiting for input from servername during message collect" error.  So
apparently there was some network issue between our mail servers that was
causing the message to time out in transit.

I had the sender of this huge message send it to a different address of mine
to see if there was something in the message itself that was causing a
problem.  I received it and then did an MTA-level redirect through my MD box
and received it without any problem.  But I did the redirect from a host
that is on the same physical network as my box, so the transfer was very
fast (30 seconds versus 2 hours(!) for the original relay in question).
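
To put those two transfers side by side, here is a rough calculation in
Python. It uses the size from the log entry above (14033627 bytes) and
the approximate durations quoted here, so the results are only ballpark
figures: about 1.9 kB/s for the original relay versus about 468 kB/s for
the local redirect.

```python
def throughput_bytes_per_sec(size_bytes, seconds):
    """Average transfer rate in bytes per second."""
    return size_bytes / seconds


SIZE = 14033627          # size= value from the sendmail log entry
SLOW_SECONDS = 2 * 3600  # "2 hours", approximate, original relay
FAST_SECONDS = 30        # "30 seconds", approximate, local redirect

slow = throughput_bytes_per_sec(SIZE, SLOW_SECONDS)
fast = throughput_bytes_per_sec(SIZE, FAST_SECONDS)

if __name__ == "__main__":
    print(f"original relay: {slow:.0f} bytes/s")
    print(f"local redirect: {fast:.0f} bytes/s")
    print(f"ratio:          {fast / slow:.0f}x")
```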

Basically I say all this to ask a question.  Is it possible that this
message is taking so long to transfer that the MD slave is dying before it
is fully received, and this is what is causing the broken pipe error?  I am
using the "-l" option to the multiplexor and it is not logging anything for
this message, and there are no log entries to indicate that the slave is
being killed, so I'm at a loss.  Normally I would not care if the slave
just died and sendmail aborted, but MD is tempfailing because of this
error, and to the sender it looks like a problem with my mail server's
filtering instead of a network issue.

I'd appreciate any insight that can be offered...

___________________________________________
Michael Sims
Project Analyst - Information Technology
Crye-Leike Realtors
Office: (901)758-5648  Pager: (901)769-3722
___________________________________________