[Mimedefang] MsgID between sendmail and mimedefang issue
Bill Cole
mdlist-20140424 at billmail.scconsult.com
Thu Nov 18 10:48:50 EST 2021
On 2021-11-18 at 08:05:11 UTC-0500 (Thu, 18 Nov 2021 15:05:11 +0200)
Ernst du Plooy via MIMEDefang <mimedefang at lists.mimedefang.org>
is rumored to have said:
> I'm seeing weird errors on some instances on Mimedefang lately. This
> only
> happens once or twice a day. Notice the 3rd line from below. The
> mgsid (
> 1AH5u8x0021136) from sendmail is not the same as the id used for the
> temporary directory created by mimedefang (1AH5sPfg019398). It seems
> that
> the message id between sendmail and mimedefang gets mixed up.
Being intimately familiar with how Sendmail and MIMEDefang handle the
queueID, I feel confident in saying that this is NOT what is happening.
The Sendmail queueID is created by sendmail at runtime and is
meaningful: it is composed of a timestamp in a sendmail-specific format
and a sendmail process ID. MD cannot invent a queueID. 1AH5sPfg019398
was created by the sendmail process with PID 19398. 1AH5u8x0021136 was
created *almost 2 minutes later* by the sendmail process with PID 21136.
> Nov 17 07:54:25 xxx01 sendmail[19398]: 1AH5sPfg019398: Milter:
> to=<help at email.domain>, reject=554 5.7.1 Connection Refused: Relay
> Server
> (a.b.61.62) blacklisted
There should be something about 1AH5sPfg019398 logged before this. The
munging hurts here, as you've eliminated the relay host IP which might
be useful.
> Nov 17 07:54:25 xxx01 sendmail[19398]: 1AH5sPfg019398: lost input
> channel
> from mail-ct2zaf01lp2175.outbound.protection.outlook.com
> [104.47.19.175] to
> MTA after data
That MS machine dropped the connection before getting a reply to the
second step of the DATA command, i.e. after sending the message but
before getting an acknowledgement or rejection. Maybe it never saw the
rejection.
> Nov 17 07:54:25 xxx01 sendmail[19398]: 1AH5sPfg019398:
> from=<user at email.domain>, size=330137, class=0, nrcpts=0, proto=ESMTP,
> daemon=MTA, relay=mail-ct2zaf01lp2175.outbound.protection.outlook.com
> [104.47.19.175]
1:43 time gap... What happened here?
> Nov 17 07:56:08 xxx01 mimedefang.pl[10411]: 1AH5sPfg019398:
> filter_recipient tempfailed recipient <help at email.domain>
> Nov 17 07:56:08 xxx01 mimedefang.pl[10411]: 1AH5sPfg019398: Cannot
> open
> COMMANDS file from mimedefang: No such file or directory
Strange. Why does MD care about 1AH5sPfg019398 1:43 after Sendmail
disposed of it?
> Nov 17 07:56:08 xxx01 mimedefang.pl[10411]: 1AH5sPfg019398: SPF
> Result:
> Skip Greylist for ZAF01-CT2-obe.outbound.protection.outlook.com
That's a new hostname!
> Nov 17 07:56:08 xxx01 mimedefang.pl[10411]: 1AH5sPfg019398: SPF
> Result:
> Whitelisted SPF Cache 104.47.19.175 <aaronb at email.domain>
> ZAF01-CT2-obe.outbound.protection.outlook.com
??? The IP and hostname are inconsistent here. What is the line in
mimedefang-filter that it logging this?
I suspect that you have some custom code in mimedefang-filter for SPF
that is doing something slightly wrong and remembering a queueID for too
long. No idea HOW it's managing that.
> Nov 17 07:56:09 xxx01 sendmail[21136]: 1AH5u8x0021136: Milter:
> to=<ne at email.domain>, reject=451 4.3.0 could not
> chdir(/var/spool/MIMEDefang/mdefang-1AH5sPfg019398): No such file or
> directory
Here we have a new sendmail process with a new queueID and apparently a
confusion inside MD, which is passing back a message referencing the
working directory for the old, quite reasonably no longer existing,
queueID.
And another gap here. Only 14s but that's still oddly long. What
happened here?
> Nov 17 07:56:23 xxx01 mimedefang.pl[10411]: 1AH5sPfg019398: TEMPFAIL
> Nov 17 07:56:23 xxx01 mimedefang.pl[10411]: 1AH5sPfg019398: NOTICE:
> All
> Filter Recipient tests passed
Very weird. Whatever in your mimedefang-filter logged this thought it
was still working inside filter_recipient() with 1AH5sPfg019398, 1:58
after Sendmail itself finished working with that queueID, having gotten
to the point of DATA before the client disappeared. And it seems
confused about 'passing' whatever tests it is running in
filter_recipient().
> Any suggestions?
1. Look for the log entries in the time gaps and before the first one
above for any hints of what actually happened.
2. Look at your mimedefang-filter where it is logging SPF results and
figure out what it's doing with IPs and hostnames. It's getting
something wrong in there...
3. Figure out why MD is remembering an old message's working directory
almost 2 minutes after Sendmail believes it both (a) was told by a
milter that the sending IP is bad and (b) lost the connection
unceremoniously after DATA.
I don't have a coherent theory of what happened here, but I suspect that
it is hiding in the missing 1:43 and maybe the additional 0:14 later,
and possibly in whatever was logged BEFORE the first line above.
--
Bill Cole
bill at scconsult.com or billcole at apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
More information about the MIMEDefang
mailing list