[Mimedefang] Re: sudden "Too many open files" errors
Juergen Georgi
georgi at belwue.de
Fri Apr 20 10:41:17 EDT 2007
On Apr 18, 10:56, Juergen Georgi wrote:
> Subject: [Mimedefang] sudden "Too many open files" errors
>
> yesterday about noon, MIMEDefang started to complain about
> "Too many open files". The log shows error messages like:
>
> Apr 18 09:26:14 smtp5.BelWue.DE mimedefang[24437]: [ID 384225 local0.warning] l3I7Q3Kf006561: Could not open /var/run/MIMEDefang/mdefang-l3I7Q3Kf006561/HEADERS: Too many open files
> Apr 18 09:26:14 smtp5.BelWue.DE mimedefang[24437]: [ID 384225 local0.warning] l3I7Q3Kf006561: Could not open /var/run/MIMEDefang/mdefang-l3I7Q3Kf006561/COMMANDS: Too many open files
> Apr 18 09:26:14 smtp5.BelWue.DE mimedefang[24437]: [ID 934268 local0.warning] opendir(/var/run/MIMEDefang/mdefang-l3I7Q3Kf006561) failed: Too many open files
> Apr 18 09:26:14 smtp5.BelWue.DE mimedefang[24437]: [ID 828695 local0.error] l3I7Q3Kf006561: failed to clean up /var/run/MIMEDefang/mdefang-l3I7Q3Kf006561: Too many open files
>
> In /var/run/MIMEDefang I find tons (> 10000) of left-over
> mdefang-<qid> directories. lsof reports that mimedefang[24437]
> has nearly 1000 files open (fd limit is 1024).
>
> I looked for those open files: 385 open connections on mimedefang.sock,
> and 606 files in mdefang-<qid> directories. Here is an example:
>
> # date
> Wed Apr 18 09:56:53 MEST 2007
>
> # ls -l /var/run/MIMEDefang/mdefang-l3I7IKjT002406/*
> -rw-r----- 1 defang defang 472 Apr 18 09:18 /var/run/MIMEDefang/mdefang-l3I7IKjT002406/COMMANDS
> -rw-r----- 1 defang defang 0 Apr 18 09:18 /var/run/MIMEDefang/mdefang-l3I7IKjT002406/HEADERS
>
> So why is mimedefang keeping these files open for more than
> half an hour? The strange thing is: I see no trace of queue
> id l3I7IKjT002406 in sendmail's or md's log.

I did not wait long enough. l3I7IKjT002406 appeared exactly
1 hour later:
Apr 18 10:18:31 smtp5.BelWue.DE sm-mta[2406]: [ID 801593 mail.crit] l3I7IKjT002406: SYSERR(root): collect: read timeout on connection from 62.43.196.178.static.user.ono.com, from=<Maks at antra-ag.de>
Apr 18 10:18:31 smtp5.BelWue.DE sm-mta[2406]: [ID 801593 mail.info] l3I7IKjT002406: from=<Maks at antra-ag.de>, size=2851, class=0, nrcpts=1, proto=ESMTP, daemon=MTA-v4, relay=62.43.196.178.static.user.ono.com [62.43.196.178]
I was seeing timeouts in the SMTP DATA phase. The problem
stopped on all three of our gateways yesterday around 4 p.m.
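
A rough way to check a log for the same pattern is to count the
"collect: read timeout" entries per hour. The sketch below assumes the
syslog layout of the lines above (month, day and HH:MM:SS as the first
three fields). The function name and the log path in the usage comment
are placeholders, not taken from the setup described here.

```shell
# Count sendmail "collect: read timeout" entries per hour.
# Reads syslog lines on stdin; assumes fields 1-3 are month, day, HH:MM:SS.
count_read_timeouts() {
    awk '/collect: read timeout/ {
             split($3, t, ":")                  # t[1] is the hour
             n[$1 " " $2 " " t[1] ":00"]++      # key: "Apr 18 10:00"
         }
         END { for (k in n) print n[k], k }' |
    sort -k2,2 -k3,3n -k4,4                     # month (as text), day, hour
}

# usage (the path is an assumption):
#   count_read_timeouts < /var/log/syslog
```

Each output line is a count followed by the hour it belongs to. A sudden
jump in the counts would mark the start of the timeouts. Months are
sorted as plain text, which is good enough for a log covering a few days.
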
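I will lower sendmail's Timeout.datablock value. With the default of
1h, every stalled connection keeps its milter connection and its
working files open for a full hour, which can exhaust the file
descriptor limit and lead to a denial-of-service situation. I still
have no idea what caused these excessive timeouts. It looks like a
network problem, although our network engineers found no evidence of
such a situation.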
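
Before changing the value it may help to see what is currently in
effect. The sketch below prints the Timeout.datablock setting from a
sendmail.cf read on standard input. The function name, the file path and
the 10m value in the comments are assumptions for illustration only, not
a recommendation.

```shell
# Print the active Timeout.datablock setting from a sendmail.cf on stdin.
# Prints "default" if the option is unset or only present as a comment
# (in which case sendmail falls back to its built-in 1h).
show_datablock_timeout() {
    awk -F= '/^O[ \t]+Timeout\.datablock=/ { v = $2 }
             END { print (v == "" ? "default" : v) }'
}

# usage (the path is an assumption):
#   show_datablock_timeout < /etc/mail/sendmail.cf
#
# To lower the value via sendmail.mc, the m4 option is confTO_DATABLOCK,
# e.g. (10m is only an example value):
#   define(`confTO_DATABLOCK', `10m')dnl
```

After rebuilding sendmail.cf and restarting sendmail, the function
should report the new value instead of "default".
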
Best regards,
-Juergen