[Mimedefang] Re: sudden "Too many open files" errors

Juergen Georgi georgi at belwue.de
Fri Apr 20 10:41:17 EDT 2007


On Apr 18, 10:56, Juergen Georgi wrote:
> Subject: [Mimedefang] sudden "Too many open files" errors
> 
> yesterday about noon, MIMEDefang started to complain about 
> "Too many open files". The log shows error messages like:
> 
> Apr 18 09:26:14 smtp5.BelWue.DE mimedefang[24437]: [ID 384225 local0.warning] l3I7Q3Kf006561: Could not open /var/run/MIMEDefang/mdefang-l3I7Q3Kf006561/HEADERS: Too many open files
> Apr 18 09:26:14 smtp5.BelWue.DE mimedefang[24437]: [ID 384225 local0.warning] l3I7Q3Kf006561: Could not open /var/run/MIMEDefang/mdefang-l3I7Q3Kf006561/COMMANDS: Too many open files
> Apr 18 09:26:14 smtp5.BelWue.DE mimedefang[24437]: [ID 934268 local0.warning] opendir(/var/run/MIMEDefang/mdefang-l3I7Q3Kf006561) failed: Too many open files
> Apr 18 09:26:14 smtp5.BelWue.DE mimedefang[24437]: [ID 828695 local0.error] l3I7Q3Kf006561: failed to clean up /var/run/MIMEDefang/mdefang-l3I7Q3Kf006561: Too many open files
> 
> In /var/run/MIMEDefang I find tons (> 10000) of left-over 
> mdefang-<qid> directories. lsof reports that mimedefang[24437] 
> has nearly 1000 files open (fd limit is 1024).
> 
> I looked for those open files: 385 open connections on mimedefang.sock, 
> and 606 files in mdefang-<qid> directories. Here is an example:
> 
> # date
> Wed Apr 18 09:56:53 MEST 2007
> 
> # ls -l /var/run/MIMEDefang/mdefang-l3I7IKjT002406/*
> -rw-r-----   1 defang   defang       472 Apr 18 09:18 /var/run/MIMEDefang/mdefang-l3I7IKjT002406/COMMANDS
> -rw-r-----   1 defang   defang         0 Apr 18 09:18 /var/run/MIMEDefang/mdefang-l3I7IKjT002406/HEADERS

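The breakdown quoted above (socket connections versus work directory files) can be reproduced from lsof's field output. What follows is only a rough sketch: it assumes lsof is run with the -Fn option, which prints each open file name on its own line prefixed with "n", and it assumes the socket shows up under a name containing "mimedefang.sock". The function names are made up for illustration.

```python
import re
from collections import Counter

# Spool directory as it appears in the log lines quoted above.
SPOOL_DIR = "/var/run/MIMEDefang"

# Matches files inside a per-message work directory, e.g. .../mdefang-<qid>/HEADERS
WORK_FILE = re.compile(re.escape(SPOOL_DIR) + r"/mdefang-[^/]+/")


def names_from_lsof_fields(text):
    """Return the file names from `lsof -Fn` style output ('n'-prefixed lines)."""
    return [line[1:] for line in text.splitlines() if line.startswith("n")]


def classify_open_files(names):
    """Count open file names by rough category."""
    counts = Counter()
    for name in names:
        if "mimedefang.sock" in name:
            # Assumption: the socket is reported under this name.
            counts["milter socket"] += 1
        elif WORK_FILE.match(name):
            counts["work directory file"] += 1
        else:
            counts["other"] += 1
    return counts
```

Feeding it the output of something like "lsof -p <pid> -Fn" for the mimedefang process should give the same kind of totals as the 385 and 606 mentioned above, provided the names are reported the way the sketch assumes.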

> So why is mimedefang keeping these files open for more than 
> half an hour? The strange thing is: I see no trace of queue 
> id l3I7IKjT002406 in sendmail's or md's log. 

I did not wait long enough. l3I7IKjT002406 appeared exactly
1 hour later:

Apr 18 10:18:31 smtp5.BelWue.DE sm-mta[2406]: [ID 801593 mail.crit] l3I7IKjT002406: SYSERR(root): collect: read timeout on connection from 62.43.196.178.static.user.ono.com, from=<Maks at antra-ag.de>
Apr 18 10:18:31 smtp5.BelWue.DE sm-mta[2406]: [ID 801593 mail.info] l3I7IKjT002406: from=<Maks at antra-ag.de>, size=2851, class=0, nrcpts=1, proto=ESMTP, daemon=MTA-v4, relay=62.43.196.178.static.user.ono.com [62.43.196.178]

I was seeing timeouts in the SMTP DATA phase. The problem
stopped on all three of our gateways yesterday around 4 p.m.
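To see how widespread such timeouts are, the "collect: read timeout" lines can be pulled out of the mail log. This is a minimal sketch that assumes the log lines look like the sm-mta entry shown above (with or without the "[ID ...]" tag); the function names are hypothetical.

```python
import re
from collections import Counter

# Pattern modelled on the sm-mta log line shown above; the "[ID ...]" tag
# is treated as optional since not every syslog adds it.
READ_TIMEOUT = re.compile(
    r"sm-mta\[\d+\]: (?:\[ID [^\]]+\] )?(?P<qid>\w+): SYSERR\(\w+\): "
    r"collect: read timeout on connection from (?P<relay>[^,\s]+)"
)


def read_timeouts(lines):
    """Yield (queue_id, relay) for every DATA-phase read timeout logged."""
    for line in lines:
        match = READ_TIMEOUT.search(line)
        if match:
            yield match.group("qid"), match.group("relay")


def timeouts_per_relay(lines):
    """Count read timeouts per connecting relay."""
    return Counter(relay for _qid, relay in read_timeouts(lines))
```

Passing an open log file to timeouts_per_relay() gives a quick picture of whether the timeouts come from a few relays or from everywhere, which may help to tell a single misbehaving sender from a general network problem.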

I will lower sendmail's Timeout.datablock value. With the default of
1h, every stalled connection keeps its milter connection and work
files open for up to an hour, which can lead to a denial-of-service
situation. I still have no idea what caused these excessive timeouts.
It looks like a network problem, although our network engineers found
no evidence of one.
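A back-of-the-envelope estimate shows why the timeout matters so much. The sketch below assumes a steady rate of stalled connections and that each one pins three descriptors in mimedefang (one socket plus COMMANDS and HEADERS). Both numbers are assumptions for illustration, not measurements. In an m4-based setup the timeout itself is normally set with the confTO_DATABLOCK define.

```python
def stalled_fds(stalls_per_minute, timeout_minutes, fds_per_stall):
    """Descriptors held in steady state: arrival rate x hold time x fds each."""
    return stalls_per_minute * timeout_minutes * fds_per_stall


FD_LIMIT = 1024        # fd limit reported earlier in this thread
FDS_PER_STALL = 3      # assumption: one socket plus COMMANDS and HEADERS
STALLS_PER_MINUTE = 6  # assumption: hypothetical rate of stalled connections

for timeout_minutes in (60, 5):
    used = stalled_fds(STALLS_PER_MINUTE, timeout_minutes, FDS_PER_STALL)
    status = "exceeds" if used > FD_LIMIT else "stays below"
    print(f"timeout {timeout_minutes:>2} min: {used} fds, {status} the limit of {FD_LIMIT}")
```

Under these assumptions a 1h timeout pins 1080 descriptors, more than the limit, while a 5 minute timeout pins only 90.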

Best regards,

-Juergen


