[Mimedefang] curly braces
Hiroki Mori
hiroki at klab.jp
Wed Aug 28 05:13:01 EDT 2002
Hi all,
I have been running MIMEDefang since last month with the minimal filter,
and found that a lot of legitimate mail with Word documents attached has
been quarantined because of the attachment filenames.
Attachments with non-ASCII filenames are very common in Japan, and
occasionally the byte values of the curlies ("{" and "}") appear in the
encoded filename. For example, the headers might look like this:
    Content-Type: application/msword
    Content-Disposition: attachment;
        filename*=shift_jis''%8C%B4%97%9D%90%7D%2Edoc
In fact, the filename consists of three double-byte characters followed
by ".doc", but the filter knows nothing about the charset and simply
rejects the name because it contains %7D, that is, "}". Here 0x7D is
only the second byte of the third character.
This happens because the second byte of a double-byte character in the
"shift_jis" encoding can fall in the US-ASCII range. The situation gets
worse if an MUA encodes a non-ASCII filename with ISO-2022-JP, where
every character is represented by 7-bit units:
    Content-Disposition: attachment;
        filename*=iso-2022-jp''%1B$B86M%7D%3F^%1B%28B%2Edoc
The example above represents exactly the same filename as the previous
one. Unfortunately, this filename cannot avoid "}" in either
encoding. :-(
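The collision can be reproduced with a minimal Python sketch (Python is used here only for illustration; mimedefang-filter itself is Perl). It decodes the two RFC 2231 values from the examples above and compares a check on the raw bytes with a check on the decoded name. The helper name `decode_ext_value` is hypothetical and not part of MIMEDefang.

```python
from urllib.parse import unquote_to_bytes


def decode_ext_value(value):
    """Split an RFC 2231 extended value (charset'language'data) and decode it.

    Hypothetical helper for illustration; returns (raw bytes, decoded text).
    """
    charset, _language, data = value.split("'", 2)
    raw = unquote_to_bytes(data)      # undo the %XX escapes
    return raw, raw.decode(charset)   # interpret the bytes in the declared charset


sjis_raw, sjis_name = decode_ext_value("shift_jis''%8C%B4%97%9D%90%7D%2Edoc")
jis_raw, jis_name = decode_ext_value("iso-2022-jp''%1B$B86M%7D%3F^%1B%28B%2Edoc")

# Byte-level view: both encoded forms contain 0x7D, the ASCII code of "}".
print(b"}" in sjis_raw, b"}" in jis_raw)

# Character-level view: the decoded names are identical and contain no brace.
print(sjis_name == jis_name, "}" in sjis_name)
```

Both byte-level checks find a brace, while the decoded name has none, which is why a charset-unaware test on the encoded value produces false positives.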
In theory, encodings such as EUC or Unicode would be safe, but in
practice most MUAs use the problematic encodings for attachments.
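As a rough check of this claim, the following Python sketch lists which bytes of the same three-character name fall in the 7-bit ASCII range under each encoding. It assumes "EUC" means EUC-JP and "Unicode" means UTF-8 (UTF-16, for instance, would behave differently); the function name `ascii_range_bytes` is made up for this example.

```python
# The filename from the examples above, recovered from its Shift_JIS bytes.
name = bytes.fromhex("8CB4979D907D").decode("shift_jis")


def ascii_range_bytes(text, charset):
    """Return the bytes of `text` in `charset` that are below 0x80."""
    return bytes(b for b in text.encode(charset) if b < 0x80)


# Assumption: "EUC" is taken as EUC-JP and "Unicode" as UTF-8.
for charset in ("shift_jis", "iso2022_jp", "euc_jp", "utf-8"):
    print(charset, ascii_range_bytes(name, charset))
```

For EUC-JP and UTF-8 the result is empty, because every byte of a multi-byte character has its high bit set, so no byte can be mistaken for "{" or "}".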
In the end I decided to modify mimedefang-filter so that it no longer
rejects curlies:
    # Do not allow:
    # - bad extensions (possibly with trailing dots) at end or
    #   followed by non-alphanum
    $re = '(\.' . $bad_exts . ')\.*([^-A-Za-z0-9_.]|$)';
but I suspect there might be a better solution.
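To see what the modified pattern accepts and rejects, here is a small Python transliteration of it (the filter itself is Perl). The extension list is a shortened, hypothetical stand-in for the filter's real `$bad_exts`, and case-insensitive matching is an assumption of this sketch.

```python
import re

# Hypothetical, shortened extension list; the real filter defines its own $bad_exts.
bad_exts = "(exe|scr|vbs)"

# Same pattern as the modified $re above, written as a Python regex.
# Case-insensitive matching is an assumption of this sketch.
bad_name = re.compile(r"(\." + bad_exts + r")\.*([^-A-Za-z0-9_.]|$)", re.IGNORECASE)


def is_bad(filename):
    """Return True if the filename matches the bad-extension pattern."""
    return bad_name.search(filename) is not None


print(is_bad("%8C%B4%97%9D%90%7D%2Edoc"))  # encoded Word document from the example
print(is_bad("abc}.doc"))                  # a literal brace alone no longer matches
print(is_bad("report.exe"))                # bad extension at the end
print(is_bad("report.exe..}"))             # trailing dots, then a non-alphanumeric
print(is_bad("report.exe.txt"))            # bad extension followed by another one
```

With this pattern the first two names pass, the two names ending in a bad extension are caught, and "report.exe.txt" is not matched because the character after the dots is alphanumeric.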
Any comments and suggestions are welcome.
Hiroki Mori (hiroki at klab.jp)
Utsunomiya University