[Mimedefang] curly braces

Hiroki Mori hiroki at klab.jp
Wed Aug 28 05:13:01 EDT 2002


Hi all,

  I have been running MIMEDefang since last month with the minimum
filter, and have found that a lot of legitimate mail with Word documents
attached is being quarantined because of the attachment filenames.
  Attachments with non-ASCII filenames are very common in Japan, and
occasionally curly braces ("{" and "}") show up in the encoded
filename. For example, the headers look like this:

Content-Type: application/msword
Content-Disposition: attachment;
	filename*=shift_jis''%8C%B4%97%9D%90%7D%2Edoc

Actually, the filename consists of three wide characters followed by
".doc", but the filter does not know about the charset and simply
rejects the attachment because the name contains %7D, namely "}".
  This happens because the "shift_jis" encoding reuses part of the
US-ASCII byte range for the second byte of a double-byte character.
The situation gets worse if an MUA encodes a non-ASCII filename with
ISO 2022, where all characters are represented by 7-bit units.
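
A minimal sketch of the Shift_JIS case, written in Python purely for illustration (the filter itself is Perl, and the variable names here are made up). It percent-decodes the value from the first Content-Disposition example above and compares a byte-level check for "}" with a check on the name after decoding it with the declared charset:

```python
from urllib.parse import unquote_to_bytes

# Encoded value copied from the first Content-Disposition example.
encoded = "%8C%B4%97%9D%90%7D%2Edoc"

# Undo the percent-encoding, then decode with the declared charset.
raw = unquote_to_bytes(encoded)
name = raw.decode("shift_jis")

# A charset-unaware check sees the byte 0x7D ("}") in the raw data ...
has_curly_in_bytes = b"}" in raw
# ... but once the bytes are decoded, no "}" character is left.
has_curly_in_name = "}" in name

print(len(name), has_curly_in_bytes, has_curly_in_name)
```

The 0x7D byte is the second half of the third double-byte character (0x90 0x7D), so it never corresponds to a real "}" in the filename.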

Content-Disposition: attachment;
	filename*=iso-2022-jp''%1B$B86M%7D%3F^%1B%28B%2Edoc

The example above represents a filename semantically identical to the
previous one. Unfortunately, this filename cannot escape from "}"
in either encoding. :-(
  Theoretically, encodings such as EUC or Unicode would be safe. But
in practice, most MUAs use the problematic encodings for
attachments.
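
The comparison between encodings can be checked with a short Python sketch (again only an illustration with made-up variable names, not part of MIMEDefang). It confirms that the two header examples above decode to the same name, then re-encodes that name in several charsets and records whether the byte 0x7D appears. UTF-8 stands in for "Unicode" here, which is an assumption about what an MUA would actually send:

```python
from urllib.parse import unquote_to_bytes

# The two encoded values from the header examples.
sjis_bytes = unquote_to_bytes("%8C%B4%97%9D%90%7D%2Edoc")
iso_bytes = unquote_to_bytes("%1B$B86M%7D%3F^%1B%28B%2Edoc")

name = sjis_bytes.decode("shift_jis")
# Both examples should decode to the same filename.
same_name = iso_bytes.decode("iso2022_jp") == name

# For each charset, does the encoded form of the name contain 0x7D ("}")?
curly_by_charset = {
    charset: b"}" in name.encode(charset)
    for charset in ("shift_jis", "iso2022_jp", "euc_jp", "utf-8")
}

print(same_name)
for charset, has_curly in curly_by_charset.items():
    print(charset, has_curly)
```

For this particular name, only the Shift_JIS and ISO-2022-JP forms contain the offending byte. EUC-JP and UTF-8 encode every non-ASCII character with high-bit bytes only, so they cannot produce a stray "}".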

  In the end I decided to modify mimedefang-filter so that it no longer
rejects curly braces and only checks for bad extensions:

    # Do not allow:
    # - bad extensions (possibly with trailing dots) at end or
    #   followed by non-alphanum
    $re = '(\.' . $bad_exts . ')\.*([^-A-Za-z0-9_.]|$)';

but I suspect there might be a better solution.
  Any comments and suggestions are welcome.
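
To show what the modified pattern accepts and rejects, here is a small sketch that rebuilds the same regular expression in Python (illustration only; the filter itself is Perl). The extension list is a hypothetical stand-in for the real $bad_exts, and the helper name is_bad_filename is made up. Matching is case-sensitive here, which may differ from how the filter applies $re:

```python
import re

# Hypothetical short list; the real $bad_exts in mimedefang-filter is longer.
bad_exts = "(exe|scr|vbs)"

# Same shape as the modified $re: a bad extension, optional trailing dots,
# then either a character outside [-A-Za-z0-9_.] or the end of the name.
pattern = re.compile(r"(\." + bad_exts + r")\.*([^-A-Za-z0-9_.]|$)")


def is_bad_filename(name):
    """Return True if the name matches the bad-extension pattern."""
    return pattern.search(name) is not None


# Curly braces alone no longer trigger a match ...
print(is_bad_filename("{report}.doc"))
# ... but bad extensions still do, with or without trailing dots.
print(is_bad_filename("report.exe"))
print(is_bad_filename("report.exe..."))
print(is_bad_filename("a}.exe"))
# A bad extension followed by another alphanumeric extension does not match.
print(is_bad_filename("report.exe.txt"))
```

With the curly-brace alternatives gone, a filename is judged only by its extension, so the Japanese filenames from the examples pass as long as they end in something like ".doc".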

Hiroki Mori (hiroki at klab.jp)
Utsunomiya University


