[Mimedefang] curly braces
    Hiroki Mori 
    hiroki at klab.jp
       
    Wed Aug 28 05:13:01 EDT 2002
    
    
  
Hi all,
  I have been running MIMEDefang since last month with the minimal
filter, and found that a lot of legitimate mail with Word documents
attached has been quarantined because of the attachment filenames.
  Attachments with non-ASCII filenames are very common in Japan, and
occasionally curly braces ("{" and "}") appear in the encoded
filename. For example, the headers would look like this:
Content-Type: application/msword
Content-Disposition: attachment;
	filename*=shift_jis''%8C%B4%97%9D%90%7D%2Edoc
Actually, the filename consists of three wide characters followed by
".doc", but the filter does not know about the charset and simply
rejects the name because it contains %7D, namely "}".
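
To make the byte-level issue concrete, here is a minimal sketch in Python (used only for illustration; MIMEDefang filters themselves are Perl, and the helper name `decode_ext_value` is hypothetical). It splits the extended parameter value from the header above into charset, language, and percent-encoded data, then compares the raw bytes with the text obtained by honouring the declared charset.

```python
from urllib.parse import unquote_to_bytes

def decode_ext_value(value):
    """Split an extended value of the form charset'language'data and decode it.

    Hypothetical helper for illustration; returns (raw_bytes, decoded_text).
    """
    charset, _language, data = value.split("'", 2)
    raw = unquote_to_bytes(data)      # undo the %XX escapes
    return raw, raw.decode(charset)   # interpret the bytes in the declared charset

raw, name = decode_ext_value("shift_jis''%8C%B4%97%9D%90%7D%2Edoc")

print(b"}" in raw)    # True: the byte 0x7D is present in the raw data
print("}" in name)    # False: the decoded filename has no "}" character
print(len(name))      # 7: three wide characters plus ".doc"
```

A check that looks only at the percent-decoded bytes sees a "}" that does not exist in the filename the user actually typed.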
  This happens because the second byte of a double-byte character in
the "shift_jis" encoding can fall within the US-ASCII range. The
situation gets worse if an MUA encodes a non-ASCII filename with
ISO 2022, where all characters are represented in 7-bit units:
Content-Disposition: attachment;
	filename*=iso-2022-jp''%1B$B86M%7D%3F^%1B%28B%2Edoc
The example above represents a filename semantically identical to the
previous one. Unfortunately, the filename cannot avoid "}" in either
encoding. :-(
  Theoretically, encodings such as EUC or Unicode would be safe. But
in practice, most MUAs use the problematic encodings for
attachments.
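
The difference between the encodings can be checked with a short Python sketch (again illustrative only, not part of the filter). It takes the filename bytes from the Shift_JIS example in this message, re-encodes the name in four charsets, and reports whether any encoded byte is a curly brace. EUC-JP and UTF-8 are taken here as the concrete instances of "EUC" and "Unicode", which is an assumption; the helper name `has_curly_byte` is hypothetical.

```python
# Filename bytes taken from the Shift_JIS example in this message.
name = bytes.fromhex("8cb4979d907d2e646f63").decode("shift_jis")

def has_curly_byte(text, charset):
    """Return True if the encoded form contains the byte for '{' or '}'."""
    return any(b in b"{}" for b in text.encode(charset))

results = {}
for charset in ("shift_jis", "iso2022_jp", "euc_jp", "utf_8"):
    results[charset] = has_curly_byte(name, charset)
    print(f"{charset:12} curly byte present: {results[charset]}")
```

For this filename, the two encodings discussed above produce a 0x7D byte, while EUC-JP and UTF-8 do not, because their multi-byte sequences use only bytes with the high bit set.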
  Finally, I have decided to modify mimedefang-filter as follows:
    # Do not allow:
    # - bad extensions (possibly with trailing dots) at end or
    #   followed by non-alphanum
    $re = '(\.' . $bad_exts . ')\.*([^-A-Za-z0-9_.]|$)';
but I suspect there might be a better solution.
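
For anyone who wants to experiment with the pattern outside the filter, here is a rough Python transcription of it. The extension list is a short hypothetical stand-in for the real $bad_exts, the helper name `is_bad_filename` is hypothetical, and no matching flags are assumed because none are shown in this message.

```python
import re

# Hypothetical extension list; stands in for the real $bad_exts.
bad_exts = "(exe|scr|vbs)"

# Python transcription of the Perl pattern quoted in this message.
pattern = re.compile(r"(\." + bad_exts + r")\.*([^-A-Za-z0-9_.]|$)")

def is_bad_filename(name):
    """Return True if the pattern finds a bad extension in the name."""
    return pattern.search(name) is not None

samples = ("report.doc", "draft}.doc", "setup.exe",
           "setup.exe..", "setup.exe }", "setup.exe.txt")
for sample in samples:
    print(f"{sample!r:18} rejected: {is_bad_filename(sample)}")
```

With this pattern a stray "}" in a ".doc" name no longer triggers rejection, while "setup.exe", "setup.exe.." and "setup.exe }" are still caught. "setup.exe.txt" is not matched, since the bad extension is followed by a dot and an alphanumeric character.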
  Any comments and suggestions are welcome.
Hiroki Mori (hiroki at klab.jp)
Utsunomiya University
    
    