[Mimedefang] Corrupted PDF files and a possible solution

Steve Whitehouse steve.whitehouse at blackspider.com
Mon Jun 2 11:25:01 EDT 2003


Hello,

I'm running Mimedefang 3.31 and have seen PDF files sent from certain
versions of Exchange Server corrupted when passing through Mimedefang. The
problem is related to end-of-line separator mangling. I think I've worked
out what's happening and have applied a local fix, which I'm testing at the
moment.

The fix aims to preserve the integrity of files encoded using
Quoted-Printable when the original files contain embedded carriage returns,
<CR>, or line feeds, <LF>. This fix seems to work for me and I'd welcome the
opinion of others, both on the analysis and the proposed fix. So far the fix
does not appear to have introduced any side-effects, but if anyone can spot
any I'd love to know.

Regards,
Steve

Problem Analysis
----------------

1. The Original PDF has something like the following sequence within it:

A<LF>B<CR><LF>

2. The sender's Exchange server is encoding PDFs using quoted-printable
encoding. This appears to be the default for Exchange 5 and 5.5. Exchange
can be patched to send attachments using Base64, but I would like to avoid
this requirement if possible. (It also looks like Exchange 2000 encodes
these PDFs using Base64, which is more sensible.)
Within the MIME-encoded message the original sequence becomes:

A=0AB<CR><LF>   

The embedded <LF> has been encoded as =0A

3. When the message is received, mimedefang (the binary) strips the <CR>s
from the body of the message received from Sendmail.

We now have:

A=0AB<LF>

4. Within mimedefang.pl, MIME::Tools decodes the message and the PDF file's
Quoted-Printable encoding is decoded (using MIME::QuotedPrint).

We now have:

A<LF>B<LF>

5. At this point, the PDF appears to be invalid. It cannot be read using
Acrobat on Linux. I suspect the <CR>s are significant.

6. When the message is re-built by Mimedefang (MIME::Tools etc.) the PDF is
re-encoded using MIME::QuotedPrint. Now the embedded line feeds from the
original PDF are treated as end-of-line markers and are not encoded.
Sendmail (I guess) inserts the required <CR>s for valid SMTP when the
message body is resubmitted. The encoded PDF file within the MIME message
now has the sequence:

A<CR><LF>B<CR><LF>

7. When the resultant message is received and the attachment opened on a
Windows PC, the PDF file is corrupt. The original sequence, A<LF>B<CR><LF>,
has become A<CR><LF>B<CR><LF>.
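
The steps above can be modelled in a few lines of Python. This is only a
sketch under stated assumptions: the helper names (qp_decode, qp_encode_text,
simulate) are made up for illustration, the encoder and decoder handle just
the =XX escapes needed here (no soft line breaks or trailing-whitespace
rules), and the Exchange output is taken as given in step 2 rather than
generated.

```python
import re

def qp_decode(data: bytes) -> bytes:
    # Decode =XX escapes only; soft line breaks are ignored in this sketch.
    return re.sub(rb"=([0-9A-F]{2})", lambda m: bytes([int(m.group(1), 16)]), data)

def qp_encode_text(data: bytes) -> bytes:
    # Text-mode style encoder: <LF> counts as a line ending and is left alone.
    safe = set(b" \t\n") | (set(range(33, 127)) - {ord("=")})
    return b"".join(bytes([b]) if b in safe else b"=%02X" % b for b in data)

def simulate(wire: bytes) -> bytes:
    stripped = wire.replace(b"\r", b"")         # step 3: mimedefang strips <CR>s
    decoded = qp_decode(stripped)               # step 4: MIME::Tools decodes
    reencoded = qp_encode_text(decoded)         # step 6: re-encode, <LF> left bare
    resent = reencoded.replace(b"\n", b"\r\n")  # step 6: <CR>s re-inserted for SMTP
    return qp_decode(resent)                    # step 7: recipient decodes

original = b"A\nB\r\n"        # step 1: bytes in the PDF
from_exchange = b"A=0AB\r\n"  # step 2: as stated in the analysis
result = simulate(from_exchange)
print(original, "->", result)
```

Under these assumptions the result is A<CR><LF>B<CR><LF>, which no longer
matches the original bytes.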

The Fix
-------

1. Inhibit mimedefang from stripping out <CR>s in the body of the message it
   receives from Sendmail.
   Comment out the 'continue' statement at ~line 954 in mimedefang.c,
   function body().

2. Modify the encoding function in MIME::QuotedPrint, so that any <CR> or
   <LF> that is not part of a <CR><LF> end-of-line marker is encoded:

   In /usr/lib/perl5/5.8.0/i386-linux-thread-multi/MIME/QuotedPrint.pm,
   subroutine encode_qp()

   Change:
     $res =~ s/([^ \t\n!"#\$%&'()*+,\-.\/0-9:;<>?\@A-Z[\\\]^_`a-z{|}~])/sprintf("=%02X", ord($1))/eg;  # rule #2,#3

   To:
     $res =~ s/(\r(?!\n)|(?<!\r)\n|[^ \t\r\n!"#\$%&'()*+,\-.\/0-9:;<>?\@A-Z[\\\]^_`a-z{|}~])/sprintf("=%02X", ord($1))/eg;
                ^^^^^^^^^^^^^^^^^^^     ^^
                       Added           Added
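
To check what the modified pattern matches, here is a rough Python port of
the two substitutions. It is a sketch, not the Perl module itself: the names
ORIGINAL, PATCHED and encode are hypothetical, the character class is
abbreviated to the equivalent byte ranges (printable ASCII except "="), and
only the escaping step is modelled, not the rest of encode_qp().

```python
import re

# Bytes left unencoded: space, tab, <LF>, and printable ASCII except "=".
ORIGINAL = re.compile(rb"[^ \t\n!-<>-~]")

# Patched: also encode a <CR> not followed by <LF>, and an <LF> not
# preceded by <CR>; a <CR><LF> pair is left untouched.
PATCHED = re.compile(rb"\r(?!\n)|(?<!\r)\n|[^ \t\r\n!-<>-~]")

def encode(data: bytes, pattern: re.Pattern) -> bytes:
    # Replace every matched byte with its =XX escape.
    return pattern.sub(lambda m: b"=%02X" % m.group(0)[0], data)

pdf_bytes = b"A\nB\r\n"  # the sequence from the analysis, with <CR> kept
print(encode(pdf_bytes, PATCHED))
print(encode(b"X\rY", PATCHED))
```

With the <CR>s no longer stripped (step 1 of the fix), the patched pattern
turns A<LF>B<CR><LF> into A=0AB<CR><LF>, the same form Exchange sent, so
the lone <LF> should survive the re-encoding.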







