[Mimedefang] MIME::Entity not handling Charset => 'utf-8' correctly?

Philip Prindeville philipp_subx at redfish-solutions.com
Wed Feb 20 19:20:25 EST 2013


Hi.

I'm trying to generate a message as a footer in mimedefang-filter (in filter_end()) when I see certain message contents, but I'm running into what looks like a bug.  I've reproduced it here:

[philipp]$ cat test.pl
#!/usr/bin/perl -w

use strict;
use warnings;

use MIME::Entity;
use MIME::QuotedPrint;
use HTML::Entities;

my $string = decode_qp("Ellipsis=E2=80=A6\n");

utf8::upgrade($string);

print "string: ", $string;

print "hex: ", unpack('H*', $string), "\n";

my $msg = encode_entities($string, '"<>&');

my @strings = (
"<html>\n",
$msg,
"</html>\n"
);


my $html = MIME::Entity->build(
  Top => 0,
  Type => 'text/html',
  Encoding => 'quoted-printable',
  Charset => 'utf-8',
  Data => [ @strings ],
);

print $html->as_string(), "\n";

exit 0;
[philipp]$ ./test.pl
string: Ellipsis…
hex: 456c6c6970736973e280a60a
Content-Type: text/html; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

<html>
Ellipsis=C3=A2=C2=80=C2=A6
</html>

[philipp]$ 


From what I can tell, if I run Data::Dumper on $html->bodyhandle()->{'MBS_Data'}, it looks like the 3 UTF-8 bytes (0xe280a6) have been converted into the three separate characters \x{e2}, \x{80}, \x{a6} instead…  Which I don't understand, since I've explicitly called the Charset out as being 'utf-8'. It looks like the string is being interpreted as latin1, not utf8.
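
To check that suspicion outside of MIME::Entity, here is a minimal sketch (not taken from the filter itself) that compares two treatments of the same three bytes. It assumes the difference lies between utf8::upgrade(), which as far as I understand keeps each byte as its own Latin-1 code point, and Encode::decode('UTF-8', ...), which combines the bytes into one character. The helper names hex_after_upgrade and hex_after_decode are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Encode qw(decode encode);

# Hypothetical helper: upgrade a copy of the byte string, then serialize
# the resulting characters as UTF-8 and return the bytes in hex.
sub hex_after_upgrade {
    my ($bytes) = @_;
    my $copy = $bytes;
    utf8::upgrade($copy);    # changes the internal storage, not the characters
    return unpack('H*', encode('UTF-8', $copy));
}

# Hypothetical helper: decode a copy of the byte string as UTF-8 first,
# then serialize the resulting characters as UTF-8 and return hex.
sub hex_after_decode {
    my ($bytes) = @_;
    my $copy  = $bytes;
    my $chars = decode('UTF-8', $copy);    # bytes -> characters
    return unpack('H*', encode('UTF-8', $chars));
}

my $ellipsis = "\xE2\x80\xA6";    # the three bytes from the reproducer above

print "upgrade then encode: ", hex_after_upgrade($ellipsis), "\n";
print "decode then encode:  ", hex_after_decode($ellipsis), "\n";
```

If the first line prints c3a2c280c2a6, that matches the =C3=A2=C2=80=C2=A6 in the output above, which would suggest the doubling comes from how the string is prepared in the script rather than from the Charset parameter.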

What am I doing wrong?

Thanks,

-Philip



