[Mimedefang] MIME::Entity not handling Charset => 'utf-8' correctly?
Philip Prindeville
philipp_subx at redfish-solutions.com
Wed Feb 20 19:20:25 EST 2013
Hi.
I'm trying to generate a message as a footer in mimedefang-filter (in filter_end()) when I see certain message contents, but I'm running into what looks like a bug. I've reproduced it here:
[philipp]$ cat test.pl
#!/usr/bin/perl -w
use strict;
use warnings;
use MIME::Entity;
use MIME::QuotedPrint;
use HTML::Entities;
my $string = decode_qp("Ellipsis=E2=80=A6\n");
utf8::upgrade($string);
print "string: ", $string;
print "hex: ", unpack('H*', $string), "\n";
my $msg = encode_entities($string, '"<>&');
my @strings = (
    "<html>\n",
    $msg,
    "</html>\n"
);
my $html = MIME::Entity->build(
    Top      => 0,
    Type     => 'text/html',
    Encoding => 'quoted-printable',
    Charset  => 'utf-8',
    Data     => [ @strings ],
);
print $html->as_string(), "\n";
exit 0;
[philipp]$ ./test.pl
string: Ellipsis…
hex: 456c6c6970736973e280a60a
Content-Type: text/html; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
<html>
Ellipsis=C3=A2=C2=80=C2=A6
</html>
[philipp]$
From what I can tell, if I run Data::Dumper on $html->bodyhandle()->{'MBS_Data'}, it looks like the three UTF-8 bytes (0xe280a6) have been converted into the separate characters \x{e2}, \x{80}, \x{a6} instead, which I don't understand, since I've explicitly called out the Charset as 'utf-8'. It looks like the string is being interpreted as Latin-1, not UTF-8.
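To illustrate the Latin-1 reading suspected above, here is a minimal standalone sketch using only core modules. It compares utf8::upgrade() with Encode::decode() on the same three octets. The variable names ($octets, $upgraded, $chars) are made up for this example, and whether MIME::Entity or its quoted-printable encoder takes the same path internally is an assumption, not something confirmed here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);
use MIME::QuotedPrint qw(decode_qp encode_qp);

# decode_qp() returns raw octets: e2 80 a6 is the UTF-8 encoding of U+2026.
my $octets = decode_qp("=E2=80=A6");

# utf8::upgrade() only changes the internal representation. Each octet
# stays one character (U+00E2, U+0080, U+00A6), i.e. a Latin-1 reading.
my $upgraded = $octets;
utf8::upgrade($upgraded);

# Encode::decode() instead interprets the octets as UTF-8 text.
my $chars = decode('UTF-8', $octets);

# Show how many characters each string holds and what it becomes
# when serialized as UTF-8 octets.
printf "upgraded: %d chars, as UTF-8 octets: %s\n",
    length($upgraded), unpack('H*', encode('UTF-8', $upgraded));
printf "decoded:  %d chars, as UTF-8 octets: %s\n",
    length($chars), unpack('H*', encode('UTF-8', $chars));

# Quoted-printable form of the upgraded string's UTF-8 octets
# (empty $eol so that no soft line breaks are added).
print encode_qp(encode('UTF-8', $upgraded), ''), "\n";
```

If this matches what is going on, the upgraded string serializes to the same c3 a2 c2 80 c2 a6 sequence that appears in the message body, while the decoded string round-trips to e2 80 a6. That would point at the utf8::upgrade() call in test.pl rather than at the Charset parameter.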
What am I doing wrong?
Thanks,
-Philip