
I guess that makes sense. I was kind of assuming there would be some method that takes a MIME-encoded ASCII string, possibly with different components in different encodings, and returns a normal Perl string, doing whatever conversions are needed. Based on earlier experiments, it didn't look like MIME::WordDecoder actually handles UTF-8 correctly, though I suppose I could try again. Update: Indeed it does not: calling unmime on the string from the OP returns a bunch of question marks.

Re^3: reading Base64 encoded Unicode email returns mangled text
by ikegami (Patriarch) on Mar 19, 2007 at 17:13 UTC
    The following should work, no matter what encoding was used by the source string.
    MIME::WordDecoder->default(MIME::WordDecoder->supported("UTF-8"));
    $subj = decode('UTF-8', unmime($subj));

    Update: ugh! Don't use MIME::WordDecoder. After looking at its guts, I'd recommend the snippet I presented earlier (packaged as a reusable function below). I think this toolkit predates the addition of Unicode support to Perl. That would explain the weird and convoluted interface.

    use MIME::Words qw( decode_mimewords );
    use Encode qw( decode );

    sub mime_decode {
        my $decoded = '';
        foreach (decode_mimewords($_[0])) {
            my ($data, $charset) = @$_;
            if (defined($charset)) {
                $decoded .= decode($charset, $data);
            } else {
                $decoded .= $data;
            }
        }
        return $decoded;
    }

    $subj = mime_decode($subj);

    Untested
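Since the function above is untested, here is a minimal sketch of one way to check it. The sample header is a made-up value (the word "café" as a Base64 encoded-word in UTF-8), not the string from the original post, and the function is repeated only so the script runs on its own. It assumes MIME::Words (from MIME-tools) is installed.

```perl
use strict;
use warnings;
use MIME::Words qw( decode_mimewords );
use Encode qw( decode );

# Same logic as mime_decode above: decode each chunk with its own charset.
sub mime_decode {
    my $decoded = '';
    foreach my $word (decode_mimewords($_[0])) {
        my ($data, $charset) = @$word;
        # Chunks without a charset are ordinary text; append them as-is.
        $decoded .= defined($charset) ? decode($charset, $data) : $data;
    }
    return $decoded;
}

# Hypothetical sample header: "café" as a Base64 encoded-word in UTF-8.
my $subj = mime_decode('=?UTF-8?B?Y2Fmw6k=?=');

# A decoded character string has 4 characters here;
# undecoded UTF-8 bytes would report 5.
binmode STDOUT, ':encoding(UTF-8)';
printf "%s (%d characters)\n", $subj, length $subj;
```

If the reported length matches the number of characters rather than the number of bytes, the result is a proper Perl character string. Any remaining question marks at that point would more likely come from the output layer than from the decoding.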