in reply to reading Base64 encoded Unicode email returns mangled text

The documentation warns against using decode_mimewords in scalar context. In fact, it recommends using unmime from MIME::WordDecoder instead of decode_mimewords.

If you were to use decode_mimewords, seems to me the proper usage would be:

    my $subj = '';
    foreach (decode_mimewords($mail->get('Subject'))) {
        my ($data, $charset) = @$_;
        if (defined($charset)) {
            $subj .= decode($charset, $data);
        } else {
            $subj .= $data;
        }
    }
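To illustrate why the decode step matters: the payload of an encoded word is just bytes until it is decoded with its charset. The following is a minimal sketch using only core modules. The sample payload 'w6k=' (the Base64 part of a hypothetical =?UTF-8?B?w6k=?= word) is my own example, not taken from the original mail.

```perl
use strict;
use warnings;
use MIME::Base64 qw( decode_base64 );
use Encode qw( decode );

# Hypothetical sample: Base64 payload of the encoded word =?UTF-8?B?w6k=?=
my $bytes = decode_base64('w6k=');      # raw UTF-8 bytes 0xC3 0xA9
my $chars = decode('UTF-8', $bytes);    # Perl character string (U+00E9)

# The byte string and the character string differ in length.
printf "%d byte(s), %d character(s)\n", length($bytes), length($chars);
```

Skipping the decode call leaves the two raw bytes in the string, which is the kind of thing that shows up as mangled text on output.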

Re^2: reading Base64 encoded Unicode email returns mangled text
by Errto (Vicar) on Mar 19, 2007 at 15:51 UTC
    I guess that makes sense. I was kind of assuming that there would be some method that would take a MIME-encoded ASCII string, with possibly different components in different encodings, and return a normal Perl string, with whatever conversions done as needed. Based on earlier experiments, it didn't look like MIME::WordDecoder actually handles UTF-8 correctly, though I suppose I could try again. Update: Indeed it does not: calling unmime on the string from the OP returns a bunch of question marks.
      The following should work, no matter what encoding was used by the source string.
      MIME::WordDecoder->default(MIME::WordDecoder->supported("UTF-8"));
      $subj = decode('UTF-8', unmime($subj));

      Update: Ugh! Don't use MIME::WordDecoder. After looking at its guts, I'd recommend the snippet I presented earlier (packaged as a reusable function below). I think this toolkit predates the addition of Unicode support to Perl. That would explain the weird and convoluted interface.

      use MIME::Words qw( decode_mimewords );
      use Encode qw( decode );

      sub mime_decode {
          my $decoded = '';
          foreach (decode_mimewords($_[0])) {
              my ($data, $charset) = @$_;
              if (defined($charset)) {
                  $decoded .= decode($charset, $data);
              } else {
                  $decoded .= $data;
              }
          }
          return $decoded;
      }

      $subj = mime_decode($subj);

      Untested
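For anyone who wants to try the function, here is a rough, self-contained sketch of how it might be exercised. It assumes MIME::Words (from MIME-tools) is installed, and the sample subject line is a hypothetical one I made up, not the string from the OP.

```perl
use strict;
use warnings;
use MIME::Words qw( decode_mimewords );
use Encode qw( decode );

# Same approach as above: decode each [data, charset] token separately.
sub mime_decode {
    my $decoded = '';
    foreach (decode_mimewords($_[0])) {
        my ($data, $charset) = @$_;
        # Plain-text tokens carry no charset and are appended as-is.
        $decoded .= defined($charset) ? decode($charset, $data) : $data;
    }
    return $decoded;
}

# Hypothetical header value: plain text followed by one UTF-8 encoded word.
my $subj = mime_decode('Re: =?UTF-8?B?w6k=?=');

# Encode on output so the terminal receives UTF-8 bytes, not wide characters.
binmode(STDOUT, ':encoding(UTF-8)');
print "$subj\n";
```

The result is a Perl character string, so it has to be encoded again (here via binmode) before being printed or written to a file.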