Errto has asked for the wisdom of the Perl Monks concerning the following question:
Folks,
I'm writing mail-processing programs that need to take incoming emails and store them for later use. The part I'm trying to nail down right now is interpreting Unicode strings correctly. Here's a little snippet of a program to read an incoming message in the usual (RFC822) format:
    use strict;
    use warnings;
    use Mail::Audit;
    use MIME::Words qw(decode_mimewords);
    use Encode;

    open my $out, '>:utf8', "/some/file.txt";
    my $mail = new Mail::Audit;
    my $subj = $mail->get('Subject');
    $subj = decode_utf8 decode_mimewords($subj);
    print $out "$subj\n";
The part that seems odd to me is the decode_utf8 part. I wasn't expecting to need that. But if I don't have it, I get the wrong output. For example, here is the raw version of the relevant field (the word in question is the Russian word for "Russian", i.e. Русский):
    Subject: =?UTF-8?B?0KDRg9GB0YHQutC40Lk=?=
If I run my program on it without the decode_utf8 part, I get
    Π ΡΡΡΠΊΠΈΠΉ
whereas with it, I get the result I want. Am I doing something wrong here, or do I just misunderstand how these modules work?
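For illustration, here is a minimal sketch of the distinction the question turns on, using only the core MIME::Base64 and Encode modules rather than MIME::Words. The variable names are made up for this example. Base64 decoding yields raw bytes (here, the UTF-8 encoding of the word), and a separate decoding step is needed to turn those bytes into Perl characters, which is presumably why the extra decode_utf8 call makes a difference. The last step assumes the 'MIME-Header' encoding that ships with Encode, which handles the whole encoded-word in one call.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);
use Encode qw(decode);

# Base64 payload taken from the Subject header quoted in the question.
my $payload = '0KDRg9GB0YHQutC40Lk=';

# Step 1: Base64 decoding gives raw bytes (UTF-8 encoded text in this case).
my $bytes = decode_base64($payload);

# Step 2: decoding the bytes as UTF-8 gives a string of Perl characters.
my $chars = decode('UTF-8', $bytes);

# Alternative: Encode's 'MIME-Header' encoding does both steps for a
# complete encoded-word and returns a character string directly.
my $subject = decode('MIME-Header', '=?UTF-8?B?0KDRg9GB0YHQutC40Lk=?=');

# Encode characters as UTF-8 on the way out, as '>:utf8' does in the question.
binmode STDOUT, ':encoding(UTF-8)';
printf "bytes: %d, characters: %d\n", length($bytes), length($chars);
print "$subject\n";
```

The byte string is 14 bytes long and the character string 7 characters. Printing the byte string through a UTF-8 output layer would encode each byte as a separate character, which is one way to end up with mangled text like that shown above.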
Re: reading Base64 encoded Unicode email returns mangled text
by ikegami (Patriarch) on Mar 19, 2007 at 04:45 UTC
by Errto (Vicar) on Mar 19, 2007 at 15:51 UTC
by ikegami (Patriarch) on Mar 19, 2007 at 17:13 UTC

Re: reading Base64 encoded Unicode email returns mangled text
by GrandFather (Saint) on Mar 19, 2007 at 02:51 UTC

Re: reading Base64 encoded Unicode email returns mangled text
by Juerd (Abbot) on Jun 13, 2007 at 19:28 UTC