Folks,
I'm writing mail-processing programs that need to take incoming emails and store them for later use. The part I'm trying to nail down right now is interpreting Unicode strings correctly. Here's a little snippet of a program to read an incoming message in the usual (RFC822) format:
    use strict;
    use warnings;
    use Mail::Audit;
    use MIME::Words qw(decode_mimewords);
    use Encode;

    open my $out, '>:utf8', "/some/file.txt";
    my $mail = new Mail::Audit;
    my $subj = $mail->get('Subject');
    $subj = decode_utf8 decode_mimewords($subj);
    print $out "$subj\n";
The part that seems odd to me is the decode_utf8 part. I wasn't expecting to need that. But if I don't have it, I get the wrong output. For example, here is the raw version of the relevant field (the word in question is the Russian word for "Russian", i.e. Русский):
    Subject: =?UTF-8?B?0KDRg9GB0YHQutC40Lk=?=
If I run my program on it without the decode_utf8 part, I get
    Π ΡΡΡΠΊΠΈΠΉ
whereas with it, I get the result I want. Am I doing something wrong here, or do I just misunderstand how these modules work?
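To take Mail::Audit out of the picture, here is a minimal sketch that feeds the same header value straight to decode_mimewords and compares string lengths before and after decode_utf8. It assumes that decode_mimewords hands back the raw UTF-8 octets of the encoded word rather than a Perl character string; that is my guess at what is going on, not something I have confirmed. The variable names are just for illustration.

```perl
use strict;
use warnings;
use MIME::Words qw(decode_mimewords);
use Encode qw(decode_utf8);

# The raw Subject value from the example message.
my $raw = '=?UTF-8?B?0KDRg9GB0YHQutC40Lk=?=';

# Scalar context: the decoded words joined into a single string.
# Assumption: this is a string of octets, not of characters.
my $octets = decode_mimewords($raw);

# Interpret those octets as UTF-8 to get a character string.
my $chars = decode_utf8($octets);

# Only lengths are printed, so no output layer is needed here.
printf "after decode_mimewords: length %d\n", length $octets;
printf "after decode_utf8:      length %d\n", length $chars;
```

If the assumption holds, the first length should be twice the second, since each Cyrillic letter takes two octets in UTF-8. That would also explain the mangled output: the '>:utf8' layer on the output file would be encoding each octet as if it were a character of its own.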
In reply to reading Base64 encoded Unicode email returns mangled text by Errto