mrguy123 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,
I am trying to send a mail in Chinese, using Net::SMTP. All the Chinese is in UTF-8 format, so I don't need a special encoding. It's enough to add "charset=utf-8" and the body of the mail is sent correctly (example below).
The problem is that although the body is sent correctly, the subject doesn't know that it is supposed to be in UTF-8.
If you run the code below, you should get a mail with malformed letters in the subject and perfect Chinese in the body:
#!/usr/bin/perl
use strict;
use warnings;
use Net::SMTP;

my $SMTP_HOST = 'localhost';
my $from      = 'aaa@hotmail.com';
my $to        = 'bbb@gmail.com';

my $msg = "MIME-Version: 1.0\n"
    . "Content-Type: text/plain; charset=UTF-8\n"
    . "Content-Transfer-Encoding: 8bit\n"
    . "From: $from\n"
    . "To: $to\n"
    . "Date: Sun, 23 Mar 2008 15:47:22 +0200\n"
    ##the subject
    . "Subject: 中åç©æµä¸éè´­èåä¼\n\n"
    ##the body
    . "中åç©æµä¸éè´­èåä¼\n";

#Creating the SMTP object
my $smtp = Net::SMTP->new( $SMTP_HOST, 'Debug' => 1 );

#Sending the data
$smtp->mail($from);
$smtp->recipient($to);
$smtp->data($msg);
$smtp->quit;
Does anyone have any idea how I can "tell" the subject it is also under utf-8?
Thanks a lot,
Guy Naamati

P.S. For some reason the utf-8/Chinese characters get changed when you copy paste them to VI. If you have this problem you can first copy the chars to a notepad and then to VI.

"Happiness is a warm gun"
--John Lennon

Replies are listed 'Best First'.
Re: Encoding Mail Subject
by moritz (Cardinal) on Mar 23, 2008 at 15:53 UTC
    Just as a short explanation why the subject won't work with simple UTF-8 encoding: The subject line is transmitted in the header, and all headers need to be ASCII.

    This is because mail servers mostly care only about the headers, and keeping them in clean ASCII spares MTA authors the woes of handling multiple encodings (and multi-byte encodings).

    P.S. For some reason the utf-8/Chinese characters get changed when you copy paste them to VI. If you have this problem you can first copy the chars to a notepad and then to VI.

    Maybe your vi(m) isn't configured properly? You can use :set encoding=utf8 to force the use of utf-8, and the option fileencoding controls the encoding in which the file is stored on disk. If these options are different, vim converts between them automatically.

    So far I have only had problems with vim when a file had mixed charsets (like latin1 + utf8).

Re: Encoding Mail Subject
by linuxer (Curate) on Mar 23, 2008 at 15:21 UTC

    IIRC you can declare the encoding of the subject inside the subject itself (and then encode the original subject string accordingly):

    See http://en.wikipedia.org/wiki/MIME#Encoded-Word

    Subject: =?utf-8?Q?=C2=A1Hola,_se=C3=B1or!?=

    is interpreted as "Subject: ¡Hola, señor!".
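
    For anyone who wants to see what goes into such an encoded-word, here is a rough sketch of Q-encoding done by hand with only the core Encode module. The sub name q_encode_subject is made up for this example, and the non-ASCII characters are written as \x{..} escapes so the source file itself stays ASCII:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: wrap a character string in a single
# RFC 2047 "Q" encoded-word labelled as utf-8.
sub q_encode_subject {
    my ($text) = @_;

    # Turn the characters into UTF-8 bytes first.
    my $bytes = encode('UTF-8', $text);

    # Escape every byte outside printable ASCII, plus the three
    # characters that have a special meaning inside an encoded-word.
    $bytes =~ s/([^\x20-\x7E]|[=?_])/sprintf('=%02X', ord $1)/ge;

    # Inside a Q encoded-word a space is written as an underscore.
    $bytes =~ tr/ /_/;

    return "=?utf-8?Q?$bytes?=";
}

# U+00A1 is the inverted exclamation mark, U+00F1 is n with tilde.
print "Subject: ", q_encode_subject("\x{A1}Hola, se\x{F1}or!"), "\n";
```

    Keep in mind that RFC 2047 limits an encoded-word to 75 characters, so a long subject has to be split into several encoded-words; this sketch doesn't attempt that.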

    update: wiki url changed to link

      Thanks for the tip. Unfortunately, when I add
      "Subject: =?utf-8?Q?中åç©æµä¸éè´­èåä¼?="
      I still don't get a valid subject (just ??)
        As linuxer showed in the example above, the non-ASCII characters must be encoded too, so the full subject is ASCII only.
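
        A quick way to see the difference is to test the header line for non-ASCII bytes before sending it. This is only an illustrative sketch: header_is_ascii is a name invented here, and the sample bytes are the UTF-8 form of U+4E2D U+6587, not the subject from the original post:

```perl
use strict;
use warnings;

# Hypothetical check: true when every byte of the line is 7-bit ASCII.
sub header_is_ascii {
    my ($line) = @_;
    return $line !~ /[^\x00-\x7F]/ ? 1 : 0;
}

# Raw UTF-8 bytes wrapped in encoded-word markers: still not ASCII.
my $raw = "Subject: =?utf-8?Q?\xE4\xB8\xAD\xE6\x96\x87?=";

# The same bytes written out as =XX escapes: ASCII only.
my $escaped = "Subject: =?utf-8?Q?=E4=B8=AD=E6=96=87?=";

for my $pair ( [ 'raw bytes', $raw ], [ 'escaped', $escaped ] ) {
    my ( $label, $line ) = @$pair;
    printf "%-10s %s\n", $label, header_is_ascii($line) ? 'ASCII only' : 'contains non-ASCII bytes';
}
```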
Re: Encoding Mail Subject
by oko1 (Deacon) on Mar 23, 2008 at 15:44 UTC

    Try MIME::Words.

    #!/usr/bin/perl -w
    use MIME::Words qw/encode_mimewords/;
    print encode_mimewords("Hola, Señor!");

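    The result of the above is shown here. Note that encode_mimewords labels the word as ISO-8859-1 by default, while the bytes in this output (=C3=B1) are UTF-8, so for UTF-8 input you would want to pass Charset => 'UTF-8':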

    Hola, =?ISO-8859-1?Q?Se=C3=B1or?=!
      I will look into MIME::Words, but since it is a non-standard module it will take time to roll it out to customers. I need to look for something standard in the meantime.
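
      One option that needs no extra install, as far as I know, is the Encode module that ships with Perl 5.8 and later: it provides a 'MIME-Header' encoding that produces encoded-words. A minimal sketch, assuming your Encode is recent enough to include it; the sample subject (U+4E2D U+6587) is a placeholder, not the string from the original post:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Placeholder subject, written with escapes so the source stays ASCII.
my $subject = "\x{4E2D}\x{6587}";

# 'MIME-Header' turns a character string into RFC 2047 encoded-words.
my $encoded = encode('MIME-Header', $subject);

# The encoded value can go straight into the header block of the message.
my $header = "Subject: $encoded\n";
print $header;

# Decoding again is a cheap sanity check before mailing anything.
my $decoded = decode('MIME-Header', $encoded);
print "round trip ", ( $decoded eq $subject ? "ok" : "failed" ), "\n";
```

      The string passed to encode must be a Perl character string (decoded text), not raw UTF-8 bytes, otherwise the bytes get encoded a second time.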