ggs has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have a script that sends out email with Asian characters in the message body. The subject line also contains Asian characters (Chinese, to be specific). The message body is displayed in the correct character format, but the subject line is not. Can you please tell me what I need to put in the header so that the subject line also displays the characters properly? Thanks

Replies are listed 'Best First'.
Re: asian characters in email subject line
by shenme (Priest) on Aug 06, 2003 at 00:22 UTC
    You need to specify which mail client programs you will be using or targeting. My client, Eudora, doesn't seem to understand any of the subject lines I get in Chinese. (I made the mistake of using my real email address in Usenet Chinese groups - bu xing, hen duo SPAM) However, I do see a _lot_ of things looking like:
    Subject: =?Big5?B?pXjGV7DP?= email =?Big5?B?pua+UKRqpf4=?= &
     
    Subject: =?GB2312?B?vKu+39T2s6THscGmtcSzr9H0uN+/xry8OqGw0tTIy8jPyMuhsQ==?=
    
    Subject: =?GB2312?B?xOO6w6Oh?=
    
    Subject: =?Big5?B?p1muybNxtfi2WrHQvsc=?=
    
    I beg pardon for whatever these might actually say. ;-)

    Google-wise, I see mention of similar constructs in http://www.ietf.org/rfc/rfc2231.txt
    Yumm, check out RFC 2047 at http://www.ietf.org/rfc/rfc2047.txt
    This second one is apparently the controlling RFC for what you are looking for.

    I'm a pessimist about probabilities; I'm an optimist about possibilities.   --   Lewis Mumford

Re: asian characters in email subject line
by Thelonius (Priest) on Aug 06, 2003 at 03:33 UTC
    There is an Encode::MIME::Header in the Encode package on CPAN, which is included (I think) in Perl 5.8. It will work with UTF-8. I don't see a module that will do other character sets.
      This looks like a winner for decoding. It handles the 'B' that indicates Base64 encoding. And it uses Encode's find_encoding to locate info on, say, 'Big5' charset.
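
      A minimal sketch of that decoding step, assuming a Perl (5.8 or later) whose Encode ships the 'MIME-Header' encoding. The two Subject values are copied from shenme's post above; the variable names are only for illustration, and the output is re-encoded as UTF-8 on the assumption that the terminal displays UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Two of the Subject values quoted earlier in this thread.
my @subjects = (
    '=?GB2312?B?xOO6w6Oh?=',
    '=?Big5?B?p1muybNxtfi2WrHQvsc=?=',
);

# 'MIME-Header' turns each encoded-word into a string of Perl characters,
# looking up the named charset (GB2312, Big5, ...) through Encode.
my @decoded = map { decode('MIME-Header', $_) } @subjects;

for my $text (@decoded) {
    # Assumption: the terminal expects UTF-8, so convert characters to UTF-8 bytes.
    print encode('UTF-8', $text), "\n";
}
```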

      On encoding, the author includes an explicit apology that he only encodes to UTF-8. But he does say he believes all mail clients should handle UTF-8 these days, so maybe this won't be a problem.
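
      For the encoding direction, a hedged sketch along the same lines: Encode's 'MIME-Header' encoding takes a Perl character string and returns a UTF-8 encoded-word suitable for a Subject line. The subject text here is a made-up two-character example written with \x{} escapes so that the encoding of the script file itself doesn't matter.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical subject text: two Chinese characters given as code points.
my $subject = "\x{4F60}\x{597D}";

# Produces an RFC 2047 encoded-word in UTF-8 (Base64 form).
my $header_value = encode('MIME-Header', $subject);

print "Subject: $header_value\n";
```

      The resulting line can be handed to whatever you already use to send the mail; how that module wants its headers passed is outside this sketch.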

      Hmmm, Encode says it requires Perl 5.7.3 or later. ggs, can you use Perl 5.8 to run your program?

Re: asian characters in email subject line
by sgifford (Prior) on Aug 06, 2003 at 05:04 UTC

    Read RFC 2047.

    I don't know any Asian languages, so I can't test, but it appears that you can do this:

    Subject: =?character-set?q?quoted=20printable=20text?=
    Subject: =?character-set?b?base64encodedtext?=
    
    where character-set is the character set you want to use, and q or b indicates which encoding you are using for the text that follows: quoted-printable or base64, respectively.
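
    If the target client needs a charset other than UTF-8, the same format can be put together by hand. This is only a sketch: encoded_word is a hypothetical helper, not part of any module, and it assumes Encode and MIME::Base64 are available (both ship with Perl 5.8).

```perl
use strict;
use warnings;
use Encode qw(encode);
use MIME::Base64 qw(encode_base64);

# Hypothetical helper: build a single RFC 2047 "B" encoded-word.
sub encoded_word {
    my ($charset, $text) = @_;
    my $bytes = encode($charset, $text);     # Perl characters -> bytes in $charset
    my $b64   = encode_base64($bytes, '');   # '' suppresses the trailing newline
    return "=?$charset?B?$b64?=";
}

# Three characters given as code points, encoded in GB2312.
print "Subject: ", encoded_word('GB2312', "\x{4F60}\x{597D}\x{FF01}"), "\n";
```

    Note that RFC 2047 limits each encoded-word to 75 characters, so a long subject has to be split (on character boundaries) into several encoded-words separated by whitespace; the sketch above does not do that.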

    There are examples near the bottom of the RFC, but unfortunately none are for Asian languages.

    Good luck!