PerlMonks  

Re^4: utf8 in perl

by theravadamonk (Scribe)
on Jul 05, 2018 at 16:57 UTC ([id://1217964])


in reply to Re^3: utf8 in perl
in thread utf8 in perl

Thanks a lot for your wonderful code. I can now display some subjects that contain non-ASCII characters, but there are still some I cannot display.

For example, I tried the two subjects below by adding them to your code. They are not decoded, and I can't work out why. Can you try?

my $subject = "Last Day To Enjoy Extra 15% OFF On Everything For NDB Credit Cards (raw: =?utf-8?Q?Last=20Day=20To=20Enjoy=20Extra=2015%=20OFF=20On=20Everything=20For=20NDB=20Credit=20Cards)";
my $subject = "Sing Along and Dance with Desmond De Silva at Pegasus Reef Hotel! (raw: =?utf-8?Q?Sing=20Along=20and=20Dance=20with=20Desmond=20De=20Silva=20at=20Pegasus=20Reef=20Hotel=21?)";

I also tried the following exercise.

I created a file, /tmp/test, with these contents:

Last Day To Enjoy Extra 15% OFF On Everything For NDB Credit Cards (raw: =?utf-8?Q?Last=20Day=20To=20Enjoy=20Extra=2015%=20OFF=20On=20Everything=20For=20NDB=20Credit=20Cards)
Sing Along and Dance with Desmond De Silva at Pegasus Reef Hotel! (raw: =?utf-8?Q?Sing=20Along=20and=20Dance=20with=20Desmond=20De=20Silva=20at=20Pegasus=20Reef=20Hotel=21?)

Then I ran the command below. It does not decode the subjects either; the output is identical to the input:

# cat /tmp/test | perl -MEncode=decode -ne 'print (decode("MIME-Header", "$_"))'
Last Day To Enjoy Extra 15% OFF On Everything For NDB Credit Cards (raw: =?utf-8?Q?Last=20Day=20To=20Enjoy=20Extra=2015%=20OFF=20On=20Everything=20For=20NDB=20Credit=20Cards)
Sing Along and Dance with Desmond De Silva at Pegasus Reef Hotel! (raw: =?utf-8?Q?Sing=20Along=20and=20Dance=20with=20Desmond=20De=20Silva=20at=20Pegasus=20Reef=20Hotel=21?)

Any idea?

Re^5: utf8 in perl
by hippo (Bishop) on Jul 05, 2018 at 17:15 UTC
    It will NOT work. I can't think Why?

    Because the MIME-encoded parts are not properly terminated. GIGO. They should end with ?=. By changing them to ensure that they are properly terminated they work fine.

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Encode qw(encode decode);

    my @subject = (
        "Room Rush \303\242\302\200\302\223 Enjoy 25% off on your stay. (raw: Room Rush =?utf-8?b?4oCT?= Enjoy 25% off on your stay.)",
        "Last Day To Enjoy Extra 15% OFF On Everything For NDB Credit Cards (raw: =?utf-8?Q?Last=20Day=20To=20Enjoy=20Extra=2015%=20OFF=20On=20Everything=20For=20NDB=20Credit=20Cards?=)",
        "Sing Along and Dance with Desmond De Silva at Pegasus Reef Hotel! (raw: =?utf-8?Q?Sing=20Along=20and=20Dance=20with=20Desmond=20De=20Silva=20at=20Pegasus=20Reef=20Hotel=21?=)"
    );

    for my $subj (@subject) {
        my $decoded = decode ("MIME-Header", $subj);
        print encode ("UTF-8", $decoded) . "\n";
    }
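To spot this kind of problem before decoding, one option is to scan the header for encoded-words that start with =?charset?Q? or =?charset?B? but never reach the closing ?=. The following is only a rough sketch: the helper name unterminated_encoded_words is made up for illustration, and the pattern is a simplified reading of the encoded-word syntax, not a full RFC 2047 parser.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of Encode): return every encoded-word
# fragment that opens with "=?charset?B?" or "=?charset?Q?" but is not
# closed by "?=". Simplified: the encoded text is taken to be any run
# of characters other than "?" and whitespace.
sub unterminated_encoded_words {
    my ($header) = @_;
    my @bad;
    while ($header =~ /(=\?[^?\s]+\?[BbQq]\?[^?\s]*)(\?=)?/g) {
        push @bad, $1 unless defined $2;    # $2 is the closing "?="
    }
    return @bad;
}

my @subjects = (
    # Ends with ")" instead of "?=)": unterminated.
    'Credit Cards (raw: =?utf-8?Q?NDB=20Credit=20Cards)',
    # Ends with "?)" instead of "?=)": unterminated.
    'Reef Hotel! (raw: =?utf-8?Q?Pegasus=20Reef=20Hotel=21?)',
    # Properly terminated.
    'Reef Hotel! (raw: =?utf-8?Q?Pegasus=20Reef=20Hotel=21?=)',
);

for my $subj (@subjects) {
    my @bad = unterminated_encoded_words($subj);
    printf "%-12s %s\n", (@bad ? 'UNTERMINATED' : 'ok'), $subj;
}
```

The sample subjects are shortened versions of the ones in this thread. A check like this only reports the problem; whether it is safe to append the missing ?= automatically depends on where the headers come from.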

      Two days ago, I received an email with the subject below:

      my $subject = "=?GB18030?B?XXXXXXXXXX?=";

      When I ran the code on it, I got the error below:

      Unknown encoding "GB18030" at /usr/lib64/perl5/Encode.pm line 174

      I searched for it and learned that GB 18030 is a Chinese government standard.

      I found the two URLs below:

      https://perldoc.perl.org/Encode/CN.html

      https://stackoverflow.com/questions/6105316/how-to-convert-from-gbk-encoding-to-utf-8-encoding-in-perl

      They mention Encode::HanExtra.

      So I added the line below to my code, and it started working:

      use Encode::HanExtra;
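To avoid the fatal "Unknown encoding" error when a module such as Encode::HanExtra is not installed, the charset named in each encoded-word can be checked with Encode's find_encoding before decoding. This is a minimal sketch under assumptions: the helper name unknown_charsets is hypothetical, the charset pattern is simplified, and Encode::HanExtra is loaded only if it happens to be installed.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Encode::HanExtra is a CPAN module, so load it only if it is available.
my $have_hanextra = eval { require Encode::HanExtra; 1 };

# Hypothetical helper: return the charsets named in a header's
# encoded-words that this installation of Encode does not know.
sub unknown_charsets {
    my ($header) = @_;
    my (%seen, @unknown);
    while ($header =~ /=\?([^?\s]+)\?[BbQq]\?/g) {
        my $charset = $1;
        next if $seen{ lc $charset }++;
        # find_encoding returns undef for a name it cannot resolve.
        push @unknown, $charset unless find_encoding($charset);
    }
    return @unknown;
}

my $subject = '=?GB18030?B?XXXXXXXXXX?=';
my @missing = unknown_charsets($subject);

if (@missing) {
    print "Cannot decode charset(s): @missing\n";
    print "Encode::HanExtra is not installed\n" unless $have_hanextra;
}
else {
    print "All charsets in the subject are known to Encode\n";
}
```

With a check like this, a script can skip or log a subject it cannot decode instead of dying in the middle of a mail run.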

      So here is my updated code. I think it's worth sharing:

      #!/usr/bin/perl
      use strict;
      use warnings;
      use Encode qw(encode decode);
      use Encode::HanExtra;
      no warnings 'utf8';

      my @subject = (
          "Room Rush \303\242\302\200\302\223 Enjoy 25% off on your stay. (raw: Room Rush =?utf-8?b?4oCT?= Enjoy 25% off on your stay.)",
          "Last Day To Enjoy Extra 15% OFF On Everything For NDB Credit Cards (raw: =?utf-8?Q?Last=20Day=20To=20Enjoy=20Extra=2015%=20OFF=20On=20Everything=20For=20NDB=20Credit=20Cards?=)",
          "Sing Along and Dance with Desmond De Silva at Pegasus Reef Hotel! (raw: =?utf-8?Q?Sing=20Along=20and=20Dance=20with=20Desmond=20De=20Silva=20at=20Pegasus=20Reef=20Hotel=21?=)",
          "RE: Weekly Report (CommercialLegal) -\303\202\302\240 2 July 2018 to 5 July 2018 (raw: =?UTF-8?Q?RE:_Weekly_Report_=28CommercialLeg?=\t=?UTF-8?Q?al=29_-=C2=A0_2_July_2018_to_5_July_201?=\t=)",
          "How are you",
          "=?GB18030?B?XXXXXXXXXX?="
      );

      for my $subj (@subject) {
          my $subject_decoded = decode("MIME-Header", $subj);
          print $subject_decoded;
          print "\n";
      }
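One thing to watch in a script like this: decode returns Perl character strings, and printing them without an output layer relies on no warnings 'utf8' to silence the "Wide character" warning. Characters in the range U+0080 to U+00FF, such as the no-break space in the Weekly Report subject, may then be written as single bytes instead of UTF-8. A possible alternative, sketched here on the assumption that the output should be UTF-8, is to put an encoding layer on STDOUT. The sample header is a shortened, made-up variant of that subject.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw(decode);

# Assumption: the terminal or file reading our output expects UTF-8.
# With this layer in place, print encodes character strings itself.
binmode(STDOUT, ':encoding(UTF-8)');

# Shortened sample containing a no-break space (=C2=A0 in Q-encoding).
my $raw     = '=?UTF-8?Q?al=29_-=C2=A0_2_July?=';
my $decoded = decode('MIME-Header', $raw);

# $decoded is a character string; the layer turns U+00A0 into the
# two bytes C2 A0 on output.
print "$decoded\n";
```

This is equivalent in effect to calling encode("UTF-8", ...) before each print, as the earlier script in this thread does, but it keeps the encoding decision in one place.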

      Have a nice day, all Perl Monks.

      Thanks for enlightening me. I was busy, so it took me a long time to reply; sorry about that. Thanks to everyone for their wonderful efforts.
