Your comment made me analyze my program more closely, and it turned out that the error with Encode::decode('utf8', ...) occurs when it attempts to decode a string that was read from a file containing already processed data (wide characters). This does not seem to bother utf8::decode, but it makes Encode::decode fail in the second round (why not in the first round???).

Decoded twice with utf8::decode:

    sub my_decode {
        my $what=shift;
        my $comment=shift;
        my $what_old=$what;

        open STDERR,'>>decode_dump.txt';

        print STDERR "\n---------------------\n$comment\n---------------------\nORIGINAL:\n---------------------\n";
        Dump $what;

        utf8::decode($what);$what_decoded_once=$what;
        #$what_decoded_once=Encode::decode('utf8',$what);

        print STDERR "\n---------------------\nDECODED ONCE:\n---------------------\n";
        Dump $what_decoded_once;

        utf8::decode($what);$what_decoded_twice=$what;
        #$what_decoded_twice=Encode::decode('utf8',$what_decoded_once);

        print STDERR "\n---------------------\nDECODED TWICE:\n---------------------\n";
        Dump $what_decoded_twice;

        exit if( ($what_old ne $what_decoded_twice) && (!($what=~/[a-z]+/)) && ($comment eq 'player white name') );

        return $what_decoded_twice;
    }

    ########################################################
    output:
    ########################################################

    ---------------------
    player white name
    ---------------------
    ORIGINAL:
    ---------------------
    SV = PV(0x256fa7c) at 0xe3a62c
      REFCNT = 1
      FLAGS = (PADMY,POK,pPOK)
      PV = 0x12d45bc "\320\245\320\260\321\207\320\260\321\202\321\203\321\200 \320\241\321\203\320\263\321\217\320\275"\0
      CUR = 25
      LEN = 28

    ---------------------
    DECODED ONCE:
    ---------------------
    SV = PV(0x256fb14) at 0xe424cc
      REFCNT = 1
      FLAGS = (POK,pPOK,UTF8)
      PV = 0x25ab78c "\320\245\320\260\321\207\320\260\321\202\321\203\321\200 \320\241\321\203\320\263\321\217\320\275"\0 [UTF8 "\x{425}\x{430}\x{447}\x{430}\x{442}\x{443}\x{440} \x{421}\x{443}\x{433}\x{44f}\x{43d}"]
      CUR = 25
      LEN = 28

    ---------------------
    DECODED TWICE:
    ---------------------
    SV = PV(0x256fae4) at 0xe4251c
      REFCNT = 1
      FLAGS = (POK,pPOK,UTF8)
      PV = 0x25ab9cc "\320\245\320\260\321\207\320\260\321\202\321\203\321\200 \320\241\321\203\320\263\321\217\320\275"\0 [UTF8 "\x{425}\x{430}\x{447}\x{430}\x{442}\x{443}\x{440} \x{421}\x{443}\x{433}\x{44f}\x{43d}"]
      CUR = 25
      LEN = 28

Decoded twice with Encode::decode('utf8', ...):

    sub my_decode {
        my $what=shift;
        my $comment=shift;
        my $what_old=$what;

        open STDERR,'>>decode_dump.txt';

        print STDERR "\n---------------------\n$comment\n---------------------\nORIGINAL:\n---------------------\n";
        Dump $what;

        #utf8::decode($what);$what_decoded_once=$what;
        $what_decoded_once=Encode::decode('utf8',$what);

        print STDERR "\n---------------------\nDECODED ONCE:\n---------------------\n";
        Dump $what_decoded_once;

        #utf8::decode($what);$what_decoded_twice=$what;
        $what_decoded_twice=Encode::decode('utf8',$what_decoded_once);

        print STDERR "\n---------------------\nDECODED TWICE:\n---------------------\n";
        Dump $what_decoded_twice;

        exit if( ($what_old ne $what_decoded_twice) && (!($what=~/[a-z]+/)) && ($comment eq 'player white name') );

        return $what_decoded_twice;
    }

    ########################################################
    output:
    ########################################################

    ---------------------
    player white name
    ---------------------
    ORIGINAL:
    ---------------------
    SV = PV(0x256f0cc) at 0xe3a62c
      REFCNT = 1
      FLAGS = (PADMY,POK,pPOK)
      PV = 0x2620f0c "\320\245\320\260\321\207\320\260\321\202\321\203\321\200 \320\241\321\203\320\263\321\217\320\275"\0
      CUR = 25
      LEN = 28

    ---------------------
    DECODED ONCE:
    ---------------------
    SV = PV(0x256f0b4) at 0xe424bc
      REFCNT = 1
      FLAGS = (POK,pPOK,UTF8)
      PV = 0x2771b74 "\320\245\320\260\321\207\320\260\321\202\321\203\321\200 \320\241\321\203\320\263\321\217\320\275"\0 [UTF8 "\x{425}\x{430}\x{447}\x{430}\x{442}\x{443}\x{440} \x{421}\x{443}\x{433}\x{44f}\x{43d}"]
      CUR = 25
      LEN = 28

    Tk::Error: Cannot decode string with wide characters at C:/strawberry/perl/lib/Encode.pm line 174.
     Encode::decode at C:/strawberry/perl/lib/Encode.pm line 174
     Library::my_decode at Library.pm line 321
     GameBrowser::analyze_games_index_inner at GameBrowser.pm line 981
     GameBrowser::__ANON__ at GameBrowser.pm line 828
     GameBrowser::for_all_players at GameBrowser.pm line 813
     GameBrowser::analyze_games_index at GameBrowser.pm line 831
     GameBrowser::load_games_index at GameBrowser.pm line 1041
     Tk callback for .toplevel.frame.frame.button
     Tk::__ANON__ at C:/strawberry/perl/site/lib/Tk.pm line 251
     Tk::Button::butUp at C:/strawberry/perl/site/lib/Tk/Button.pm line 175
     <ButtonRelease-1>
     (command bound to event)
    Terminating on signal SIGHUP(1)
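
As far as I can tell, the first round works because its input is still a plain octet string (no UTF8 flag in the ORIGINAL dump), while the second round is handed a string that already holds characters above 0xFF, which cannot be read as octets. Below is a minimal, self-contained sketch of that behaviour. The sample string and the decode_once helper are hypothetical and not part of my program, and the utf8::is_utf8 check is only a rough heuristic for "already decoded", not a reliable test.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical sample data: the UTF-8 octets of the two characters
# U+0425 U+0430 (the first two letters of the name in the dumps).
my $octets = "\xD0\xA5\xD0\xB0";

# Round 1: octets in, characters out. This is what decode is meant for.
my $chars = Encode::decode('utf8', $octets);

# Round 2: the input already holds code points above 0xFF, so it cannot
# be read as octets; Encode::decode dies instead of returning a value.
my $second_ok = eval { Encode::decode('utf8', $chars); 1 } ? 1 : 0;

# utf8::decode on the same kind of input returns false and changes nothing.
my $copy = $chars;
my $utf8_ok = utf8::decode($copy) ? 1 : 0;

# Hypothetical guard: decode only strings not yet flagged as characters.
sub decode_once {
    my ($s) = @_;
    return utf8::is_utf8($s) ? $s : Encode::decode('utf8', $s);
}

printf "chars after round 1: %d\n", length $chars;
printf "Encode::decode round 2 succeeded: %d\n", $second_ok;
printf "utf8::decode round 2 returned true: %d\n", $utf8_ok;
printf "guarded decode is stable: %d\n", decode_once($chars) eq $chars ? 1 : 0;
```

The exact wording of the error seems to differ between Encode versions, so the sketch only checks whether the call died. The cleaner fix is probably to decode exactly once, at the point where the data is read, so that a guard like decode_once is not needed at all.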

In reply to Re^2: Utf8 experts help! by chessgui
in thread Utf8 experts help! by chessgui
