in reply to Perl/Tk: utf8 in Text widget?

OK, I'm in UTF-8 hell here. I thought that I would at least verify that I could read and write UTF-8 strings to my database, but while working on that, I discovered that I can't even get a trivial Perl program to behave in a way I understand.

Here's my test program:

    #!/usr/bin/perl -w
    use strict;
    use utf8;
    use Encode;

    my ( $codepoint, $test_string, $test_string_dec );

    binmode STDOUT, ":utf8";

    $codepoint = ord('我');
    print "Codepoint of character is $codepoint\n";

    $test_string = "Here's a test string with 我\n";
    print "test_string is $test_string\n";

    $test_string_dec = decode('utf8', $test_string);

(In my original, I had the literal Chinese character 我 where you see the big numeric constant.)

There are at least two issues here:

  1. The presence of the "use utf8" pragma: I gather from Googling that this is no longer required in current versions of Perl (I'm using 5.8.8). But if I leave it out, the codepoint reported by "ord" is 250 instead of 25105. Surely Perl should know that the Chinese character is Unicode?
  2. The "binmode" statement: This interacts with the "use utf8" pragma in the following ways:
    both present: correct codepoint, correct output character, no warning
    pragma present, binmode omitted: correct codepoint, correct output character, "Wide character in print" warning
    pragma omitted, binmode present: wrong codepoint, wrong output character, no warning
    both omitted: wrong codepoint, output shows correct Chinese character, no warning
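
To take my editor and terminal out of the picture, here is a minimal sketch that builds the same character two ways: as the raw UTF-8 bytes (which, as far as I can tell, is what Perl sees for the literal when "use utf8" is absent, assuming the source file is saved as UTF-8) and as a single decoded character (what it sees with the pragma). The helper name "describe" and the variable names are made up for illustration, and the byte values assume UTF-8 encoding of U+6211.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# The three UTF-8 bytes that encode U+6211. Assumption: this is what a
# UTF-8 source file holds for the literal when "use utf8" is absent.
my $bytes = "\xE6\x88\x91";

# The same text as one character, which is how the literal is read
# when "use utf8" is in effect.
my $chars = decode('UTF-8', $bytes);

# Hypothetical helper: describe a string by its length and first ordinal.
sub describe {
    my ($s) = @_;
    return sprintf "length=%d ord=%d", length($s), ord($s);
}

print "byte string:      ", describe($bytes), "\n";
print "character string: ", describe($chars), "\n";

# Encoding the character string gives back the original bytes, which is
# roughly the job binmode(STDOUT, ":utf8") does on output.
print "round trip ok\n" if encode('UTF-8', $chars) eq $bytes;

# Encoding the byte string treats each byte as its own character
# (the "pragma omitted, binmode present" case), so every byte is
# encoded a second time and the output is garbled.
my $double = encode('UTF-8', $bytes);
print "double-encoded length: ", length($double), "\n";
```

If this reading is right, it would also account for the "both omitted" case: the bytes go out untouched, and a UTF-8 terminal reassembles them into the correct character even though Perl itself only ever saw three separate bytes.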

So, I'm really confused. What does the "use utf8" pragma actually do in Perl 5.8.8? Why do I get the correct character showing on output even when I get the "Wide character in print" message?

--- Marais