John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

Consider the following program:
    use strict;
    use warnings;
    use utf8;
    use Tk;

    ${^WIDE_SYSTEM_CALLS} = 1;

    my $s = "This is a TM symbol => '\x{2122}',\nU+2122";
    my $MW = new MainWindow;
    my $hello = $MW->Button(
        -text    => $s,
        -command => sub { print STDOUT $s; },
    );
    $hello->pack;
    MainLoop;
Running this under Windows 2000, the character is not shown on the button (nor is a square default character indicating that it's not in the font, and this character should be in all normal Windows fonts). Rather, I see three characters: â (E2), an unprintable one, and ¢ (A2). Clearly it is interpreting the UTF-8 encoding as three 8-bit characters instead.
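To make the symptom concrete, here is a minimal sketch of what happens to U+2122 when its UTF-8 bytes are read as single-byte characters. It assumes Perl 5.8 or later (for the Encode module) and uses ISO-8859-1 as a stand-in for whatever 8-bit code page the widget actually applies; the variable names are only illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# U+2122 TRADE MARK SIGN as a one-character Perl string.
my $tm = "\x{2122}";

# Encoding to UTF-8 yields three bytes.
my $bytes = encode('UTF-8', $tm);
my $hex   = join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;

# Misreading those bytes as an 8-bit encoding (assumed here: ISO-8859-1)
# produces three separate characters instead of one.
my $misread = decode('ISO-8859-1', $bytes);

print "UTF-8 bytes: $hex\n";
print "characters after misreading: ", length($misread), "\n";
```

The middle byte, 84, is a control character in ISO-8859-1, which would explain the unprintable character between â and ¢.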

Is Tk simply UTF-8 hostile? If the string for the button went through the normal mechanism that the core uses, it would be converted to UTF-16 and passed to the wide form of TextOut (or SetText or whatever) upon seeing the WIDE flag set.

Is it possible to use Unicode with Tk with the right incantations? Any plans for the future?

—John

Re: Unicode in Tk
by bastard (Hermit) on Jul 03, 2001 at 04:14 UTC
      My copy of Tk is freshly downloaded from ActiveState's PPM. The .pm file is dated 29-June-2000 (almost exactly one year old) and says it's $Tk::VERSION= '800.022'.

      I'm familiar with Perl's issues with UTF-8, and have reported many bugs myself. In my example program, the \x{xxxx} construct is used, which does work (it puts the corresponding UTF-8 into the string's representation) and forces the string into character orientation (I didn't verify that in this test program; I know it from other utf8 experience and from bugs in functions that don't respect it properly).

      So what is that “latest” version, and does it work on the Win32 platform?

      —John
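The claim above, that \x{...} puts a string into character orientation, was left unverified in the post. A minimal sketch of one way to check it follows, assuming Perl 5.8.1 or later, where utf8::is_utf8 is available (it was not in the 5.6 series current when this was written). The sample string is hypothetical.

```perl
use strict;
use warnings;

# Four ASCII characters plus U+2122.
my $s = "TM: \x{2122}";

# length() counts characters when the string is character-oriented.
my $chars = length $s;

# utf8::is_utf8 reports whether Perl's internal UTF-8 flag is set.
my $flagged = utf8::is_utf8($s) ? 1 : 0;

# For comparison: the byte length of the UTF-8 encoding of the same text.
my $copy = $s;
utf8::encode($copy);
my $bytes = length $copy;

print "characters: $chars, flagged: $flagged, UTF-8 bytes: $bytes\n";
```

If the character count is smaller than the byte count and the flag is set, Perl is treating the string as characters, so any three-byte rendering would have to arise downstream of the string itself.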

        Facing the same issue in October 2002, I came upon your post. I then found the following post from Tk's author on comp.lang.perl.tk (I hope such cross-posting is okay).
        Brig


        From: Nick Ing-Simmons (nick@ing-simmons.net)
        Subject: Win32 & Unicode
        Newsgroups: comp.lang.perl.tk
        Date: 2002-10-03 12:18:30 PST

        Just to let you know Tk804.??? has just displayed its 1st widget on NT4/SP6 with perl5.8.0. There are still a pile of issues to work through but I hope I have broken the back of the Unicode port to Windows. A question for Sarathy (or anyone else that knows) - Win32's wide char is 16-bits - it is obviously "little endian" but is it UCS-2 or UTF-16 i.e. does it have surrogates or is Win32 limited to U+FFFF ?

        Nick Ing-Simmons
        http://www.ni-s.u-net.com/
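The UCS-2 versus UTF-16 question in the post comes down to how code points above U+FFFF are represented. The sketch below does not answer what Win32 does; it only illustrates the difference, assuming Perl 5.8 or later with the Encode module. The choice of U+1D11E as the sample character is arbitrary.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+1D11E (MUSICAL SYMBOL G CLEF) lies above U+FFFF.
my $clef = "\x{1D11E}";

# UTF-16 represents it as two 16-bit code units, a surrogate pair.
my $utf16le = encode('UTF-16LE', $clef);

# Unpack as little-endian 16-bit values and format each as hex.
my @units = map { sprintf '%04X', $_ } unpack 'v*', $utf16le;

print "UTF-16LE code units: @units\n";
```

A strict UCS-2 system has no surrogate pairs and so cannot represent this character at all, which is the limitation the question asks about.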
