cr8josh has asked for the wisdom of the Perl Monks concerning the following question:

Greetings

I am using Perl and Tk to read text from a .txt file (DOS/Windows ANSI), display it, edit it, and save it back to a .txt file. The file is in French and contains the character œ. The œ does not show up in the Tk window (it disappears), but when I save the text back out, it appears in the saved .txt file. So it seems to be there, just not being displayed. I wonder if it may be a font issue, but I've tried many standard fonts: Courier, Arial, Helvetica. I am already changing the system's locale to French. Any thoughts? I want to avoid Unicode if possible.

Thanks in advance

Here's some sample code:

use warnings;
use strict;
use Tk;
use Tk::TextUndo;

my $main = MainWindow->new;
my $middle = $main->Frame(-borderwidth=>2,-relief=>'groove')->grid(-row=>0,-column=>1,-sticky=>'nsew');
my $middlebox = $middle->TextUndo (
    -font => 'arial 10',
    -wrap => 'word',
    -spacing2 => 10,
    -spacing3 => 30,
    -takefocus => 1,
    -background => 'white',
    -width => 70,
    -height => 10,
)->grid(-row=>0, -column=>0, -sticky=>'nsew');
$middlebox->insert('end',"This is a test of the œ character");
MainLoop;

Replies are listed 'Best First'.
Re: Perl TK character disappearing
by zentara (Cardinal) on Jun 26, 2012 at 21:51 UTC
    Try putting "use utf8;" in your program, or show us a simple running code example.

    I'm not really a human, but I play one on earth.
    Old Perl Programmer Haiku ................... flash japh

      Thanks! "use utf8" doesn't help, and I've added sample code to the original post. The output has an invisible character where the French character should be...

        Hi, I hit a temporary glitch, then this started working for me. I don't know what to tell you. I suspect you have a file encoding mismatch somewhere. When I download your code example, the œ doesn't show up in my editor or in Tk. I then manually copied and pasted the œ from your HTML into the Tk program, and it ran. Maybe some Unicode guru knows the reason?
        #!/usr/bin/perl
        use warnings;
        use strict;
        use Tk;
        use Tk::TextUndo;
        use utf8;

        my $main = MainWindow->new;
        my $middle = $main->Frame(-borderwidth=>2,-relief=>'groove')->grid(-row=>0,-column=>1,-sticky=>'nsew');
        my $middlebox = $middle->TextUndo (
            -font => 'arial 20',
            -wrap => 'word',
            -spacing2 => 10,
            -spacing3 => 30,
            -takefocus => 1,
            -background => 'white',
            -width => 70,
            -height => 10,
        )->grid(-row=>0, -column=>0, -sticky=>'nsew');
        $middlebox->insert('end',"This is a test of the œ character\n");
        print ord('œ'),"\n"; # prints 339
        print chr(339),"\n";
        $middlebox->insert('end', ord('œ') );
        $middlebox->insert('end',"\n");
        $middlebox->insert('end', chr(339) );
        MainLoop;

Re: Perl TK character disappearing
by zentara (Cardinal) on Jun 27, 2012 at 19:52 UTC
    Hi, another thing to try, if you are reading in files with extended characters, is to use the Encode module to decode the input as utf8.
    use Encode;
    my $buf;
    open(my $fh, '<', 'slurp1') or die "Cannot open slurp1: $!";
    read($fh, $buf, -s $fh);   # -s takes the opened handle $fh, not the bareword FH
    close $fh;
    my $file = decode('utf8', $buf);
    # now just insert $file into the Tk::Text widget
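
Since the original post describes the file as DOS/Windows ANSI, a variant of the snippet above could decode it as cp1252 instead of utf8 and encode it the same way on save. This is only a minimal sketch under that assumption; `read_text` and `write_text` are hypothetical helper names, not part of Tk or Encode, and the encoding name would need to match whatever the file really uses.

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole file, decoding its bytes from the
# given encoding (assumed here to be cp1252) into Perl characters.
sub read_text {
    my ($path, $encoding) = @_;
    open(my $fh, "<:encoding($encoding)", $path)
        or die "Cannot open $path for reading: $!";
    local $/;                  # slurp mode: read the whole file at once
    my $text = <$fh>;
    close $fh;
    return $text;
}

# Hypothetical helper: write characters back out in the same encoding,
# so the saved file stays in its original code page.
sub write_text {
    my ($path, $encoding, $text) = @_;
    open(my $fh, ">:encoding($encoding)", $path)
        or die "Cannot open $path for writing: $!";
    print {$fh} $text;
    close $fh or die "Cannot close $path: $!";
    return;
}
```

With helpers like these, the string returned by `read_text($path, 'cp1252')` could be passed to the widget's `insert`, and the edited text handed to `write_text` when saving.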


      Thanks so much for the suggestion. I tried that, with my test code, and now it says "This is a test of the ? character", with the ? in a black diamond...!

      If I add "use utf8" at the top, I get an error when it hits that character, saying "Malformed UTF-8 character (unexpected continuation byte 0x9c, with no preceding start byte) at..."

      Finally, pasting the character from the PerlMonks page into my editor (Komodo) doesn't fix it for me...

      Would love to hear any other thoughts!

      use Encode;
      use Tk;
      use Tk::TextUndo;

      my $main = MainWindow->new;
      my $string = decode ('utf8', "This is a test of the œ character");
      my $middle = $main->Frame(-borderwidth=>2,-relief=>'groove')->grid(-row=>0,-column=>1,-sticky=>'nsew');
      my $middlebox = $middle->Text (
          -font => 'system 10',
          -wrap => 'word',
          -spacing2 => 10,
          -spacing3 => 30,
          -takefocus => 1,
          -background => 'white',
          -width => 70,
          -height => 10,
      )->grid(-row=>0, -column=>0, -sticky=>'nsew'); #pack(-expand => 1, -fill => 'both');
      $middlebox->insert('end',"$string");
      MainLoop;
        You are telling Perl that your script is UTF-8, but it is not. It contains the byte 0x9c, which is the cp1252 encoding of "œ". So use
        use encoding 'cp1252';
        instead, or save your file as UTF-8.
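
The mismatch described in this reply can be seen without Tk by decoding the single byte 0x9c both ways with the Encode module. This is just an illustrative sketch; the variable names are made up for the example. Note also that the `encoding` pragma has been deprecated in later Perl releases, so decoding explicitly as below (or saving the script as UTF-8 with `use utf8`) may be the safer route on a modern Perl.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The single byte a cp1252-saved file holds for the oe ligature.
my $bytes = "\x9C";

# Decoded as cp1252, the byte becomes the character U+0153 (decimal 339).
my $as_cp1252 = decode('cp1252', $bytes);
printf "cp1252: U+%04X\n", ord $as_cp1252;

# Decoded as UTF-8, a lone 0x9C is malformed; with the default CHECK
# setting Encode substitutes U+FFFD, the replacement character.
my $as_utf8 = decode('UTF-8', $bytes);
printf "UTF-8:  U+%04X\n", ord $as_utf8;
```

The first line should report U+0153, matching the 339 printed earlier in the thread, and the second U+FFFD, which is the "?" in a black diamond that showed up in the widget.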