cmv has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks-

I have users who are doing cut-n-paste into a Tk Entry widget in my perl app and causing some issues.

Apparently things like Outlook, Notepad, etc, can be configured to change an ASCII single quote (') into smart quotes aka Unicode Right Single Quote and Left Single Quote. When they cut-n-paste this into my entry widget, I see goofy things happen later in my script. I'm including a test script below, note the difference between what shows up in the message box and what the print STDERR outputs.

What is going on here, and what are some intelligent ways for me to handle it?

I think I want to simply translate any Unicode single quotes (left or right) to the ASCII single quote - no? What if they start pasting other Unicode characters? Looking for experience & guidance here.
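
A minimal sketch of that translation idea, assuming the string is already a decoded character string (as it should be when it comes from the entry's get()). The name ascii_quotes is a hypothetical helper, not something from my script, and it only covers the four common smart quotes; anything else pasted in passes through untouched.

```perl
use strict;
use warnings;

# Hypothetical helper: fold the common "smart" quotes to ASCII.
# Expects a decoded character string, not raw UTF-8 bytes.
sub ascii_quotes {
    my ($text) = @_;
    $text =~ tr/\x{2018}\x{2019}/'/;    # left/right single quotes -> '
    $text =~ tr/\x{201C}\x{201D}/"/;    # left/right double quotes -> "
    return $text;
}

# The result is pure ASCII, so no output layer is needed to print it.
print ascii_quotes("It\x{2019}s a \x{201C}test\x{201D}"), "\n";
```

The replacement list in each tr is shorter than the search list, so Perl repeats its last character and both quote variants map to the same ASCII quote.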

Thanks

-Craig

#!/opt/homebrew/bin/perl
use strict;
use warnings;
use Tk;

my $main = new Tk::MainWindow();
my $entry_test = $main->Entry(-text => "single-quotes: ’'")->pack();
my $btn_test = $main->Button(
    -text    => ' Test ',
    -command => sub {
        my $text = $entry_test->get();
        &msg($main, $text) if ($text);
    }
)->pack();

sub msg {
    my ($parent, $msg) = @_;
    print STDERR "Messaging: $msg\n";
    $parent->messageBox(
        -title => 'Event', -message => $msg,
        -type  => 'ok',    -icon    => 'info'
    );
}

MainLoop();

Re: Tk Entry & Right Single Quote
by choroba (Cardinal) on Apr 11, 2024 at 23:34 UTC
    You're using a non-ASCII character in the code. You should tell Perl about the encoding:
    use utf8;

    Also, you're printing the character to STDERR. Either set the binmode of STDERR to understand UTF-8, or encode the character.

    binmode STDERR, ':encoding(UTF-8)';
    # or
    use Encode qw{ encode };
    say STDERR encode('UTF-8', "Messaging: $msg");

    If you're on MSWin (guessing from words like Outlook and Notepad), you might need to switch the terminal's codepage to UTF-8. I don't have such a machine handy, so you'll have to Google that yourself.
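
    Put together, a minimal sketch of the points above, assuming the file is saved as UTF-8. The string is the one from the original test script; the length check is only there to illustrate what use utf8 changes and is not part of any fix.

    ```perl
    use strict;
    use warnings;
    use utf8;                         # the source file itself is UTF-8
    use Encode qw{ encode };

    my $msg = "single-quotes: ’'";    # U+2019 followed by an ASCII quote

    # With "use utf8" the literal is 17 characters. Without it, Perl sees
    # the three UTF-8 bytes of U+2019 separately and reports 19.
    printf STDERR "length: %d\n", length $msg;

    # STDERR has no encoding layer here, so encode explicitly on output.
    print STDERR encode('UTF-8', "Messaging: $msg\n");
    ```

    Use either binmode or encode on a given handle, not both, otherwise the text gets encoded twice.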

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      I did test on my Win10 machine, here is how to change the code page in a command window:
      C:...Documents\PerlProjects\Monks>chcp 65001
      Active code page: 65001
      With that and the above suggestions, and making sure my editor saved the program text as UTF-8 instead of ASCII, I got the correct display from STDERR in the command window. I did not get the message box to display properly from the program variable but did get cut-n-paste Chinese characters into the Tk window to show up correctly in the message box.
      ++choroba

      You hit it on the head. The use utf8; did the trick, and everything worked as expected.

      I am running on MacOSX, so the terminal output was correct for me, but now I know how to fix that when needed.

      I clearly need to learn more about Perl's Unicode handling.

      Many thanks!

Re: Tk Entry & Right Single Quote
by NERDVANA (Priest) on Apr 12, 2024 at 07:22 UTC
      ++NERDVANA

      You gave me good advice for both my "Make the problem go away..." and "Geez, I need to learn more about Unicode..." problems.

      I will be doing both, thanks for the pointers.

My Solution: Tk Entry & Right Single Quote
by cmv (Chaplain) on Apr 15, 2024 at 22:37 UTC
    Folks-

    As usual, the final solution was a bit more complicated than I thought, but I wanted to provide a final update in case others have the same issue.

    The attached code shows what I did in order to translate Unicode characters into ASCII with Text::Unidecode. This will translate the entire string on each keystroke, and works with cut-n-paste as well.

    NOTE: when using a text variable for the text in the entry, you cannot simply re-assign text to it in the validatecommand() sub. One trick is to use afterIdle() to do that later.

    Thanks again to all for the help!

    Craig

    #!/opt/homebrew/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;
    use Tk;
    use utf8;
    use Text::Unidecode;

    my $mw = MainWindow->new();
    my $textvar;
    my $e = $mw->Entry(
        -textvariable    => \$textvar,
        -validate        => 'key',
        -validatecommand => sub {
            my ($new, $changed, $old, $ix, $type) = @_;
            return 1 if (!defined($changed));
            return 1 if ($new eq "") or ($type < 0);
            $mw->afterIdle(sub {
                $textvar = unidecode($new);
                print STDERR "ASCII: $textvar\n";
            });
            return 1;
        },
    )->pack();

    MainLoop;
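
    To see what unidecode() does with pasted text outside of Tk, here is a small sketch. It assumes Text::Unidecode is installed from CPAN, and the sample strings are made up for illustration. The exact replacement chosen for each character comes from the module's tables, so check the output on your own data.

    ```perl
    use strict;
    use warnings;
    use Text::Unidecode;    # CPAN module, the same one used in the script

    # Illustrative inputs only; \x{...} escapes avoid needing "use utf8".
    my @pasted = (
        "It\x{2019}s here",          # right single quote
        "\x{201C}quoted\x{201D}",    # left/right double quotes
        "caf\x{E9}",                 # e with acute accent
    );

    for my $in (@pasted) {
        my $out = unidecode($in);    # ASCII approximation of the input
        print "$out\n";              # safe to print without an encoding layer
    }
    ```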
      Folks-

      One other thing...

      If you are trying to create a Perl executable of the above script using PAR (with either pp or pp_autolink), the following will not work:

      pp_autolink -v -o squote.exe squote.pl

      This is because the translation tables are not being found and included. To fix this issue, use the following command:

      pp_autolink -v -M Text::Unidecode:: -o squote.exe squote.pl

      The trailing :: tells PAR to include both the module Text::Unidecode itself (Text/Unidecode.pm) and the directory Text/Unidecode, which contains all of the necessary translation tables.

      I wanted to document this here as well, so when I forget later on - I'll have a chance of catching it again.

      -Craig