CrashBlossom has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

Back with another Unicode question.

I am running Strawberry Perl 5.30 under Windows 11.

I want to be able to use a Tk Entry widget to accept and display strings containing Unicode characters. Here is an example:

    use strict;
    use warnings;
    use Tk;

    # displayed in entry field as: "09 Handel_ Water Music Suite - BourÃ©.m4a"
    my $str = "09 Handel_ Water Music Suite - Bouré.m4a";

    my $mw = MainWindow->new(-title => " Unicode Test");
    $mw->Entry(-textvariable => \$str,
               -width        => 50,
               -font         => "{Lucida Fax} 12 bold",
              )->pack(qw(-side left -expand 1 -fill x));

    MainLoop();

As indicated in the comment, the string is not displayed correctly in the entry field. Does anyone know how to fix this?

Replies are listed 'Best First'.
Re: How to display/accept unicode chars in a tk entry widget
by kcott (Archbishop) on Jul 01, 2023 at 03:35 UTC

    G'day CrashBlossom,

    Whenever your source code contains Unicode characters (i.e. outside the 7-bit ASCII range) you should use the utf8 pragma.

    Running your code as posted, I got "Ã©" where "é" should have been.

    Adding "use utf8;" after "use warnings;" fixed this.

    — Ken
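A minimal sketch of why the pragma matters, assuming the script file is saved as UTF-8: without "use utf8" Perl reads the literal "é" as two separate bytes, which is what the widget then shows as two characters. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Assumption: this file is saved as UTF-8, so "é" is stored as bytes 0xC3 0xA9.
my $without;
{
    # No utf8 pragma in this scope: the literal is read byte by byte.
    $without = length("é");
}

my $with;
{
    use utf8;    # lexically scoped: the literal is decoded as UTF-8
    $with = length("é");
}

print "without utf8: $without\n";    # 2
print "with utf8:    $with\n";       # 1
```

The pragma only affects literals in the source file; it does nothing for strings that arrive at run time.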

      Problem solved - thanks!

Re: How to display/accept unicode chars in a tk entry widget
by soonix (Chancellor) on Jul 01, 2023 at 13:01 UTC

    I prefer charnames over utf8.

    Sure, the latter allows you to use characters from the extended character set directly in your source code, even in function or variable names.

    However, when working in a mixed environment, or collaborating/sharing over different architectures, many editors (and viewers) have different display settings. Most of these don't understand the in-band signalling that is "use utf8", especially if web viewers or Perlmonks come into the mix.

    Of course, if all is correctly configured,
        use utf8;
        my $str = "09 Handel_ Water Music Suite - Bouré.m4a";
    is easier to read than
        # use charnames if your Perl is older than 5.16
        my $str = "09 Handel_ Water Music Suite - Bour\N{LATIN SMALL LETTER E WITH ACUTE}.m4a";
    This is also helpful for languages that your coworker or maintainer is unfamiliar with. For me,
    • "मुक्ता" is more or less unreadable for me, because I don't know the letters,
    • "मुक्ता" makes me at least a bit confident that no bits were mangled,
    • but "\N{DEVANAGARI LETTER MA}\N{DEVANAGARI VOWEL SIGN U}\N{DEVANAGARI LETTER KA}\N{DEVANAGARI SIGN VIRAMA}\N{DEVANAGARI LETTER TA}\N{DEVANAGARI VOWEL SIGN AA}" gives me at least an idea what it is 😉
    Of course, YMMV, as always.
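As a rough illustration of the charnames approach, the sketch below spells out every non-ASCII character of a string in \N{...} form using charnames::viacode, so the result can be pasted into source code that stays pure ASCII. The variable names are made up for the example.

```perl
use strict;
use warnings;
use charnames ':full';    # enables \N{...} on older Perls and provides viacode()

# Build the string without any non-ASCII characters in the source.
my $str = "Bour\x{E9}.m4a";

# Replace each non-ASCII character with its \N{NAME} spelling.
(my $spelled = $str) =~ s/([^\x00-\x7F])/'\N{' . charnames::viacode(ord $1) . '}'/ge;

print "$spelled\n";    # Bour\N{LATIN SMALL LETTER E WITH ACUTE}.m4a
```

The same substitution works for longer runs such as the Devanagari example, at the cost of a much longer line.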

      Thanks for your response.

      I use charnames when I create the string myself, but in this case the string was a filename which was read from a directory.

        The pragma utf8 fixes your example, but I would not expect it to help with a file name that you read. In that case, you probably have to decode the file name.
        Bill
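A hedged sketch of the decoding step Bill mentions, using the core Encode module. The byte string below is a hypothetical stand-in for a name returned by readdir, and 'UTF-8' is an assumption: the encoding a system actually uses for file names varies, and on Windows it may be the ANSI code page (for example cp1252) instead.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for a file name as readdir() might return it:
# raw bytes, here the UTF-8 encoding of "Bouré.m4a".
my $raw     = "Bour\xC3\xA9.m4a";
my $raw_len = length($raw);

# Assumption: the bytes are UTF-8. Substitute the encoding your system uses.
my $name = decode('UTF-8', $raw);

printf "%d bytes -> %d characters\n", $raw_len, length($name);    # 10 bytes -> 9 characters
```

Once decoded, the string holds one character per letter and can be handed to the Entry widget through -textvariable in the same way as the literal in the original example.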