in reply to uc and German eszett "ß"

That is the very character used as an example in the doc for lc - does that help to clarify things in terms of how the locale, utf-8 flag, bytes pragma etc. affect it all?


🦛

Re^2: uc and German eszett "ß"
by LanX (Saint) on Feb 01, 2022 at 21:19 UTC
    > does that help to clarify things in terms of how the locale, utf-8 flag, bytes pragma etc. affect it all?

    Hmm ... I'm still confused. It seems lc works as expected (ẞ becomes ß), while uc hasn't been updated yet (ß becomes SS rather than ẞ), which is counterintuitive.

    use strict;
    use warnings;
    use utf8;
    use open qw(:std :utf8);

    $\ = "\n";
    print "Perlversion $]";

    my $SS = "\x{1E9E}";

    no locale;
    print "=== local off LANG=$ENV{LANG}";
    print "* TEST UC";
    print "$_ -> ", ord($_) for "ß", "\Uß", uc("ß");
    print "* TEST LC";
    print "$_ -> ", ord($_) for $SS, "\L$SS", lc($SS);

    use locale;
    print "=== local on LANG=$ENV{LANG}";
    print "* TEST UC";
    print "$_ -> ", ord($_) for "ß", "\Uß", uc("ß");
    print "* TEST LC";
    print "$_ -> ", ord($_) for $SS, "\L$SS", lc($SS);

    Can't do lc("\x{1E9E}") on non-UTF-8 locale; resolved to "\x{1E9E}". at d:/tmp/job/eszet.pl line 33.
    Can't do lc("\x{1E9E}") on non-UTF-8 locale; resolved to "\x{1E9E}". at d:/tmp/job/eszet.pl line 33.
    Perlversion 5.032001
    === local off LANG=DEU
    * TEST UC
    ß -> 223
    SS -> 83
    SS -> 83
    * TEST LC
    ẞ -> 7838
    ß -> 223
    ß -> 223
    === local on LANG=DEU
    * TEST UC
    ß -> 223
    ß -> 223
    ß -> 223
    * TEST LC
    ẞ -> 7838
    ẞ -> 7838
    ẞ -> 7838

    NB: the warnings happen only when "use locale" is in effect, which also disables all case conversion of these two characters here.
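One possible way around this, sketched under the assumption that the locale is only wanted for things like number and date formatting: Perl 5.16 and later accept "use locale ':not_characters'", which leaves LC_CTYPE and LC_COLLATE out of the locale handling, so uc and lc fall back to Unicode rules. The helper name codepoints is made up for this illustration.

```perl
use strict;
use warnings;
use feature 'unicode_strings';   # Unicode case rules even for non-UTF8-flagged strings
use locale ':not_characters';    # locale stays active, but not for LC_CTYPE/LC_COLLATE

# Hypothetical helper: render a string as a list of U+XXXX code points.
sub codepoints {
    my ($string) = @_;
    return join ' ', map { sprintf 'U+%04X', ord } split //, $string;
}

my $sharp_s     = "\x{DF}";      # ß
my $cap_sharp_s = "\x{1E9E}";    # ẞ

print 'uc: ', codepoints($sharp_s),     ' -> ', codepoints(uc $sharp_s),     "\n";
print 'lc: ', codepoints($cap_sharp_s), ' -> ', codepoints(lc $cap_sharp_s), "\n";
```

With the character categories excluded, this should behave like the "local off" half of the test above, without the "non-UTF-8 locale" warning.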

    Furthermore, if ẞ shows up garbled above, that is a display problem of the monastery's code blocks; the character prints fine inside my emacs.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

    update

    I suppose Perl follows the "Unicode rules", and those haven't been updated yet to the new "German rules" ...
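If the capital ẞ is wanted from uc anyway, a minimal sketch of a workaround is to map ß to ẞ before uc gets a chance to expand it to SS. The name uc_de is hypothetical, not something Perl or CPAN provides under that name here.

```perl
use strict;
use warnings;
use utf8;                          # string literals below contain ß and ẞ
use feature 'unicode_strings';     # Unicode case rules for all strings

# Hypothetical helper: uppercase using the capital eszett instead of "SS".
sub uc_de {
    my ($text) = @_;
    $text =~ tr/ß/ẞ/;              # map U+00DF to U+1E9E before uc() can expand it
    return uc $text;               # uc() leaves U+1E9E unchanged
}

binmode STDOUT, ':encoding(UTF-8)';
print uc("Straße"), "\n";          # STRASSE (built-in full case mapping)
print uc_de("Straße"), "\n";       # STRAẞE
print lc(uc_de("Straße")), "\n";   # straße (round-trips, unlike lc(uc(...)))
```

The point of the design is only the ordering: the tr has to run first, because once uc has produced SS the original ß cannot be recovered.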