> does that help to clarify things in terms of how the locale, utf-8 flag, bytes pragma etc. affect it all?
Hmm ... I'm still confused. It seems lc already handles the capital ẞ correctly while uc hasn't been updated yet, which is counterintuitive.
use strict;
use warnings;
use utf8;
use open qw(:std :utf8);
$\="\n";
print "Perlversion $]";
my $SS = "\x{1E9E}";    # LATIN CAPITAL LETTER SHARP S (ẞ)
no locale;              # first pass: case mapping without locale rules
print "=== local off LANG=$ENV{LANG}";
print "* TEST UC";
print "$_ -> ",ord($_) for "ß", "\Uß", uc("ß");
print "* TEST LC";
print "$_ -> ",ord($_) for $SS, "\L$SS", lc($SS);
use locale;             # second pass: case mapping follows the current locale
print "=== local on LANG=$ENV{LANG}";
print "* TEST UC";
print "$_ -> ",ord($_) for "ß", "\Uß", uc("ß");
print "* TEST LC";
print "$_ -> ",ord($_) for $SS, "\L$SS", lc($SS);
Can't do lc("\x{1E9E}") on non-UTF-8 locale; resolved to "\x{1E9E}". at d:/tmp/job/eszet.pl line 33.
Can't do lc("\x{1E9E}") on non-UTF-8 locale; resolved to "\x{1E9E}". at d:/tmp/job/eszet.pl line 33.
Perlversion 5.032001
=== local off LANG=DEU
* TEST UC
ß -> 223
SS -> 83
SS -> 83
* TEST LC
ẞ -> 7838
ß -> 223
ß -> 223
=== local on LANG=DEU
* TEST UC
ß -> 223
ß -> 223
ß -> 223
* TEST LC
ẞ -> 7838
ẞ -> 7838
ẞ -> 7838
NB: the warnings appear only when locale is in use, which deactivates all case conversion here.
Furthermore, any garbling of ẞ is a display problem of the monastery's code blocks; the character prints fine inside my Emacs.
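If the locale is wanted for other things (number formatting, dates) but not for case mapping, one possible way around the warnings is the ':not_characters' form of the locale pragma. This is only a minimal sketch, assuming Perl 5.16 or later; I have not run it on the Windows box above, so I'm not quoting any output for it.

```perl
use strict;
use warnings;
use utf8;
use open qw(:std :utf8);

my $SS = "\x{1E9E}";    # LATIN CAPITAL LETTER SHARP S

{
    # Locale stays active for everything except character handling,
    # so uc/lc should use Unicode rules instead of the locale's.
    use locale ':not_characters';

    print "$_ -> ", ord($_), "\n" for "ß", uc("ß");
    print "$_ -> ", ord($_), "\n" for $SS, lc($SS);
}
```

The block scope keeps the pragma from leaking into the rest of the script.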
update
I suppose Perl follows "Unicode rules", but those haven't been updated yet to the new "German rules" ...
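Until then, the "new German" uppercase can be had by mapping the character by hand before calling uc. A minimal sketch; uc_german is a made-up helper name, not anything that ships with Perl:

```perl
use strict;
use warnings;
use utf8;
use open qw(:std :utf8);

# Hypothetical helper: uppercase a string, turning ß (U+00DF) into the
# capital ẞ (U+1E9E) instead of the "SS" that uc produces by default.
sub uc_german {
    my ($text) = @_;
    $text =~ tr/\x{DF}/\x{1E9E}/;   # ß -> ẞ before uppercasing
    return uc $text;                # ẞ is already uppercase, uc leaves it alone
}

print uc_german("straße"), "\n";
```

This relies on lc already mapping ẞ back to ß (as the LC test above shows), so a round trip through lc gives the original spelling again.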