in reply to Re^3: Seeking Perl docs about how UTF8 flag propagates
in thread Seeking Perl docs about how UTF8 flag propagates
Not sure about lc(), but here's another case where the closely related uc() behaves differently depending on the UTF8 flag:
$ascii = "\x{df}";           # UTF8 flag off
chop($utfer = "\x{100}");    # empty string, UTF8 flag on
$utf = $ascii . $utfer;      # same character, UTF8 flag on
print uc($_) for ($ascii, $utf);
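With the UTF8 flag on, "\x{df}" is interpreted as the Unicode code point U+00DF, the lowercase German eszett (ß), which uppercases to "SS". With the flag off (and without use feature 'unicode_strings' or a locale in effect), it is treated as a native 8-bit character outside the ASCII range, which has no case mapping, so it does not change.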
This is a rare case where changing the case of a string also changes its length.
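For anyone who wants to see the flag and the length side by side, here is a minimal sketch. It assumes Perl 5.12 or later (for the unicode_strings feature), no use locale, and no use v5.12-style declaration that would switch the feature on implicitly. The variable names are mine, and utf8::upgrade() stands in for the chop trick above.

```perl
use strict;
use warnings;

my $bytes    = "\x{df}";     # one character, UTF8 flag off
my $upgraded = $bytes;
utf8::upgrade($upgraded);    # same character, UTF8 flag on

# Without 'unicode_strings', uc() follows the internal representation.
my $uc_bytes    = uc($bytes);       # unchanged, length 1
my $uc_upgraded = uc($upgraded);    # "SS", length 2

# With 'unicode_strings', both strings get Unicode case rules.
my ($uc_bytes_us, $uc_upgraded_us);
{
    use feature 'unicode_strings';
    $uc_bytes_us    = uc($bytes);
    $uc_upgraded_us = uc($upgraded);
}

# Report the flag and the length of the uppercased result for each string.
printf "flag=%d  uc length=%d\n", (utf8::is_utf8($_) ? 1 : 0), length(uc($_))
    for ($bytes, $upgraded);
```

Inside the block with the feature enabled, both strings should uppercase to "SS", which is probably the behaviour most people want in new code.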
Re^5: Seeking Perl docs about how UTF8 flag propagates
by hippo (Archbishop) on May 17, 2023 at 06:46 UTC
by haj (Vicar) on May 17, 2023 at 06:51 UTC
Re^5: Seeking Perl docs about how UTF8 flag propagates
by haj (Vicar) on May 17, 2023 at 06:34 UTC