in reply to Re^12: Seeking Perl docs about how UTF8 flag propagates (Terminology)
in thread Seeking Perl docs about how UTF8 flag propagates

> There was a bug where \w didn't work for characters in U+0080..U+00FF sometimes. This was fixed 12 years ago in 2011. Add use v5.14; to get the fix.

I call anything without pragmas the default. But we can agree that the default is buggy.
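A minimal sketch of the behaviour being discussed, assuming a perl of 5.14 or later (needed for the /u modifier) run with no `use v5.14;` and no `use feature 'unicode_strings';` in scope. The variable names are only illustrative. It stores the same one-character string in both internal formats and asks whether /\w/ matches.

```perl
use strict;
use warnings;

# Deliberately no "use v5.14" and no "use feature 'unicode_strings'".
my $native   = "\xE9";       # U+00E9, stored without the UTF8 flag
my $upgraded = "\xE9";
utf8::upgrade($upgraded);    # same character, now stored with the UTF8 flag

my $same_string      = $native eq $upgraded ? 1 : 0;  # storage does not affect eq
my $native_matches   = $native   =~ /\w/    ? 1 : 0;  # default rules, no flag
my $upgraded_matches = $upgraded =~ /\w/    ? 1 : 0;  # default rules, flag set
my $native_with_u    = $native   =~ /\w/u   ? 1 : 0;  # Unicode rules forced

print "same string:        $same_string\n";
print "native   =~ /\\w/:   $native_matches\n";
print "upgraded =~ /\\w/:   $upgraded_matches\n";
print "native   =~ /\\w/u:  $native_with_u\n";
```

Without the pragmas, the two equal strings should give different answers for /\w/ (0 for the native one, 1 for the upgraded one), while /u should match in both cases.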

Cheers Rolf
(addicted to the 𐍀𐌴𐍂𐌻 Programming Language :)
Wikisyntax for the Monastery


Re^14: Seeking Perl docs about how UTF8 flag propagates (Terminology)
by ikegami (Patriarch) on May 23, 2023 at 01:27 UTC

    It doesn't "default to ASCII". It works against decoded text, aka a string of Unicode Code Points. Always. Even without pragmas. This can be demonstrated using "\N{U+100}" =~ /\w/ (which matches). You need to use /a to limit it to the ASCII range.
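The claim above can be checked with a short script. This is only a sketch, assuming perl 5.14 or later (for the /a modifier); the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $char = "\N{U+100}";    # LATIN CAPITAL LETTER A WITH MACRON

my $default_match = $char =~ /\w/  ? 1 : 0;   # no pragmas, no modifiers
my $ascii_match   = $char =~ /\w/a ? 1 : 0;   # /a restricts \w to ASCII
my $ascii_letter  = "A"   =~ /\w/a ? 1 : 0;   # plain ASCII still matches under /a

print "U+0100 =~ /\\w/:   $default_match\n";
print "U+0100 =~ /\\w/a:  $ascii_match\n";
print "'A'    =~ /\\w/a:  $ascii_letter\n";
```

The first and third lines should print 1 and the second 0.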

      we were talking about encoded text without UTF8 flag, but ...

      DB<3> p utf8::is_utf8("\N{U+100}")
      1

      Please, let's stop it here.

      Cheers Rolf

        we were talking about encoded text without UTF8 flag

        Absolutely not. You replied to "Terms describing what a string represents (unrelated to storage format)" (emphasis added).