Re^12: Seeking Perl docs about how UTF8 flag propagates (Terminology)

That's wrong.

It doesn't "default to ASCII". It works against decoded text, i.e. a string of Unicode code points. Always. This can be demonstrated using "\N{U+100}" =~ /\w/ (which matches). You need to use /a to limit it to the ASCII range.
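A minimal sketch of that demonstration, assuming Perl 5.14 or later (where the /a modifier is available). The variable names are only for illustration:

```perl
use strict;
use warnings;

# U+0100 is LATIN CAPITAL LETTER A WITH MACRON: a word character outside ASCII.
my $ch = "\N{U+100}";

# Without /a, \w follows Unicode rules for this code point.
my $unicode_match = $ch =~ /\w/  ? 1 : 0;

# With /a, \w is restricted to [A-Za-z0-9_].
my $ascii_match   = $ch =~ /\w/a ? 1 : 0;

print "without /a: $unicode_match\n";   # 1
print "with /a:    $ascii_match\n";     # 0
```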

Text encoded using ASCII happens to work because, for any text that ASCII can represent, $x eq encode( "US-ASCII", $x ).

Text encoded using iso-latin-1 happens to work because $x eq encode( "iso-latin-1", $x ) (though do see last paragraph).

Those are just side effects of \w working on decoded text.
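The two equalities can be checked with a short sketch like the following. It assumes the core Encode module and uses made-up sample strings. The UTF-8 line is added only as a contrast, to show an encoding where the encoded bytes no longer equal the decoded text:

```perl
use strict;
use warnings;
use Encode qw( encode );

my $ascii  = "abc_123";       # every character is in the ASCII range
my $latin1 = "caf\x{E9}";     # U+00E9 is representable in ISO-8859-1

# For these two encodings each code point maps to the byte with the same number,
# so the encoded string compares equal to the decoded one.
my $ascii_same  = $ascii  eq encode( "US-ASCII",   $ascii )  ? 1 : 0;
my $latin1_same = $latin1 eq encode( "iso-8859-1", $latin1 ) ? 1 : 0;

# Contrast: UTF-8 encodes U+00E9 as two bytes, so the equality breaks.
my $utf8_same   = $latin1 eq encode( "UTF-8",      $latin1 ) ? 1 : 0;

print "US-ASCII:   $ascii_same\n";    # 1
print "iso-8859-1: $latin1_same\n";   # 1
print "UTF-8:      $utf8_same\n";     # 0
```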

There was a bug where \w sometimes didn't work for characters in U+0080..U+00FF. This was fixed 12 years ago, in 2011, with Perl 5.14. Add use v5.14; to get the fix.
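A sketch of what "sometimes" means here, assuming Perl 5.14 or later and no locale pragma in effect. The same character is stored two ways, and the match is tried with and without use v5.14; in scope:

```perl
use strict;
use warnings;

my $native   = "\xE9";        # U+00E9, stored without the UTF8 flag
my $upgraded = "\xE9";
utf8::upgrade($upgraded);     # same code point, stored with the UTF8 flag

my ( $old_native, $old_upgraded, $new_native, $new_upgraded );

{
    # No version declaration: the result depends on the internal storage.
    $old_native   = $native   =~ /\w/ ? 1 : 0;
    $old_upgraded = $upgraded =~ /\w/ ? 1 : 0;
}

{
    use v5.14;    # enables the unicode_strings feature for this block
    $new_native   = $native   =~ /\w/ ? 1 : 0;
    $new_upgraded = $upgraded =~ /\w/ ? 1 : 0;
}

print "no pragma: native=$old_native upgraded=$old_upgraded\n";   # 0 and 1
print "use v5.14: native=$new_native upgraded=$new_upgraded\n";   # 1 and 1
```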


Re^13: Seeking Perl docs about how UTF8 flag propagates (Terminology) by LanX (Saint) on May 22, 2023 at 20:50 UTC
> There was a bug where \w didn't work for characters in U+0080..U+00FF sometimes. This was fixed 12 years ago in 2011. Add use v5.14; to get the fix.

I call default anything without pragmas. But we can agree that the default is buggy.

Cheers Rolf
(addicted to the 𐍀𐌴𐍂𐌻 Programming Language :)
Wikisyntax for the Monastery
Re^14: Seeking Perl docs about how UTF8 flag propagates (Terminology) by ikegami (Patriarch) on May 23, 2023 at 01:27 UTC
It doesn't "default to ASCII". It works against decoded text, i.e. a string of Unicode code points. Always. Even without pragmas. This can be demonstrated using `"\N{U+100}" =~ /\w/` (which matches). You need to use `/a` to limit it to the ASCII range.
Re^15: Seeking Perl docs about how UTF8 flag propagates (Terminology) by LanX (Saint) on May 23, 2023 at 10:18 UTC
We were talking about encoded text without the UTF8 flag, but in the debugger, `DB<3> p utf8::is_utf8("\N{U+100}")` prints `1`.

Please, let's stop it here.

Cheers Rolf
(addicted to the 𐍀𐌴𐍂𐌻 Programming Language :)
Wikisyntax for the Monastery
Re^16: Seeking Perl docs about how UTF8 flag propagates (Terminology) by ikegami (Patriarch) on May 23, 2023 at 14:40 UTC