in reply to Re^2: How to set the UTF8 flag?
in thread How to set the UTF8 flag?
Many, including the OP apparently, assume the UTF8 flag indicates whether the characters of the string are Code Points or bytes. It does not.
It's a bit that indicates the internal storage format of the string.
When 0, the string is stored in the "downgraded" format.
The characters are stored as an array of C char objects.
When 1, the string is stored in the "upgraded" format.
The characters, whatever they may be, are encoded using utf8 (Perl's lax superset of UTF-8, not strict UTF-8).
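A minimal sketch of the point above: the same three-character string is forced into each storage format with utf8::downgrade and utf8::upgrade, and the two copies still compare equal. The variable names are only for illustration.

```perl
use strict;
use warnings;

# Three characters, all with values below 256 (U+00E9, U+0074, U+00E9).
my $down = "\xE9t\xE9";
my $up   = $down;

utf8::downgrade($down);   # force the "downgraded" storage format
utf8::upgrade($up);       # force the "upgraded" storage format

# The flag differs between the two copies...
my $down_flag = utf8::is_utf8($down) ? 1 : 0;
my $up_flag   = utf8::is_utf8($up)   ? 1 : 0;

# ...but to Perl code they are the same string of the same length.
my $same_string = ($down eq $up) ? 1 : 0;

printf "downgraded flag=%d, upgraded flag=%d\n", $down_flag, $up_flag;
printf "eq=%d, lengths=%d and %d\n", $same_string, length($down), length($up);
```

The flag reports 0 for one copy and 1 for the other, yet eq and length cannot tell them apart, which is why the flag says nothing about whether the characters are Code Points or bytes.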
Because the flag is internal, you have no reason to access it unless you are debugging an XS module (which must deal with both formats) or Perl itself. In such cases, you can use the aforementioned utf8::is_utf8 or Devel::Peek's Dump. C code has access to the analogous SvUTF8 and sv_dump.
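For debugging, the two Perl-level tools mentioned above might be used roughly like this. It is a sketch only; the exact layout of Dump's output varies between Perl versions, so none is reproduced here.

```perl
use strict;
use warnings;
use Devel::Peek qw(Dump);

my $s = "\xE9";        # one character, U+00E9
utf8::upgrade($s);     # switch to the "upgraded" storage format

# Boolean view of the flag.
my $flag = utf8::is_utf8($s) ? 1 : 0;
print "is_utf8: $flag\n";

# Full view of the scalar's internals, written to STDERR.
# For an upgraded string, the FLAGS line includes UTF8.
Dump($s);
```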