in reply to Perl strings questions

I don't see much mention here of Perl's UTF8 flag, even though it is discussed in the perldoc for Encode. The essence of UTF-8 encoding is that, if (!) you know to treat the string as "UTF-8 encoded," it provides a way to encode Unicode code points (characters ...) in a byte stream. But Perl is much older than UTF-8, so it might encounter what are intended to be plain byte streams which coincidentally contain "UTF indicator" bytes. Perl implemented a hidden flag to indicate whether eq should or should not use Unicode-aware comparisons against the values.
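To make the "code points in a byte stream" part concrete, here is a minimal sketch using the core Encode module. The variable names are only illustrative, and it covers just the encode/decode round trip, not the flag itself:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# One character: U+00E9 (LATIN SMALL LETTER E WITH ACUTE).
my $chars = "\x{E9}";

# Encoding turns the character into its UTF-8 byte sequence (0xC3 0xA9).
my $bytes = encode('UTF-8', $chars);

# Decoding is the step where you state "these bytes are UTF-8".
my $round_trip = decode('UTF-8', $bytes);

printf "characters: %d, bytes: %d\n", length $chars, length $bytes;
print "round trip ok\n" if $round_trip eq $chars;
```

Nothing in the two bytes themselves says they are UTF-8. That knowledge has to come from the code doing the decoding.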

Re^2: Perl strings questions
by choroba (Cardinal) on Jun 03, 2021 at 14:47 UTC
    Using the flag in Perl code is a code smell. You can set the flag on any string, and you can clear it on any string. The flag doesn't know where the value came from or what encoding it originally used. The function is_utf8 is also misleadingly named: it actually tells you whether the value uses the wide-character representation internally. See #131685 for a related discussion.
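    A small sketch of that point, using only the built-in utf8:: functions (the variable names are illustrative): the flag can be flipped without changing the string's value.

```perl
use strict;
use warnings;

# Four characters, all below U+0100, so Perl stores them as single bytes.
my $s = "caf\x{E9}";
my $t = $s;                       # independent copy

# Switch $t to the internal wide-character (UTF-8) representation.
utf8::upgrade($t);

my $flag_s = utf8::is_utf8($s) ? 1 : 0;
my $flag_t = utf8::is_utf8($t) ? 1 : 0;
my $same   = $s eq $t ? 1 : 0;    # compares characters, not storage

print "flag on \$s: $flag_s, flag on \$t: $flag_t, equal: $same\n";

# The flag can be cleared again because every character fits in a byte.
my $downgraded = utf8::downgrade($t) ? 1 : 0;
my $flag_after = utf8::is_utf8($t) ? 1 : 0;
print "downgraded: $downgraded, flag on \$t now: $flag_after\n";
```

    Both strings hold the same four characters throughout. Only the internal storage differs, which is why the flag says nothing about where the data came from.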

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re^2: Perl strings questions
by Your Mother (Archbishop) on Jun 03, 2021 at 01:22 UTC

    Side-note for the side-show: Perl and Unicode were both born in 1987.