http://qs1969.pair.com?node_id=831921


in reply to Re: Why is utf8 flag set after Encode::decode of pure ASCII?
in thread Why is utf8 flag set after Encode::decode of pure ASCII?

Many people give this advice: ignore Perl's internal encoding. Fine advice, at least in production. But Perl performs so many magic behind-the-scenes Unicode conversions that, during development, one often needs to look at this flag just to figure out which end is up. Grr.
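For anyone who does need to peek during development, here is a minimal sketch of inspecting the flag with the core utf8::is_utf8 function. The flag_state helper is a hypothetical name made up for illustration. Whether the flag ends up on after decoding ASCII-only input may depend on your Encode version, so the script prints the state instead of assuming it.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: report the internal flag without touching the string.
sub flag_state {
    my ($label, $str) = @_;
    return sprintf "%-9s utf8 flag %s", $label,
        utf8::is_utf8($str) ? 'on' : 'off';
}

my $bytes   = "plain ASCII";            # literal with no `use utf8`
my $decoded = decode('UTF-8', $bytes);  # same text, run through Encode

# utf8::upgrade changes only the internal representation, never the
# characters, so it shows the flag flipping on its own.
my $upgraded = $bytes;
utf8::upgrade($upgraded);

print flag_state('bytes',    $bytes),    "\n";
print flag_state('decoded',  $decoded),  "\n";
print flag_state('upgraded', $upgraded), "\n";

# The flag is bookkeeping: all three strings still compare equal.
print "all three compare equal\n"
    if $bytes eq $decoded && $bytes eq $upgraded;
```

The point of the sketch is that the flag says how Perl stores the string internally, not what the string contains, which is why it is useful for debugging and poor as a basis for program logic.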

Re^3: Why is utf8 flag set after Encode::decode of pure ASCII?
by creamygoodness (Curate) on Mar 30, 2010 at 19:00 UTC

    It's not just Perl -- it's also CPAN modules, particularly XS modules. And I agree -- it's foolhardy to pretend that the SVf_UTF8 flag doesn't exist. It's almost impossible to troubleshoot UTF-8 problems in a large system without snooping it. The system is prone to silent failure, and when something goes wrong and you need to track down where the silent failure originates, you need to look at that flag.

    (Die $YAML::Syck::ImplicitUnicode, die die die.)

Re^3: Why is utf8 flag set after Encode::decode of pure ASCII?
by Joost (Canon) on Apr 01, 2010 at 00:36 UTC
    Agreed, but it's the only sane advice you will get.

    The reason you sometimes have to look at the utf8 flag is that some code (mostly CPAN modules) does not follow that sane advice.

    If you want to read/write text in a portable manner, or convert between text and the binary (integer) representation of characters, you have to specify which encoding you're expecting. If you don't, your code will only work reliably on 7-bit ASCII text, and even that only on most platforms. That's the executive summary, and that's really all there is to it.
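As a rough sketch of what "specify the encoding" looks like in practice, the script below writes and reads a file through an explicit :encoding(UTF-8) layer and converts between characters and bytes with Encode. The sample string and the temporary file are assumptions chosen for illustration; UTF-8 stands in for whatever encoding your data actually uses.

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use File::Temp qw(tempfile);

# Six characters: c, a, f, U+00E9, space, U+263A.
my $text = "caf\x{e9} \x{263a}";

# Writing: name the encoding on the handle instead of relying on defaults.
my ($fh, $path) = tempfile(UNLINK => 1);
binmode($fh, ':encoding(UTF-8)') or die "binmode: $!";
print {$fh} $text;
close $fh or die "close: $!";

# Reading: name the encoding again, so the bytes come back as characters.
open my $in, '<:encoding(UTF-8)', $path or die "open $path: $!";
my $round_trip = do { local $/; <$in> };
close $in or die "close: $!";

# Converting in memory between characters and bytes, encoding named both ways.
my $octets = encode('UTF-8', $text);
my $chars  = decode('UTF-8', $octets);

printf "%d characters, %d bytes\n", length $text, length $octets;
print "file round trip ok\n"   if $round_trip eq $text;
print "memory round trip ok\n" if $chars eq $text;
```

Nothing here looks at the utf8 flag. Once every boundary names its encoding, the internal representation stops mattering to the calling code.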