in reply to utf8 char or binary string detection

How can I automatically determine whether the string is bytes, and only then run the decode command?

What you're supposed to do is fix the code that puts stuff into $val, at the point where it puts stuff into $val, rather than trying to work around it later on. Get it at the source.
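One way to read "get it at the source" is to decode at the point where the data enters the program. The sketch below assumes the values are filenames read from a directory and that the file system stores them as UTF-8; the helper name list_dir and the bytes/text hash keys are made up for illustration. It keeps the raw bytes for file operations and a decoded copy for display.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: read a directory once, and decode right there.
# 'bytes' is the name exactly as the file system returned it (use it for open, stat, ...).
# 'text' is a decoded copy meant only for display; invalid sequences become U+FFFD.
sub list_dir {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @entries = map  { +{ bytes => $_, text => decode('UTF-8', $_) } }
                  grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;
    return @entries;
}
```

With this split, later code never has to guess what a value is: anything under bytes goes to the file system, anything under text goes to the screen.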

Re^2: utf8 char or binary string detection
by Anonymous Monk on Nov 07, 2015 at 08:43 UTC
Re^2: utf8 char or binary string detection
by igoryonya (Pilgrim) on Nov 07, 2015 at 11:33 UTC
    Can't fix that. It's not an error in the program.
    When I get filenames, I have to use them in their byte representation instead of utf8, because conversion to utf8 can break some filenames.
    Mostly, everything else needs to be in utf8.
    I use file system path modules to manipulate directory and file names, etc.
    When I print those path names to the screen without decoding them, non-Latin letters show up as garbage. In certain cases, after manipulating the path names, those modules keep the string as bytes but turn the variable's utf8 flag on, so the standard utf8 check reports the contents as already decoded even though they are still bytes.
    I have a subroutine that I wrote for outputting stuff to stdout. I do not use print directly, because my subroutine handles everything automatically, so that I can use one program in CGI, on terminal STDOUT, and in a GUI without rewriting.
    I need a way in that subroutine to detect whether the variable it received is utf8 or a byte string.
    I used to use: use encoding 'utf8', STDOUT => 'utf8';, and it worked automatically, but now, since perl version 5.20 or something, the encoding pragma is finally deprecated, so I have to think of another way to solve this encoding issue.
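A minimal sketch of the kind of check described above, assuming the byte strings are UTF-8 encoded. The helper name to_text is hypothetical. Because the utf8 flag cannot be trusted here, the sketch ignores it: if every code point fits in a byte and the bytes form valid UTF-8, it treats the value as undecoded bytes. This is a heuristic, and a genuine character string such as "Ã©" would be misread as encoded bytes.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: return a character string suitable for printing.
sub to_text {
    my ($val) = @_;
    my $bytes = $val;    # work on a copy; downgrade and decode may modify it

    # Code points above 0xFF cannot be raw bytes, so the value is already text.
    return $val unless utf8::downgrade($bytes, 1);

    # Strict decode: dies on invalid UTF-8, in which case the value is left alone.
    my $text = eval { decode('UTF-8', $bytes, FB_CROAK) };
    return defined $text ? $text : $val;
}

# An explicit output layer instead of the deprecated encoding pragma.
binmode(STDOUT, ':encoding(UTF-8)');
print to_text("\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82"), "\n";
```

The output subroutine could call to_text on each argument before printing, while file operations keep using the original byte strings.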