Re: How to handle encoding for STDERR

Well, your question is very hard to understand. What exactly is your problem? Read perllocale. It states:

By default, Perl ignores the current locale. The "use locale" pragma tells Perl to use the current locale for some operations

So, unless you say "use locale", your locale settings are ignored by your Perl program. But they are not ignored by the shell that was used to run it. So, if the shell is configured to receive UTF-8 text from programs, then your Perl program should produce UTF-8; otherwise you will see garbage.

Now, you get garbage. First, you should figure out where the garbage comes from. You have the code 'warn "<UTF-8 string>"'; do I assume correctly that in place of "<UTF-8 string>" you have some text with UTF-8 characters? Does this string show up correctly?

If only $! shows up as garbage, have you checked whether it contains octets or has the utf8 flag set? As far as I understand it, binmode configures the filehandle to convert all data from the internal encoding (marked by the presence of the utf8 flag) to a sequence of octets in the appropriate encoding. So, if the data is already a sequence of octets, the additional conversion will mess it up.
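To make that check concrete, here is a minimal sketch (the variable names and the sample character are only for illustration). It shows how to look at the utf8 flag with utf8::is_utf8, and what happens when octets that are already UTF-8 get encoded a second time, which is roughly what I would expect a ':utf8' layer to do to them:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# One Japanese character (U+3042) as UTF-8 octets and as a perl string.
my $octets = encode('UTF-8', "\x{3042}");   # three bytes: E3 81 82
my $chars  = decode('UTF-8', $octets);      # one character

printf "octets: utf8 flag=%d, length=%d\n",
    (utf8::is_utf8($octets) ? 1 : 0), length $octets;
printf "chars:  utf8 flag=%d, length=%d\n",
    (utf8::is_utf8($chars) ? 1 : 0), length $chars;

# Encoding the octets again treats each byte as a separate character,
# so every byte above 0x7F becomes two bytes: this is the "garbage".
my $double = encode('UTF-8', $octets);
printf "double-encoded length=%d\n", length $double;
```

You can run the same utf8::is_utf8 test on "$!" right after a failing call to see which of the two cases you have.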

Personally, I avoid using binmode for setting up UTF-8 handling. I just follow a simple rule: output only octets in the appropriate encoding (normally UTF-8). So I use Encode::decode to convert incoming octets to strings as perl understands them, and Encode::encode to convert perl strings back to octets for output.

If there is 'use utf8', then any string literals in the script are converted to the internal format that perl understands, so they have to be converted to octets before they are passed outside of the Perl program.
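A small sketch of that rule, assuming the script file itself is saved as UTF-8 and the terminal expects UTF-8 (the message text is just an example):

```perl
use strict;
use warnings;
use utf8;                  # string literals below are decoded to characters
use Encode qw(encode);

my $text   = "警告";                    # perl character string, 2 characters
my $octets = encode('UTF-8', $text);    # 6 octets, ready to leave the program

# No binmode layer on STDERR here: the handle receives plain octets.
print STDERR $octets, "\n";
```

The point of doing it this way is that there is exactly one place where the conversion happens, so nothing can be encoded twice by accident.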

I've never seen $! containing non-English text, because most of the systems I work with don't have anything but English installed, so I don't know what form the text takes there. But if it is just a sequence of octets, then you'll have a problem outputting it through a filehandle that expects a perl string rather than a sequence of octets.
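If that turns out to be the case, one possible workaround is to decode the message before it reaches such a filehandle. This is only a sketch: message_as_characters is a made-up helper name, the path is a dummy, and it assumes that any octets in $! are UTF-8 (as they probably would be under a locale like ja_JP.UTF-8). What $! actually contains may depend on the perl version and platform.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: return a message as a perl character string.
# Assumption: if the message is octets, those octets are UTF-8.
sub message_as_characters {
    my ($message) = @_;
    return $message if utf8::is_utf8($message);   # already characters
    return decode('UTF-8', $message);             # octets -> characters
}

# STDERR now expects perl character strings and encodes them itself.
binmode(STDERR, ':encoding(UTF-8)');

# Dummy path, used only to provoke an error.
open(my $fh, '<', '/nonexistent/path/for/demo') or do {
    my $error = message_as_characters("$!");
    warn "open failed: $error\n";
};
```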

Re^2: How to handle encoding for STDERR
by na (Novice) on Feb 12, 2013 at 13:52 UTC

Sorry for the poor writing.

As you guessed, "<UTF-8 string>" is a UTF-8 encoded string in Japanese, and it is output fine (because of the combination of "use utf8" and "binmode(STDERR, ':utf8')"). It may depend on the OS and locale, but with the "ja_JP.UTF-8" locale on Ubuntu, Perl generates language-specific error messages.

I just want to know how to get the error string in the C locale even if the Perl process is started in a non-C locale.