in reply to UTF-8 trouble moving from perl 5.8.5 to 5.10.1

Some debugging tells me that my raw strings (which come from a variety of sources) don't have the utf8-flag set. When I run them through Encode::decode_utf8 or utf8::decode, I get broken UTF8. When I do a utf8::upgrade, they come out just fine.

Which means they are stored as Latin-1, not UTF-8. The proper solution is to run them through Encode::decode('ISO-8859-1', $yourstring), or to recode them to UTF-8 in the storage location.
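
To illustrate the difference, here is a minimal sketch. The sample string is made up (it contains the Latin-1 byte 0xE4, "ä"); substitute your own data.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical sample: raw bytes in Latin-1, no UTF-8 flag set.
my $raw = "Z\xE4hler";

# Decoding as ISO-8859-1 maps each byte to the code point of the same value.
my $text = decode('ISO-8859-1', $raw);

# Decoding the same bytes as UTF-8 treats 0xE4 as malformed input;
# with the default CHECK, Encode substitutes U+FFFD for it.
my $broken = decode('UTF-8', $raw);

printf "as Latin-1: U+%04X\n", ord(substr($text,   1, 1));
printf "as UTF-8:   U+%04X\n", ord(substr($broken, 1, 1));
```

The byte 0xE4 followed by an ASCII letter is not a valid UTF-8 sequence, which would explain the "broken UTF8" from decode_utf8.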

Re^2: UTF-8 trouble moving from perl 5.8.5 to 5.10.1
by manni (Novice) on Sep 11, 2011 at 15:14 UTC

    Thank you all for your input.

    After a little more debugging, I have found the silver bullet: it seems that all that was missing was a single line.

    We use Encode::decode everywhere we should, so we're fine in that department. But we never told Perl how we would like our output encoded.

    All that was missing was:

    binmode STDOUT, ':utf8';
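
    In context, that line might be used as in the following sketch. The sample string is made up for illustration.

    ```perl
    use strict;
    use warnings;
    use Encode qw(decode encode);

    # Hypothetical sample: a decoded character string containing U+00E4.
    my $text = decode('ISO-8859-1', "Z\xE4hler\n");

    # Tell Perl to encode characters as UTF-8 when writing to STDOUT.
    # Without this layer, U+00E4 would be written as the single byte 0xE4.
    binmode STDOUT, ':utf8';
    print $text;

    # For comparison: an explicit encode turns U+00E4 into the bytes 0xC3 0xA4.
    my $bytes = encode('UTF-8', $text);
    ```

    The stricter ':encoding(UTF-8)' layer can be used in place of ':utf8' if validation matters.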

    I guess the question was not why Unicode was broken on the new system, but rather why it worked on the old one.

      I guess the question was not why Unicode was broken on the new system, but rather why it worked on the old one.

      Probably locale related, i.e. export LC_CTYPE=de_DE.UTF-8 or some such,

      or set via the -C switch (see perlrun).
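
      One way to check this on both systems is to print what Perl itself reports. This is only a diagnostic sketch; it assumes nothing about your setup and writes to STDERR so it does not disturb normal output.

      ```perl
      use strict;
      use warnings;

      # ${^UNICODE} reflects the settings of the -C switch / PERL_UNICODE.
      printf STDERR "\${^UNICODE} = %d\n", ${^UNICODE};

      # List the PerlIO layers currently applied to the output side of STDOUT.
      # A 'utf8' entry means output is already being encoded as UTF-8.
      my @layers = PerlIO::get_layers(\*STDOUT, output => 1);
      print STDERR "STDOUT layers: @layers\n";
      ```

      Running it under the old and the new perl should show whether a UTF-8 layer was being applied implicitly on the old system.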