in reply to Need help with binary mode and UTF-8 characters

Have a gander at perlunitut (the Perl Unicode tutorial).

I suspect your problem is that you need to tell Perl that your input is UTF-8 encoded, so that it decodes the bytes into characters.

eg.
    use Encode qw(encode decode);

    while ( my $readline = <$fh> ) {
        my $foo = decode('UTF-8', $readline);
    }
etc. I could be way off though...

Update:
Just realised... Unless you actually mean to change the encoding of your data from UTF-8 into something else, you'll need to re-encode it into UTF-8 after you've finished processing it in Perl.

    use Encode qw(encode decode);

    while ( my $readline = <$fh> ) {
        my $foo = decode('UTF-8', $readline);
        # do stuff here
        $foo = encode('UTF-8', $foo);
        print {$output_fh} $foo;
    }
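
For anyone who wants to try the decode/process/encode round trip without real files, here is a minimal self-contained sketch. The sample data, the in-memory filehandles, and the use of uc as the "do stuff" step are my own assumptions for illustration, not anything from the original question.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical sample data: the raw UTF-8 bytes for "café\n"
# (the e-acute is the two bytes C3 A9).
my $input_bytes  = "caf\xC3\xA9\n";
my $output_bytes = '';

# In-memory filehandles stand in for the real input and output files.
open my $fh,        '<', \$input_bytes  or die "Cannot open input: $!";
open my $output_fh, '>', \$output_bytes or die "Cannot open output: $!";

while ( my $readline = <$fh> ) {
    my $foo = decode('UTF-8', $readline);   # bytes -> characters
    $foo = uc $foo;                         # character-level processing
    $foo = encode('UTF-8', $foo);           # characters -> bytes
    print {$output_fh} $foo;
}

close $output_fh or die "Cannot close output: $!";
close $fh        or die "Cannot close input: $!";
```

Because uc runs on the decoded string, the accented letter is uppercased as a single character and then written back out as valid UTF-8. Run on the raw bytes instead, uc would leave the C3 A9 pair untouched.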

Re^2: Need help with binary mode and UTF-8 characters
by philrennert1 (Novice) on Dec 10, 2009 at 14:04 UTC
    Okay, thanks for the pointer: that tutorial did it. Yes, I wasn't telling Perl about the Unicode. Adding
    binmode(IN, ':encoding(UTF-8)'); and binmode(OUT, ':encoding(UTF-8)');
    immediately after opening these files did it.
      Actually, you were. The "decode" did the same thing as the first binmode.
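
A small sketch of that equivalence, assuming well-formed UTF-8 input (the two routes may differ in how they report malformed bytes). The sample string and the in-memory filehandles are hypothetical stand-ins for the real files.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical sample data: the raw UTF-8 bytes for "naïve\n".
my $bytes = "na\xC3\xAFve\n";

# Route 1: read the raw bytes, then decode explicitly.
open my $raw_fh, '<', \$bytes or die "Cannot open: $!";
my $raw_line   = <$raw_fh>;
my $via_decode = decode('UTF-8', $raw_line);
close $raw_fh;

# Route 2: let the I/O layer decode while reading
# (the same layer that binmode would apply to an already-open handle).
open my $layer_fh, '<:encoding(UTF-8)', \$bytes or die "Cannot open: $!";
my $via_layer = <$layer_fh>;
close $layer_fh;

print "Both routes give the same string\n" if $via_decode eq $via_layer;
print length($via_decode), " characters from ", length($bytes), " bytes\n";
```

The practical difference is on the output side: with explicit decode you also have to encode before printing, whereas an :encoding(UTF-8) layer on the output handle does that for you.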