Re: Need help with binary mode and UTF-8 characters

Peeking at the text before you decode it doesn't make much sense.

Calling chomp twice makes no sense.

Calling chomp at all doesn't make sense if you don't add back a newline.

That said, neither of those problems should give you garbage. Maybe the file wasn't in UTF-8? Maybe your viewer isn't treating the file as UTF-8? Maybe the problem is in how you open IN?
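
One way to test the first of those guesses is to decode the raw bytes strictly and see whether that fails. This is only a sketch: is_valid_utf8 is a made-up helper name, and the two sample strings stand in for bytes you would read from the real file with a plain '<:raw' open.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: true if $bytes is well-formed UTF-8.
sub is_valid_utf8 {
    my ($bytes) = @_;
    # FB_CROAK makes decode die on malformed input;
    # LEAVE_SRC stops decode from modifying $bytes in place.
    my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();
    return eval { decode('UTF-8', $bytes, $check); 1 } ? 1 : 0;
}

my $utf8_bytes   = "caf\xC3\xA9";  # "cafe" with e-acute, encoded as UTF-8
my $latin1_bytes = "caf\xE9";      # same text, encoded as ISO-8859-1

print is_valid_utf8($utf8_bytes)   ? "valid UTF-8\n" : "not UTF-8\n";
print is_valid_utf8($latin1_bytes) ? "valid UTF-8\n" : "not UTF-8\n";
```

If the file fails this check, the garbage comes from the input encoding and not from the chomp calls.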

Cleaned up code:

open(my $fh_in,  '<:encoding(UTF-8)', 'C:\\data.in')  or die $!;
open(my $fh_out, '>:encoding(UTF-8)', 'C:\\data.out') or die $!;
while (<$fh_in>) {
    chomp;
    # ...
    print $fh_out "$_\n";
}

Re^2: Need help with binary mode and UTF-8 characters by philrennert1 (Novice) on Dec 10, 2009 at 14:08 UTC
Thanks. Yes, chomping twice didn't make sense. The point was to string a lot of lines together into one, so chomping once and then adding \n at the end did. Yes, I needed to add `binmode(IN, ':encoding(UTF-8)');` and `binmode(OUT, ':encoding(UTF-8)');`.
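
For anyone following along, here is a rough sketch of that binmode fix applied to handles that are already open, with the lines joined into one. The temp directory, file names, and sample text are assumptions for illustration, not the original data.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch files standing in for the real input and output.
my $dir = tempdir(CLEANUP => 1);
my ($in_path, $out_path) = ("$dir/data.in", "$dir/data.out");

# Seed the input with two UTF-8 lines containing non-ASCII characters.
open(my $seed, '>:encoding(UTF-8)', $in_path) or die $!;
print $seed "caf\x{e9}\n", "na\x{ef}ve\n";
close($seed) or die $!;

# Open without layers, then add the encoding layers with binmode.
open(IN,  '<', $in_path)  or die $!;
open(OUT, '>', $out_path) or die $!;
binmode(IN,  ':encoding(UTF-8)');
binmode(OUT, ':encoding(UTF-8)');

# Chomp each line once, join them, and add a single newline at the end.
my $joined = '';
while (my $line = <IN>) {
    chomp $line;
    $joined .= $line;
}
print OUT "$joined\n";

close(IN);
close(OUT) or die $!;
```

Passing the layer to open directly, as in the cleaned-up code above, does the same thing in one step and leaves no window in which the handle reads undecoded bytes.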