Re: different utf8 method = different behaviour?

If you use binmode, the problem goes away. For example:

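#!/usr/bin/perl

use strict;
use warnings;

# Encode everything printed to STDOUT as UTF-8.
binmode STDOUT, ':encoding(utf8)';

my $file = '/root/Desktop/russian';

# First read: explicit :utf8 layer, which does not validate the input.
open my $fh1, '<:utf8', $file or die "can not open $file: $!";
my @data1 = <$fh1>;
close $fh1;

# Second read: default layer set by the open pragma, which is lexically
# scoped and only affects the open calls that follow it.
use open ':encoding(utf8)';
open my $fh2, '<', $file or die "can not open $file: $!";
my @data2 = <$fh2>;
close $fh2;

die "different size" if @data1 != @data2;
while (@data1) {
    my $s1 = shift @data1;
    my $s2 = shift @data2;
    print "1: $s1\n2: $s2\n";
    die "different data" if $s1 ne $s2;
}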

Re^2: different utf8 method = different behaviour? by erwan (Sexton) on May 01, 2011 at 14:45 UTC
Actually... no. With my data I still get the same output (the program still dies with the "different data" message).
Re^3: different utf8 method = different behaviour? by roboticus (Chancellor) on May 01, 2011 at 15:08 UTC
erwan: What's the difference between @data1 & @data2? Perhaps comparing the hexdump of the values may yield a clue or two.

...roboticus

When your only tool is a hammer, all problems look like your thumb.
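
One way to follow the hexdump suggestion is to print the code points of the two strings side by side and locate the first position where they disagree. This is only a minimal sketch: the helpers code_points and first_diff are hypothetical names, and the two sample strings are made up for illustration, not taken from the poster's file.

```perl
use strict;
use warnings;

# Hypothetical helper: render every character of a string as U+XXXX.
sub code_points {
    my ($s) = @_;
    return join ' ', map { sprintf 'U+%04X', ord($_) } split //, $s;
}

# Hypothetical helper: index of the first differing character,
# or -1 when the two strings are identical.
sub first_diff {
    my ($s1, $s2) = @_;
    my $max = length($s1) < length($s2) ? length($s1) : length($s2);
    for my $i (0 .. $max - 1) {
        return $i if substr($s1, $i, 1) ne substr($s2, $i, 1);
    }
    return length($s1) == length($s2) ? -1 : $max;
}

# Made-up sample values standing in for one line from @data1 and @data2.
my $s1 = "\x{416}\x{E9}";
my $s2 = "\x{416}\\xE9";

print "1: ", code_points($s1), "\n";
print "2: ", code_points($s2), "\n";
print "first difference at index ", first_diff($s1, $s2), "\n";
```

In the comparison loop above, calling these helpers just before the "different data" die would show exactly which characters each opening method produced.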
Re^4: different utf8 method = different behaviour? by erwan (Sexton) on May 01, 2011 at 16:37 UTC
Thanks for the idea, but in this case that's not useful: I know that my data is dirty. The goal is not to correct it but to understand why the result differs depending on the opening method.
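
A possible explanation, offered as a sketch and not as a confirmed diagnosis of the original file: the :utf8 layer only marks the incoming bytes as UTF-8 without validating them, while :encoding(utf8) decodes through Encode, which by default substitutes an escape for bytes it cannot decode. The input below is an assumption (a lone 0xE9 byte, which is not valid UTF-8 on its own), and read_with_layer is a hypothetical helper. Both reads are expected to emit warnings about the malformed byte.

```perl
use strict;
use warnings;

# Assumed dirty input: a lone 0xE9 byte followed by plain ASCII.
my $bytes = "\xE9abc\n";

# Hypothetical helper: read the first line of an in-memory file
# through the given PerlIO layer.
sub read_with_layer {
    my ($layer, $raw) = @_;
    open my $fh, "<$layer", \$raw or die "cannot open in-memory file: $!";
    my $line = <$fh>;
    close $fh;
    return $line;
}

my $lax     = read_with_layer(':utf8', $bytes);            # no validation
my $decoded = read_with_layer(':encoding(utf8)', $bytes);  # decoded via Encode

print "results ", ($lax eq $decoded ? "match" : "differ"), "\n";
```

On clean UTF-8 input the two layers should return the same strings, which would be consistent with the difference showing up only on dirty data.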