You should probably avoid treating the values from your files as numbers.
It is possible, even likely, that converting these values into Perl's internal numeric format would change them. For example, the values in the files may have been truncated (or rounded, using one of several different rounding algorithms) from single precision floats. If you then re-interpret them as double precision floats, you will introduce differences that are not there in the original files, or discard differences that are there.
The script below treats the fields as strings (as human eyes do) until the final decision about the last digits, which are compared as (integer) numerics. It also takes every opportunity to bail out as soon as a difference is found.
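To make that concrete, here is a minimal sketch, assuming a perl built with ordinary double precision NVs. The sample values are made up for illustration; they are not taken from your files. It uses pack/unpack with the 'f' template to round a value to single precision and widen it again, and a numeric comparison to show two distinct strings collapsing to the same double.

```perl
use strict;
use warnings;

# Illustrative values only; not taken from the original files.
my $text = '0.1';

# pack 'f' rounds to a native single precision float;
# unpack 'f' widens the result back to perl's NV (assumed to be a double).
my $single = unpack 'f', pack 'f', $text;

printf "%.17g\n", $text + 0;    # 0.10000000000000001 with double NVs
printf "%.17g\n", $single;      # 0.10000000149011612 with double NVs

# A difference that is not in the text: same digits, different numbers.
print "single and double versions of $text differ\n"
    if $single != $text;

# A difference that is in the text, but is lost on conversion.
print "strings differ, numbers do not\n"
    if  '1.0000000000000001' ne '1'
    and '1.0000000000000001' == '1';
```

Both messages print on such a build, which is why a plain numeric comparison cannot be trusted for this job.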
#! perl -slw
use strict;

## Bail out early: a last-digit difference never changes the file size.
die "Files differ in length"
    unless -s( $ARGV[ 0 ] ) == -s( $ARGV[ 1 ] );

open FH1, '<', $ARGV[ 0 ] or die $!;
open FH2, '<', $ARGV[ 1 ] or die $!;

#my $mismatch = 0;
until( eof( FH1 ) || eof( FH2 ) ) {
    my $line1 = <FH1>;
    my $line2 = <FH2>;
    next if $line1 eq $line2;    ## identical lines need no parsing

    my @line1 = split ' ', $line1;
    my @line2 = split ' ', $line2;
    for ( 0 .. $#line1 ) {
        next if $line1[ $_ ] eq $line2[ $_ ];
        ## chop removes and returns the last digit of each field;
        ## what remains of the two fields must then match exactly.
        next if abs( chop( $line1[ $_ ] ) - chop( $line2[ $_ ] ) ) < 2
            and $line1[ $_ ] eq $line2[ $_ ];
        die "Files differ at line: $. field: $_\n";
        #$mismatch = 1;
    }
}
#die "Files are different\n" if $mismatch;
die "Files have different numbers of lines\n"
    unless eof( FH1 ) and eof( FH2 );
print "Files are the same\n"; ### Or "files are sufficiently similar"
close FH1;
close FH2;
Change the die inside the loop ("Files differ at line: ...") to warn, and uncomment the three $mismatch lines, if you want a comprehensive list of the differences.
Output for test files:
C:\test>828506 file1 file2
Files are the same
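If you want to try the per-field test on its own, the following is a minimal sketch of it pulled out into a sub. The name fields_match and the sample values are hypothetical, and the check that both chopped characters are digits is an addition that the script above does not make.

```perl
use strict;
use warnings;

# Hypothetical helper: true if two fields are identical, or identical
# except for final digits that differ by no more than 1.
sub fields_match {
    my( $f1, $f2 ) = @_;
    return 1 if $f1 eq $f2;

    # chop removes the last character of its argument and returns it.
    my $d1 = chop $f1;
    my $d2 = chop $f2;

    # Added guard (not in the script above): only compare real digits.
    return 0 unless $d1 =~ /^\d$/ and $d2 =~ /^\d$/;

    # Last digits within 1 of each other, and everything before them equal.
    return abs( $d1 - $d2 ) < 2 && $f1 eq $f2 ? 1 : 0;
}

print fields_match( '1.2345', '1.2346' ) ? "match\n" : "differ\n";   # match
print fields_match( '1.2345', '1.2347' ) ? "match\n" : "differ\n";   # differ
print fields_match( '1.2349', '1.2350' ) ? "match\n" : "differ\n";   # differ
```

The last case is worth knowing about: a rounding difference that carries into the preceding digit is reported as a mismatch, because only the final character is compared numerically.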