Just a one-shot, throwaway script I quickly cooked up yesterday, which I... don't feel like throwing away altogether: perhaps it will be instructive for others, although I warn in advance that it would need quite some massaging for generic use.

The problem: two big files of exactly the same size, which would definitely seem to be the same file but in fact are not, as a checksum on each shows. I want to show the actual differences:

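#!/usr/bin/perl

use strict;
use warnings;

$|++;    # unbuffered output, so the "\r" progress line updates in place

die "Usage: $0 file1 file2\n" unless @ARGV == 2;

my ($f1, $f2) = map {
    open my $fh, '<:raw', $_
        or die "Can't open `$_': $!\n";
    $fh;
} @ARGV;

$/ = \0x100_000;    # read fixed-size records of 1 MiB each

while (my $s1 = <$f1>) {
    defined +(my $s2 = <$f2>) or last;
    printf "Block %04d: no differences\r", $. and next if $s1 eq $s2;

    # XOR the blocks: equal bytes become \0. A = leading equal bytes,
    # B = span from the first to the last differing byte, C = trailing equal bytes.
    # /s lets "." match "\n" (0x0a) bytes inside the differing span.
    my @l = map length, ($s1 ^ $s2) =~ /^(\0*)(.+?)(\0*)\z/s;
    printf "\nBlock %04d: A=[0 x %d], B=[* x %d], C=[0 x %d]\n", $., @l;
    print "\tB=[@{[ unpack '(H2)*' => substr $_, $l[0], $l[1] ]}]\n"
        for $s1, $s2;
}

__END__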

Funny detail: not only did the two files end up differing by one single byte, but more precisely, by one single bit:

C:\temp>perl compare.pl foo bar
Block 0727: no differences
Block 0728: A=[0 x 759818], B=[* x 1], C=[0 x 288757]
        B=[e7]
        B=[ef]
Block 0973: no differences
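
As a quick sanity check of the single-bit claim, here is a minimal sketch that XORs the two byte values reported in the output above and counts the set bits in the result. The variable names are mine and not part of the script; it only assumes the values e7 and ef as printed.

```perl
use strict;
use warnings;

# Byte values taken from the two B=[..] lines of the output.
my ($b1, $b2) = (0xe7, 0xef);

# Numeric XOR: every set bit marks a position where the bytes differ.
my $diff = $b1 ^ $b2;

# Render as 8 binary digits and count the 1s (tr with an empty
# replacement list only counts, it does not modify the string).
my $binary = sprintf '%08b', $diff;
my $bits   = ($binary =~ tr/1//);

printf "xor=%s, differing bits=%d\n", $binary, $bits;
# prints "xor=00001000, differing bits=1"
```

So the two files differ only in bit 3 (value 0x08) of that one byte.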
--
If you can't understand the incipit, then please check the IPB Campaign.

In reply to Finding differences in binary files by blazar
