
The answer to question 1 is easy. Load both files as scalars and eq will tell you if they are identical.

#! perl -slw
use strict;

die "usage: $0 binfile1 binfile2" unless @ARGV == 2;

open my $f1, '< :raw', $ARGV[ 0 ] or die "Couldn't open $ARGV[ 0 ]: $!";
open my $f2, '< :raw', $ARGV[ 1 ] or die "Couldn't open $ARGV[ 1 ]: $!";

my( $d1, $d2 );

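# sysread returns 0 for an empty file, so test for undef rather than false
defined( sysread( $f1, $d1, -s $ARGV[ 0 ] ) ) or die "Couldn't read $ARGV[ 0 ]: $!";
defined( sysread( $f2, $d2, -s $ARGV[ 1 ] ) ) or die "Couldn't read $ARGV[ 1 ]: $!";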

close( $f1 );
close( $f2 );

print "$ARGV[ 0 ] and $ARGV[ 1 ] are ", $d1 eq $d2 ? 'the same' : 'different';

__END__
P:\test>320353 fox1.jpg fox1.jpg
fox1.jpg and fox1.jpg are the same

P:\test>320353 fox1.jpg fox2.jpg
fox1.jpg and fox2.jpg are different
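If the files are too big to slurp comfortably, the same yes/no answer can be had without holding both in memory. A minimal sketch, assuming the core File::Compare module is acceptable; the sub name files_identical is just something I made up for illustration:

```perl
use strict;
use warnings;
use File::Compare qw(compare);

# Returns 1 if the two files have identical contents, 0 if they differ.
# compare() returns 0 for equal, 1 for different and -1 on error.
sub files_identical {
    my ( $file1, $file2 ) = @_;

    my $result = compare( $file1, $file2 );
    die "Couldn't compare $file1 and $file2: $!" if $result < 0;

    return $result == 0 ? 1 : 0;
}
```

Called as files_identical( 'fox1.jpg', 'fox2.jpg' ), it should give the same verdict as the eq test above, just read in chunks.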

The answer to question 2 is either relatively trivial, just requiring large amounts of processor power, or much, much harder, depending upon whether the registration between the two images is accurate.

If the two images are accurately aligned, then you could load the images using GD and that will allow you to perform your distance algorithm quite easily (if rather slowly).

If the two images are even 1 pixel out of alignment, the problem becomes 9x harder (and slower), because you have to repeat the comparison at every offset in a 3x3 window. Allow for 2 pixels of misalignment and it gets 25x harder (5x5), 3 pixels and it gets 49x harder (7x7), and so on: (2n+1)^2 comparisons for n pixels.
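To make the cost of misalignment concrete, here is a rough brute-force sketch. It assumes both images are the same size and are already in memory as array refs of rows of grey values, and it uses mean absolute difference as the distance; that layout, the metric and the sub names are my own choices for illustration, not anything from the original question.

```perl
use strict;
use warnings;

# Mean absolute difference over the region where $img1 and $img2 overlap
# when $img2 is sampled at an offset of ($dx, $dy).
# Both images are array refs of rows of grey values with equal dimensions.
sub mean_abs_diff {
    my ( $img1, $img2, $dx, $dy ) = @_;
    my ( $h, $w ) = ( scalar @$img1, scalar @{ $img1->[0] } );
    my ( $sum, $count ) = ( 0, 0 );
    for my $y ( 0 .. $h - 1 ) {
        my $y2 = $y + $dy;
        next if $y2 < 0 or $y2 >= $h;
        for my $x ( 0 .. $w - 1 ) {
            my $x2 = $x + $dx;
            next if $x2 < 0 or $x2 >= $w;
            $sum += abs( $img1->[$y][$x] - $img2->[$y2][$x2] );
            $count++;
        }
    }
    return $count ? $sum / $count : undef;
}

# Try every offset within +/- $max_shift pixels in both directions.
# That is (2 * $max_shift + 1) ** 2 full comparisons.
sub best_offset {
    my ( $img1, $img2, $max_shift ) = @_;
    my ( $best, $tried ) = ( undef, 0 );
    for my $dy ( -$max_shift .. $max_shift ) {
        for my $dx ( -$max_shift .. $max_shift ) {
            $tried++;
            my $d = mean_abs_diff( $img1, $img2, $dx, $dy );
            next unless defined $d;
            $best = { dx => $dx, dy => $dy, distance => $d }
                if !defined $best or $d < $best->{distance};
        }
    }
    return ( $best, $tried );
}
```

best_offset returns the best offset found and the number of offsets tried, so you can see the 9, 25, 49 progression directly. In pure Perl this will crawl on anything the size of a photograph.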

If you intend to do this in Perl, you would probably be better off converting the JPEGs to a raw file format (no headers, no compression, just 3 (or 4) bytes per pixel in a contiguous stream), then loading them up and using something like PDL, which will allow you to perform the math in C.
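For accurately aligned images in that raw layout, the distance itself is only a few lines. This is a plain-Perl sketch, assuming two headerless buffers of equal length and, again, mean absolute per-channel difference as the metric (my assumption; substitute whatever distance you actually need). The sub name raw_distance is hypothetical.

```perl
use strict;
use warnings;

# Mean absolute per-channel difference between two headerless pixel
# buffers of equal length (e.g. 3 bytes per pixel: R, G, B, no padding).
sub raw_distance {
    my ( $raw1, $raw2 ) = @_;
    die "Buffers differ in length\n"
        unless length( $raw1 ) == length( $raw2 );
    return 0 unless length( $raw1 );

    # One unsigned byte per channel value.
    my @a = unpack 'C*', $raw1;
    my @b = unpack 'C*', $raw2;

    my $sum = 0;
    $sum += abs( $a[$_] - $b[$_] ) for 0 .. $#a;

    return $sum / @a;
}
```

Unpacking every byte into a Perl list costs a lot of memory and time on large images, which is exactly the per-element loop that something like PDL would push down into C.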

Have fun:

Examine what is said, not who speaks.

"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
Timing (and a little luck) are everything!

In reply to Re: Matching Binary Files by BrowserUk
in thread Matching Binary Files by Itatsumaki
