
if no one has anything similar to this already.

Nothing I've seen, so go for it.

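My suggestion would be to use Math::Random::MT as the PRNG. It is a Mersenne Twister implementation, so a given seed produces the same sequence on every platform, which is what makes the fingerprint reproducible.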
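To illustrate the reproducibility point, here is a minimal sketch using the module's object interface. The helper name sequenceFor and the seed 12345 are arbitrary choices for illustration only, not part of the suggestion above:

```perl
use strict;
use warnings;
use Math::Random::MT;

## Hypothetical helper: return the first $n integers below $limit
## produced by a Mersenne Twister generator seeded with $seed.
sub sequenceFor {
    my( $seed, $n, $limit ) = @_;
    my $gen = Math::Random::MT->new( $seed );
    return map int( $gen->rand( $limit ) ), 1 .. $n;
}

my @first  = sequenceFor( 12345, 5, 1000 );
my @second = sequenceFor( 12345, 5, 1000 );

## Same seed, so both lines should print the same numbers.
print "@first\n@second\n";
```

Using a private generator object rather than the exported rand/srand also avoids disturbing any other code that relies on the global generator.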

Then something like:

use strict;
use warnings;
use Math::Random::MT qw[ rand srand ];
use Digest::CRC qw[ crc64 ];

sub fingerPrintFile {
    my $file = shift;
    my $filesize = -s( $file );

    ## Seed with the file size so the same file always yields the same positions.
    srand $filesize;

    open my $fh, '<', $file or die $!;
    binmode $fh;    ## video data is binary; avoid newline translation

    ## assuming CRC-64
    my $chunks = int( $filesize / 8 ) - 1;

    ## Added sort per RichardK's suggestion below.
    my @posns = sort { $a <=> $b } map 8 * int( rand $chunks ), 1 .. 100;

    ## Read one 8-byte chunk at each sampled position and concatenate them.
    my $rawSample = join '', map {
        seek $fh, $_, 0;
        read( $fh, my $chunk, 8 );
        $chunk
    } @posns;
    close $fh;

    return crc64( $rawSample );
}

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?

Re^4: high speed checksum for video finger printing?
by RichardK (Parson) on Feb 05, 2012 at 13:51 UTC

    I think I would sort the chunk positions first, so that you read the file in only one direction. As you've only got a small number of blocks, the sort won't be costly, and two blocks that are close together may then fall in the same read-ahead window.

    It just might improve performance on some file systems/OSes.

      I think I would sort the chunk positions first, then you would read the file in only one direction.

      Agreed. That would be a simple and effective optimisation. I'll add it above.
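As a rough sketch of the position-picking step on its own, with the sort applied so that seeks only ever move forward: the helper name samplePositions and the guard for very small files are my assumptions, not something settled in this thread.

```perl
use strict;
use warnings;
use Math::Random::MT;

## Hypothetical helper: pick $count 8-byte-aligned offsets for a file of
## $filesize bytes, seeded by the size, returned in ascending order.
sub samplePositions {
    my( $filesize, $count ) = @_;
    my $chunks = int( $filesize / 8 ) - 1;
    return if $chunks < 1;    ## assumption: file too small to sample

    my $gen = Math::Random::MT->new( $filesize );
    return sort { $a <=> $b }
           map 8 * int( $gen->rand( $chunks ) ), 1 .. $count;
}

my @posns = samplePositions( 1_000_000, 100 );
print "first: $posns[0], last: $posns[-1]\n";
```

Call it in list context. Because the offsets come back sorted, the subsequent seek/read loop walks the file front to back.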

