in reply to Re^2: Binary Comparision
in thread Binary Comparision
There are no differences[*] between text and binary files except how you open them. Your plan would fail for text too. Consider trying to match "def\nghi" in a file whose content is "abcdef\nghijkl". You have the same problem whether the file is text (lines) or binary (blocks). The problem you really have is not text vs binary. If you solve this problem for text files, you also solve it for binary files.
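As a minimal sketch of why a line-by-line search fails on that example, the following uses an in-memory filehandle holding the sample content. The variable names are made up for illustration.

```perl
use strict;
use warnings;

# Sample data taken from the example above.
my $content = "abcdef\nghijkl";
my $needle  = "def\nghi";

# Open an in-memory filehandle so the sketch needs no real file.
open(my $fh, '<', \$content) or die "Can't open in-memory file: $!";

# Line-by-line search: each read ends at "\n", so no single line
# ever contains the whole needle.
my $found_by_line = 0;
while (my $line = <$fh>) {
    $found_by_line = 1 if index($line, $needle) >= 0;
}
close($fh);

# Searching the whole content at once does find it.
my $found_in_whole = index($content, $needle) >= 0 ? 1 : 0;

print "line by line: $found_by_line, whole content: $found_in_whole\n";
```

The same thing happens with fixed-size blocks whenever the needle straddles a block boundary.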
If you know the length of the longest signature, you could use
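    my $longest_sig_len = ...;
    my $block_size = 4096;
    # Round up to a multiple of 1024 if the longest signature
    # doesn't fit in one block.
    $block_size = int(($longest_sig_len + 1023) / 1024) * 1024
        if $block_size < $longest_sig_len;

    local $/ = \$block_size;   # read fixed-size blocks instead of lines
    my $block = '';
    while (<$fh>) {
        # Keep the tail of the previous block so that a signature
        # straddling a block boundary is still found.
        $block = substr($block, -($longest_sig_len-1)) . $_;
        ... search for signature in $block ...
    }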
That's the approach I'd take if I were looking for one string. There are surely algorithms (Aho-Corasick comes to mind) that are more efficient at searching for several strings at once.
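For a single signature, the overlapping-block idea could be fleshed out roughly as follows. This is only a sketch: find_signature() is a hypothetical helper, not something from a module, and it simply raises the block size to the signature length rather than rounding up to a multiple of 1024.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the byte offset of the first
# occurrence of $sig in $fh, or -1 if it is not found.
sub find_signature {
    my ($fh, $sig, $block_size) = @_;
    my $sig_len = length $sig;
    return -1 if $sig_len == 0;
    $block_size = 4096     if !defined $block_size;
    $block_size = $sig_len if $block_size < $sig_len;

    binmode $fh;
    local $/ = \$block_size;   # read fixed-size records

    my $tail   = '';   # last ($sig_len - 1) bytes already searched
    my $offset = 0;    # file offset of the first byte of $tail
    while (defined(my $chunk = <$fh>)) {
        my $window = $tail . $chunk;
        my $pos = index($window, $sig);
        return $offset + $pos if $pos >= 0;

        # Carry over just enough bytes to catch a match that
        # straddles the boundary with the next block.
        my $keep = $sig_len - 1;
        $keep = length $window if $keep > length $window;
        $tail = $keep ? substr($window, -$keep) : '';
        $offset += length($window) - $keep;
    }
    return -1;
}

# Usage with the earlier example and a deliberately tiny block size.
my $data = "abcdef\nghijkl";
open(my $in, '<', \$data) or die "Can't open in-memory file: $!";
my $where = find_signature($in, "def\nghi", 7);
close($in);
print "found at offset $where\n";
```

Because the carried-over tail is one byte shorter than the signature, a match can never be reported twice.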
[*] You can even use while (<FILE>) on a binary file, but it might read more than you expect, since it reads up to the next newline byte. Setting $/ to a reference to a number (e.g. $/ = \1024; or $block_size = 1024; $/ = \$block_size;) makes it read fixed-size blocks instead.