in reply to searching for a binary string in a binary file

Since you're looking for the offset of an exact match, the index function is ideal for you. There's no need to use an escaped regex (which could be done with /\Q$letter\E/).
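
To make the comparison concrete, here is a minimal sketch of both approaches side by side. The contents of $word and $letter are made-up sample bytes (an assumption, not the original poster's data), chosen so that the search string contains regex metacharacters.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the slurped file and the needle.
my $word   = "\x00\x01RIFF\x0d\x0a.+data\xff";
my $letter = ".+data";    # contains regex metacharacters on purpose

# index: plain byte-string search, returns -1 when there is no match.
my $offset = index $word, $letter;

# Regex equivalent: \Q...\E escapes the metacharacters, and $-[0]
# holds the start offset of the last successful match.
my $re_offset = $word =~ /\Q$letter\E/ ? $-[0] : -1;

print "index: $offset, regex: $re_offset\n";
```

Both give the same offset; index simply gets there without the regex engine or the escaping.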

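Also, with a binary file, reading line by line with the diamond operator is chancy. You don't know where the audio data may contain a CRLF pair or a bare newline byte, so the records break at arbitrary points. You clearly want to slurp the whole file into $word, so undefine $/ to make the diamond operator do that.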

my $word = do { local $/; <WORDSOUND> };
my $offset = index $word, $letter;

If you need more sophisticated analysis, take a look at PDL.

After Compline,
Zaxo

Re^2: searching for a binary string in a binary file
by thor (Priest) on Dec 12, 2004 at 06:24 UTC
    You clearly want to slurp the whole file
    What if the file is hundreds of megs? Here's a little something I whipped up:
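    use strict;
    use warnings;

    my $buf_size = 16_384;
    # Three-arg open with :raw so no newline translation touches the bytes.
    open(my $big,   '<:raw', shift) or die "Can't open file to search: $!";
    open(my $small, '<:raw', shift) or die "Can't open search string file: $!";

    my $search_string;
    {
        local $/;
        $search_string = <$small>;
    }

    my $buffer = "";
    my $pos    = 0;    # file offset of the first byte of $buffer
    while (sysread($big, $buffer, $buf_size, length($buffer))) {
        if ((my $index = index($buffer, $search_string)) != -1) {
            print "FOUND! found the search string at position ", $pos + $index;
            exit;
        }
        # Keep the second half so a match straddling two reads is still seen.
        my $cut = int(length($buffer) / 2);
        $pos   += $cut;    # advance by the bytes actually discarded
        $buffer = substr($buffer, $cut);
    }
    print "search string not found";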

    I'll grant that this also falls prey to the same problem that yours has if the smaller file is itself large, but I think that in general this scales much better.
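
The chunked search above keeps half the buffer between reads. One possible variant, sketched here under assumptions, keeps only the last length(needle) - 1 bytes, which is the longest tail that could still begin a match. The sub name find_in_stream and its parameters are hypothetical, and it uses read rather than sysread so that it can be tried against an in-memory filehandle.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the byte offset of $needle in the stream
# read from $fh, or -1 if it never appears.
sub find_in_stream {
    my ($fh, $needle, $chunk_size) = @_;
    $chunk_size ||= 16_384;
    my $keep = length($needle) - 1;    # longest tail that could start a match
    $keep = 0 if $keep < 0;
    my $buffer = '';
    my $pos    = 0;                    # file offset of $buffer's first byte

    while (read($fh, $buffer, $chunk_size, length $buffer)) {
        my $i = index $buffer, $needle;
        return $pos + $i if $i != -1;

        # Discard everything except the last $keep bytes.
        my $drop = length($buffer) - $keep;
        if ($drop > 0) {
            substr($buffer, 0, $drop, '');
            $pos += $drop;
        }
    }
    return -1;
}

# Usage with an in-memory filehandle and a tiny chunk size, so that the
# match straddles a chunk boundary.
my $data = ("\x00" x 10) . "NEEDLE" . ("\xff" x 10);
open(my $fh, '<', \$data) or die "open: $!";
print find_in_stream($fh, "NEEDLE", 4), "\n";
```

Memory use is then bounded by the chunk size plus the needle length, though the needle itself still has to fit in memory.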

    thor

    Feel the white light, the light within
    Be your own disciple, fan the sparks of will
    For all of us waiting, your kingdom will come

Re^2: searching for a binary string in a binary file
by rochlin (Acolyte) on Dec 12, 2004 at 02:18 UTC