in reply to regular expression searching in binary files

Looks like the strings you are trying to match are UTF-16, but buried in a binary file. I'd recommend you use binmode on the file handle you are using to read the data, and then you can:

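    use warnings;
    use strict;
    use Encode;

    # Sample binary data with the UTF-16BE text "Author" buried in the middle
    my $binstr =
          "\x{00}\x{01}\x{02}\x{03}\x{04}\x{05}"
        . "\x{00}A\x{00}u\x{00}t\x{00}h\x{00}o\x{00}r\x{00}"
        . "\x{80}\x{90}\x{a0}\x{b0}\x{c0}\x{d0}\x{e0}";

    # Encode the search text into the same bytes it has in the file
    my $matchStr = encode ('UTF-16BE', 'Author');

    # \Q...\E quotes the bytes so none of them act as regex metacharacters
    if ($binstr =~ /(\Q$matchStr\E)/) {
        my $match = decode ('UTF-16BE', $1);
        print "Found $match\n";
    }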

Prints:

Found Author
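
For the file case, here is a minimal sketch of the same search run against data read through a binmode'd handle. The temporary file and its contents are made up for illustration (they stand in for your real binary file), and slurping the whole file assumes it fits comfortably in memory.

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use File::Temp qw(tempfile);

# Build a throwaway binary file so the sketch is self-contained.
# (Hypothetical data: three junk bytes, "Author" in UTF-16BE, three more.)
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "\x00\x01\x02", encode('UTF-16BE', 'Author'), "\x80\x90\xa0";
close $tmp or die "close failed: $!";

# Read it back as raw bytes; binmode prevents newline or layer translation.
open my $fh, '<', $path or die "Cannot open $path: $!";
binmode $fh;
my $data = do { local $/; <$fh> };    # slurp the whole file
close $fh;

my $needle = encode('UTF-16BE', 'Author');
my ($found, $offset);
if ($data =~ /(\Q$needle\E)/) {
    $offset = $-[1];                       # byte offset where the match starts
    $found  = decode('UTF-16BE', "$1");    # copy $1 before decoding
    print "Found $found at byte offset $offset\n";
}
```

With the made-up data above this should print "Found Author at byte offset 3".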

Note that this assumes big-endian byte order, which seems to match your example. The data could instead be little-endian, which is native for Windows systems; in that case use 'UTF-16LE' for both the encode and the decode.
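
If you don't know the byte order up front, one option is simply to try both. This is only a sketch: find_utf16 is a hypothetical helper name, the sample data is invented, and it uses index rather than a regex since the search text is a fixed string. Be aware that very short search strings can produce a hit in the wrong byte order at an odd offset, so treat a match as a hint to check rather than proof.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: returns (encoding name, byte offset) for the first
# byte order in which $text is found, or an empty list if neither matches.
sub find_utf16 {
    my ($bytes, $text) = @_;
    for my $enc ('UTF-16BE', 'UTF-16LE') {
        my $needle = encode($enc, $text);
        my $pos    = index($bytes, $needle);
        return ($enc, $pos) if $pos >= 0;
    }
    return;
}

# Invented sample: two junk bytes, "Author" in UTF-16LE, two more junk bytes.
my $le_data = "\x01\x02" . encode('UTF-16LE', 'Author') . "\xff\xfe";

my ($enc, $pos) = find_utf16($le_data, 'Author');
print "Found 'Author' as $enc at byte $pos\n" if defined $enc;
```

For the sample data this should report UTF-16LE at byte 2.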


DWIM is Perl's answer to Gödel