Thanks Roboticus.
I think I might have been unclear: what I want to avoid is reading line by line, because my code is very slow. I assume that's because of the while (my $buff = $z->getline()) loop, but feel free to correct me on this. With this structure my program takes a solid minute to run, whereas the shell script that does the same thing completes in a second or two.
Maybe I could call bzcat from the system, and store its output in a variable? But I'm still not sure how to use while (<>) inside a full Perl program, when I'm not reading in from a pipe.
Cheers,
JW.
Comment on Re^2: Searching large files a block at a time
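Bunzip2's getline works just like <>, including honoring $/: set $/ = "" for paragraph mode, or $/ = "\n\n" to split on exactly one blank line. It doesn't seem to be all that well optimized, though, so reading from an external bzcat through a pipe may be faster. You might try this: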
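# List-form pipe open: bzcat decompresses, Perl reads its stdout (no shell involved).
open(my $BZ, '-|', 'bzcat', $file) or die "Cannot run bzcat on $file: $!";
while (<$BZ>) { ... }
close($BZ) or warn "bzcat exited abnormally on $file\n";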
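If you want whole blocks rather than single lines, the pipe and the $/ trick can be combined. What follows is only a rough sketch, not a tuned solution: for_each_record is a made-up helper name, the callback interface is my own assumption about how you'd want to process each block, and it assumes a Unix-like system with bzcat on the PATH. Records are handed to the callback one at a time, so the whole file never has to sit in memory.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): run a command, read its output
# one record at a time, and pass each record to a callback.
#   $cmd      - array ref: program name followed by its arguments
#   $sep      - value for $/ ("" = paragraph mode, "\n" = line mode)
#   $callback - code ref called with each chomped record
# Returns the number of records seen.
sub for_each_record {
    my ($cmd, $sep, $callback) = @_;

    # List-form pipe open: the command is run directly, without a shell,
    # so unusual characters in a file name cannot be misinterpreted.
    open(my $fh, '-|', @$cmd) or die "Cannot run $cmd->[0]: $!";

    # Localized so the caller's $/ is restored on return. Note that the
    # callback also sees this value while it runs.
    local $/ = $sep;

    my $count = 0;
    while (defined(my $rec = <$fh>)) {
        chomp $rec;    # in paragraph mode this strips all trailing newlines
        $callback->($rec);
        $count++;
    }

    # For a pipe, close() fails if the child exited with a non-zero status.
    close($fh)
        or die "$cmd->[0] failed: " . ($! ? "$!" : "exit status " . ($? >> 8));
    return $count;
}

# Usage, assuming a .bz2 file name is given as the first argument:
if (@ARGV) {
    my $matches = 0;
    my $total = for_each_record(
        ['bzcat', $ARGV[0]], "",
        sub { $matches++ if $_[0] =~ /ERROR/ },    # /ERROR/ is a placeholder pattern
    );
    print "$matches of $total paragraphs matched\n";
}
```

Whether this beats getline on your data is something you'd have to time yourself; the point is just that <$fh> on a pipe handle behaves like any other filehandle read, so nothing special is needed inside a "full" Perl program.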