From: "John W. Krahn"
Message-ID: <43UOh.28203$x9.8743@edtnps89>
Date: Thu, 29 Mar 2007 19:02:24 GMT

Mumia W. wrote:
> On 03/29/2007 07:24 AM, cadetg@googlemail.com wrote:
>> Dear Perl Monks,
>>
>> I am currently developing a script which has to parse 20GB files.
>> The files I have to parse are logfiles. My problem is that it takes
>> ages to parse the files. I am doing something like this:
>>
>> my %lookingFor;
>> # keys   => different name of one subset
>> # values => array of one subset
>>
>> my $fh = new FileHandle "< largeLogFile.log";
>> [1:] while (<$fh>) {
>>     foreach my $subset (keys %lookingFor) {
>>         foreach my $item (@{$subset}) {
>> [2:]        if (<$fh> =~ m/$item/) {
>
> You are aware that line 2 reads in a new chunk from $fh, and the old
> chunk read on line 1 is forgotten, don't you?

It is the other way around.  "while (<$fh>) {" reads a line and stores
it in $_, so it is still around, while "if (<$fh> =~ m/$item/) {" reads
another line, binds it to the regular expression, and then discards it.

>>             my $writeFh = new FileHandle ">> myout.log";
>>             print $writeFh <$fh>;

And a third line is read from the file and printed out.


John
--
Perl isn't a toolbox, but a small machine shop where you can
special-order certain sorts of tools at low cost and in short order.
                                                       -- Larry Wall