aaAzhyd has asked for the wisdom of the Perl Monks concerning the following question:

LOG is an Apache log file, and this thing tends to choke on anything bigger than 20 megs. It's a simple script to spit out the unique visits and the refer ... Any suggestions?
    my %ips;
    for (<LOG>) {
        $_ =~ /(.*)\s-\s-\s(\[.*?\])\s\"(.*?)\"\s(\S+)\s(\S+)\s\"(.*?)\"\s\"(.*)\"/o;
        if (! $ips{$1}) {
            $ips{$1} = 1;
            if ($6 !~ /-/) {
                print "$2 $1 $6\n";
            }
        }
    }

Replies are listed 'Best First'.
Re: Big file, system no likey ..
by Enlil (Parson) on Jun 18, 2003 at 22:56 UTC
    Any suggestions?

    Changing the for loop to a while loop should help.

    Update: arturo++ for catching my laziness in explanation.

    I should probably explain why this is. If you use a for loop, you end up pulling in all the lines of the file in one fell swoop (to create the list for the loop) and then iterating over them. As you have noticed, that is okay for smaller files, but with larger files you use a lot of memory, and with even larger files more memory still. Well, you get the picture.

    On the other hand, with the while loop you only read in one line at a time, so you don't have to hold the huge file in memory all at once.
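
    A minimal sketch of that change, not the original script verbatim. The sample log lines and the name unique_visits are made up for illustration, and an in-memory filehandle stands in for LOG. Three things differ from the posted code and are assumptions about what was intended: the pattern is anchored and uses \S+ for the IP, lines that fail to match are skipped rather than reusing old captures, and the referrer is compared against a lone "-" instead of rejecting anything containing a hyphen.

```perl
use strict;
use warnings;

# Hypothetical sample lines in Apache combined log format, standing in for LOG.
my $sample = join '',
    qq{10.0.0.1 - - [18/Jun/2003:22:00:00 +0000] "GET / HTTP/1.0" 200 512 "http://example.com/" "Mozilla/4.0"\n},
    qq{10.0.0.1 - - [18/Jun/2003:22:00:05 +0000] "GET /a HTTP/1.0" 200 128 "http://example.com/x" "Mozilla/4.0"\n},
    qq{10.0.0.2 - - [18/Jun/2003:22:01:00 +0000] "GET / HTTP/1.0" 200 512 "-" "Mozilla/4.0"\n};

# Returns one "timestamp ip referrer" string per first-seen IP that has a referrer.
sub unique_visits {
    my ($fh) = @_;
    my (%ips, @out);
    # while reads one line per iteration; for (<$fh>) would build the whole list first.
    while (my $line = <$fh>) {
        # Skip lines that do not match, so stale capture variables are never reused.
        next unless $line =~ /^(\S+)\s-\s-\s(\[.*?\])\s"(.*?)"\s(\S+)\s(\S+)\s"(.*?)"\s"(.*)"/;
        my ($ip, $when, $referrer) = ($1, $2, $6);
        next if $ips{$ip}++;        # this visitor was already counted
        next if $referrer eq '-';   # no referrer recorded
        push @out, "$when $ip $referrer";
    }
    return @out;
}

open my $fh, '<', \$sample or die "cannot open in-memory handle: $!";
my @visits = unique_visits($fh);
close $fh;
print "$_\n" for @visits;
```

    With a real log you would open the file itself instead of \$sample; only the current line and the %ips hash stay in memory, so usage grows with the number of distinct IPs rather than with the file size.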

    -enlil

      that is the exact answer I was looking for, thank you :-D
Re: Big file, system no likey ..
by artist (Parson) on Jun 18, 2003 at 23:04 UTC
Re: Big file, system no likey ..
by BrowserUk (Patriarch) on Jun 18, 2003 at 22:42 UTC

    ...tends to choke...

    Have you tried the Heimlich maneuver?

    Update: I apologise... Bad joke. This post deserves to be --'d.


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller