Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I run a cron job to update a graph which charts hits to my website. I do this by running through the Apache log. The file tends to get quite big. Right now I regenerate the entire graph every time I run the script, which eats up quite a few CPU cycles. What I'd rather do is jump to the point I had last read the file up to and parse only the new accesses. Any suggestions for how I'd do this?

Replies are listed 'Best First'.
Re: How do I jump to the point I last checked a file at?
by Anonymous Monk on Feb 03, 2000 at 13:33 UTC
    Not Perl related, but...
    Just rotate the logs daily, weekly, or monthly.
    You can keep the separate files if you don't want to
    erase them, and you can have a separate file where
    you save the results up to the last time you ran the
    program.
    BTW, webalizer does that already; search freshmeat.
Re: How do I jump to the point I last checked a file at?
by Anonymous Monk on Feb 03, 2000 at 18:42 UTC
    Rather than write a daemon, just remember the seek position of the end of the last line you processed. Store this somewhere sensible and then start from there. Something like (untested, no error checking, yadda yadda):
    $log = "foo.log";
    $pos_file = $log . ".pos";
    $pos = 0;
    # Load the offset saved by the previous run, if there is one
    if( open POS, "<$pos_file" ) {
        $pos = <POS>;
        close POS;
    }
    open FILE, $log;
    seek FILE, $pos, 0;
    # ...process file until eof...
    # Remember where we stopped for the next run
    $pos = tell FILE;
    open POS, ">$pos_file";
    print POS $pos;
    close POS;
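    You might need to be careful with big files (over 2 or 4 GB), since the offset may not fit in a 32-bit integer. Whatever method you use to rotate or back up the log files also needs to take your saved seek position into account: once the log is rotated, the old position no longer points at the right place.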
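
For what it's worth, here is a slightly fuller sketch of the same seek/tell idea with error checking added. The sub name `read_new_lines` and the choice to treat a log that is shorter than the saved offset as "rotated, start over" are my own assumptions, not part of the reply above. It also skips a half-written last line so that it gets picked up on the next run.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the complete lines added to $log since the
# previous call, keeping the byte offset in $pos_file between runs.
sub read_new_lines {
    my ($log, $pos_file) = @_;

    # Load the saved offset; a missing or garbled file means "start at 0".
    my $pos = 0;
    if (open my $in, '<', $pos_file) {
        my $saved = <$in>;
        close $in;
        $pos = $1 if defined $saved && $saved =~ /^(\d+)/;
    }

    open my $fh, '<', $log or die "Cannot open $log: $!";

    # Assumption: a log shorter than the saved offset was rotated or
    # truncated, so start again from the top.
    my $size = -s $fh;
    $pos = 0 if $pos > $size;

    seek $fh, $pos, 0 or die "Cannot seek in $log: $!";

    my @lines;
    while (my $line = <$fh>) {
        # Leave a partial last line for the next run.
        last unless $line =~ /\n\z/;
        push @lines, $line;
        $pos = tell $fh;
    }
    close $fh;

    # Save the offset of the end of the last complete line.
    open my $out, '>', $pos_file or die "Cannot write $pos_file: $!";
    print {$out} $pos;
    close $out or die "Cannot close $pos_file: $!";

    return @lines;
}
```

From the cron job you would call something like `my @new = read_new_lines("foo.log", "foo.log.pos");` and feed only `@new` to the graphing code. Note that the size check cannot detect a rotated log that has already grown past the old offset.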
Re: How do I jump to the point I last checked a file at?
by nate (Monk) on Feb 03, 2000 at 03:41 UTC
    The problem with Apache logs is that they don't use fixed-length records. This makes jumping to a specific record much more difficult.

    One thing you could do is use a daemon, rather than a cron job, to read from the file as Apache writes to it (à la 'tail -f'). That way, you could update your statistics incrementally, rather than having to mow through the whole file every n minutes.
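
A rough sketch of what such a 'tail -f' style reader might look like in Perl, assuming simple polling. The names `poll_once` and `follow_log` and the `$interval` parameter are made up for illustration. Seeking back to the saved position before each pass clears the handle's end-of-file state, so lines appended since the last pass become readable.

```perl
use strict;
use warnings;

# Hypothetical helper: one polling pass. Reads the complete lines that
# appeared after byte offset $pos, passes each to $on_line, and returns
# the new offset.
sub poll_once {
    my ($fh, $pos, $on_line) = @_;

    # Seeking also resets EOF, so newly appended data can be read.
    seek $fh, $pos, 0 or die "Cannot seek: $!";
    while (my $line = <$fh>) {
        # Wait for a line that is still being written.
        last unless $line =~ /\n\z/;
        $on_line->($line);
        $pos = tell $fh;
    }
    return $pos;
}

# Hypothetical daemon loop: starts at the current end of $log and checks
# for new lines every $interval seconds, forever.
sub follow_log {
    my ($log, $on_line, $interval) = @_;

    open my $fh, '<', $log or die "Cannot open $log: $!";
    seek $fh, 0, 2 or die "Cannot seek in $log: $!";
    my $pos = tell $fh;

    while (1) {
        $pos = poll_once($fh, $pos, $on_line);
        sleep $interval;
    }
}
```

Something like `follow_log("foo.log", sub { ...update statistics for $_[0]... }, 5)` would then run until killed. This sketch keeps the same file handle open, so it will not notice when the log is rotated; a real daemon would have to reopen the file.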

    Or use analog...

Re: How do I jump to the point I last checked a file at?
by dlc (Acolyte) on Feb 03, 2000 at 19:18 UTC

    I use a separate MySQL server specifically for logging hits to my Apache/mod_perl setup, with a custom PerlLogHandler. That way, creating graphs and such is as simple as a SQL select.