in reply to Pulling by regex II

Just a few tricks. Don't use chained if statements when a hash will do:
my %month = ( Jan => 0, Feb => 1, ... );
# ...
$month = $month{$month}; # Could stand to use better names
It's often better to write one big regex for the entire line. A fully expanded line-matching regex can capture all the variables you need, with no leftover material to strip off with substitutions afterward.
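As a rough sketch of the single-regex idea: the log layout isn't shown in this post, so the example below assumes Apache Common Log Format, and `parse_line`, the field names, and the sample line are all made up for illustration.

```perl
use strict;
use warnings;

# One /x regex for the whole line. ASSUMPTION: Apache Common Log Format;
# adjust the pattern to whatever the real log looks like.
my $line_re = qr{
    ^ (\S+) \s+ (\S+) \s+ (\S+) \s+          # host, ident, user
    \[ (\d{2}) / (\w{3}) / (\d{4})           # day, month name, year
    : (\d{2}) : (\d{2}) : (\d{2})            # hour, minute, second
    \s+ ([+-]\d{4}) \] \s+                   # timezone offset
    " ([^"]*) " \s+                          # request line
    (\d{3}) \s+ (\d+|-) \s* \z               # status, bytes sent
}x;

# Hypothetical helper: returns a hash ref of named fields, or nothing
# if the line does not match.
sub parse_line {
    my ($line) = @_;
    my @f = $line =~ $line_re or return;
    my %rec;
    @rec{qw/host ident user day month year hour minute second tz request status bytes/} = @f;
    return \%rec;
}

# Invented sample line, not taken from the original log.
my $sample = '127.0.0.1 - - [14/Dec/2002:06:49:12 +0000] "GET /index.html HTTP/1.0" 200 2326';
if ( my $rec = parse_line($sample) ) {
    print "$rec->{host} asked for '$rec->{request}' at $rec->{hour}:$rec->{minute}\n";
}
```

Every field comes out of the one match, so there is nothing left to clean up with s/// afterward.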

Alternatively, you could use something like Apache::ParseLog.

Also, since you've left out the parentheses on your open call, the || binds to the filename string rather than to open itself, so it doesn't actually error out when you expect it to. The correct way would be:
open(LOGFILE, "datafile.html") || die "Can't open file";
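To see the precedence difference for yourself, here is a small sketch. The file name is deliberately one that should not exist, and the three flag variables are invented just for this demo.

```perl
use strict;
use warnings;

# Hypothetical file name that should not exist on disk.
my $missing = "no-such-file-$$.html";

# 1. No parentheses with ||: parsed as open(LOGFILE, ($missing || die ...)).
#    The string is true, so die never runs and the failure is silent.
my $died_without_parens = 0;
eval {
    open LOGFILE, $missing || die "Can't open file";
    1;
} or $died_without_parens = 1;

# 2. No parentheses with the low-precedence "or": die applies to open itself.
my $died_with_or = 0;
eval {
    open LOGFILE, $missing or die "Can't open $missing: $!";
    1;
} or $died_with_or = 1;

# 3. Parentheses with ||: also applies to open itself.
my $died_with_parens = 0;
eval {
    open(LOGFILE, $missing) || die "Can't open $missing: $!";
    1;
} or $died_with_parens = 1;

print "no parens, ||: died=$died_without_parens\n";
print "no parens, or: died=$died_with_or\n";
print "parens,    ||: died=$died_with_parens\n";
```

Including $! in the message also tells you why the open failed, which the bare "Can't open file" does not.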
Further, you can actually iterate over the log file one line at a time instead of reading it all in:
while (my $log_line = <LOGFILE>) {
    # ...
}
By the way, that commented-out #use strict; is scary. The reason you're getting errors is that you're not properly declaring your variables with my. For example:
my $hour = param("hour");
my $minute = param("minute");

Re: Re: Pulling by regex II
by JayBonci (Curate) on Dec 14, 2002 at 06:49 UTC
    Alternately, you could build a month => month_num hash with a one-line map statement, à la:
    my %months = map { (qw/Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec/)[$_] => $_ } (0..11);
    I like doing stuff like this because it saves you the possible typo(s) if you understand how map works. I'm always looking for more efficient ways to do it, however. Can anyone think of any?

        --jb

      Why use an array and map and a hash? A string and index will do it.

      my $month_num = index( 'Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec', $month_str)/4;

      Examine what is said, not who speaks.

        I'd want to get a hash out because of efficiency and grace if I were to call it multiple times. I wouldn't want to lug around that entire index function call every time I wanted to look up a month_name to month_num transform.

            --jb
        my $month_num = index('JanFebMarAprMayJunJulAugSepOctNovDec', $month_str)/3;
        Now we're talking.
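For comparison, here is a sketch that puts the two approaches from this subthread side by side. The hash is built with a hash slice instead of map, and `month_by_index` is a made-up wrapper around the index one-liner. The difference that matters is what happens with input that is not a month name: the hash has no such key, while index returns -1 or an offset that does not divide evenly.

```perl
use strict;
use warnings;

my @names = qw/Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec/;

# Hash slice: assign 0..11 to the twelve keys in one statement.
my %month_num;
@month_num{@names} = 0 .. $#names;

# The index one-liner from above, wrapped in a hypothetical helper.
sub month_by_index {
    my ($month_str) = @_;
    return index( 'JanFebMarAprMayJunJulAugSepOctNovDec', $month_str ) / 3;
}

# 'Foo' is not in the string at all; 'anF' straddles two month names.
for my $m (qw/Jan Dec Foo anF/) {
    my $via_hash  = exists $month_num{$m} ? $month_num{$m} : 'no such key';
    my $via_index = month_by_index($m);
    print "$m: hash=$via_hash index=$via_index\n";
}
```

If the month string has already been validated by a regex, either lookup works; if not, the hash makes bad input easier to catch with exists.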
Re: Re: Pulling by regex II
by mkent (Acolyte) on Dec 14, 2002 at 22:51 UTC
    Thanks, I totally missed the error problem; I appreciate you pointing it out. Would going through the log one line at a time be more efficient?
      If you have a 2GB log file, you can imagine that reading the whole thing into memory isn't going to be terribly efficient. Processing it one line at a time is virtually a must.

      Unless you need to read in the whole file because it's not terribly large and you will be referring to it several times, it is almost always better to pick through it one line at a time.
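A minimal sketch of line-at-a-time processing, for reference. It writes a small throwaway file with File::Temp so it can run on its own; the file contents and the `$count` variable are only there for the demo. The point is the while loop: reading a filehandle in scalar context pulls one line per iteration, whereas a foreach over the same filehandle evaluates it in list context and loads every line before the loop starts.

```perl
use strict;
use warnings;
use File::Temp qw/tempfile/;

# Build a small throwaway file so the sketch is runnable; a real log
# would already exist on disk.
my ( $tmp_fh, $tmp_name ) = tempfile( UNLINK => 1 );
print {$tmp_fh} "line $_\n" for 1 .. 5;
close $tmp_fh or die "Can't close $tmp_name: $!";

open( my $log, '<', $tmp_name ) or die "Can't open $tmp_name: $!";

my $count = 0;
while ( my $log_line = <$log> ) {    # scalar context: one line per iteration
    chomp $log_line;
    # ... match and process $log_line here ...
    $count++;
}
close $log;

print "Processed $count lines\n";
```

Memory use stays roughly at the size of the longest line, no matter how big the file is.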