Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm working on a module that parses logs. I've got about 20 different line types. The idea is to stuff a hash with the regex ( qq{} ), the general name of the line, and some other trivial things.

Here is what I have right now:

elsif (/..a regex goes here../) {
    if (ref($self->{_ref_killed}) eq 'CODE') {
        $self->{_ref_killed}->($1, $2, $3);
    }
    elsif (ref($self->{_ref_killed}) eq 'ARRAY') {
        foreach my $sr (@{ $self->{_ref_killed} }) {
            $sr->($1, $2, $3);
        }
    }
    elsif (ref($self->{_ref_default}) eq 'CODE') {
        $self->{_ref_default}->($1, $2, $3);
    }
    $self->_last_death($2);
}
"killed" would be that code chunk's "name" (the general name of the line mentioned above).

As with all log parsing, I'm racing for time. Some benchmarks showed a simple ++ operation with and without eval:

Without Eval: 0.059999942779541
With Eval:    4.15600001811981
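A minimal sketch of how a comparison like this could be set up with the core Benchmark module. It assumes the eval in question is string eval (the post does not say which form was measured), and the iteration count is an arbitrary value chosen for illustration:

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Hypothetical iteration count; raise it until the timings are stable.
my $count = 100_000;

# Each entry is run $count times and its timing is printed to STDOUT.
my $results = timethese($count, {
    'Without Eval' => sub { my $i = 0; $i++ },
    # Assumption: string eval, which compiles its argument on every call.
    'With Eval'    => sub { my $i = 0; eval '$i++' },
});
```

timethese returns a hash reference of Benchmark objects keyed by the labels, so the two timings can also be compared programmatically.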

So eval is out of the question. As I said above, I'd personally prefer to have a hash of arrays:

$type{killed} = [$regexref, 3]; # 3 to pass $1, $2, and $3
# above is not exact and has problems I'm sure. Just for the idea.
If I can make this all one big maintainable hash, that would be far superior to having generally the same code repeated over and over.

Any ideas? Thanks,

Re: Speed vs Laziness
by merlyn (Sage) on Feb 27, 2001 at 07:21 UTC
    Make your basic data structure look something like this:
    my @actions = (
        qr/thingy(with)parens(to)boot/ => sub { do this },
        qr/smile(and)frown/            => [sub { do this }, sub { and this }],
        qr/another(one)bites(the)dust/ => undef,
    );
    Now the execution engine would look like:
    $_ = some line;
    my @tmp = @actions;    # work on a copy so @actions survives for the next line
    while (@tmp) {
        my ($qr, $ops) = splice @tmp, 0, 2;
        if (my @matches = /$qr/) {
            if (not defined $ops) {
                $default->(@matches);
            }
            elsif (ref $ops eq 'CODE') {
                $ops->(@matches);
            }
            elsif (ref $ops eq 'ARRAY') {
                $_->(@matches) for @$ops;
            }
            else {
                die "bad";
            }
            last;
        }
    }
    I'm sure you can figure out where to make these object instance variables. Hope this helps.
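A self-contained sketch of the table-driven approach described above, for anyone who wants something to run. The log line formats, the patterns, the @log array, and the dispatch_line name are all invented for illustration and are not from the original posts. It also walks the table by index rather than copying and splicing it, which is one possible variation, not the only way to do it:

```perl
use strict;
use warnings;

my @log;    # records which handlers fired, so the result is easy to inspect

# Fallback used when a pattern has no handler of its own.
my $default = sub { push @log, "default(@_)" };

# Pattern/handler pairs, tried in order (hypothetical line formats).
my @actions = (
    qr/^(\w+) killed (\w+) with (\w+)$/ => sub { push @log, "killed(@_)" },
    qr/^(\w+) joined team (\w+)$/       => [
        sub { push @log, "joined(@_)" },
        sub { push @log, "audit(@_)" },
    ],
    qr/^(\w+) disconnected$/            => undef,
);

# Try each pattern in turn; the first match wins.
# Returns 1 if some pattern matched the line, 0 otherwise.
sub dispatch_line {
    my ($line) = @_;
    for (my $i = 0; $i < @actions; $i += 2) {
        my ($qr, $ops) = @actions[$i, $i + 1];
        my @matches = $line =~ $qr;
        next unless @matches;

        if    (not defined $ops)    { $default->(@matches) }
        elsif (ref $ops eq 'CODE')  { $ops->(@matches) }
        elsif (ref $ops eq 'ARRAY') { $_->(@matches) for @$ops }
        else                        { die "bad handler for $qr" }
        return 1;
    }
    return 0;
}

dispatch_line($_) for (
    'alice killed bob with rocket',
    'carol joined team red',
    'dave disconnected',
);
print "$_\n" for @log;
```

Walking the array by index avoids copying the table for every input line. Whether that is measurably faster on a real log is worth benchmarking rather than assuming.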

    -- Randal L. Schwartz, Perl hacker

Re: Speed vs Laziness
by aardvark (Pilgrim) on Feb 27, 2001 at 07:30 UTC
    You might want to check out Lincoln Stein's home page. He's got a section called Cute Tricks with Perl and Apache. It talks a lot about parsing log files, and it even gives some tips on regexes for parsing them.

    You may also want to check out Akira Hangai's Apache::ParseLog. It uses pre-compiled regexes, and there are a bunch of methods to get at your data.

    Good Luck

    Get Strong Together!!

Re: Speed vs Laziness
by MeowChow (Vicar) on Feb 27, 2001 at 08:04 UTC
    You may find some of the "OR-logic" regex optimizations discussed in this thread on SAS logfile parsing to be of interest.
       MeowChow                                   
                   s aamecha.s a..a\u$&owag.print