Speed vs Laziness

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm working on a module that parses logs. I've got about 20 different line types. The idea is to stuff a hash with the regex ( qq{} ), the general name of the line, and some other trivial things.

Here is what I have right now:

elsif (/..a regex goes here../) {
    if (ref($self->{_ref_killed}) eq 'CODE') {
        $self->{_ref_killed}->($1, $2, $3);
    }
    elsif (ref($self->{_ref_killed}) eq 'ARRAY') {
        foreach my $sr (@{ $self->{_ref_killed} }) {
            $sr->($1, $2, $3);
        }
    }
    elsif (ref($self->{_ref_default}) eq 'CODE') {
        $self->{_ref_default}->($1, $2, $3);
    }
    $self->_last_death($2);
}

"killed" would be that code chunk's "name" (the general name of the line mentioned above).

As with all log parsing, I'm racing for time. Some benchmarks showed a simple ++ operation with and without eval:

Without Eval:      0.059999942779541
With Eval:         4.15600001811981
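The post does not say which form of eval was timed. A minimal sketch of how such a comparison could be reproduced with the core Benchmark module, assuming the slow case was a string eval (which is recompiled every time it runs) rather than a block eval (which is compiled once with the surrounding code). The labels and the iteration count are invented for illustration.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Assumption: the original test compared a plain ++ against eval of a string.
# A package variable is used so the string eval can see it without closures.
our $n = 0;

# 'none' asks timethese not to print its own per-test report.
my $results = timethese(100_000, {
    plain       => sub { $n++ },
    block_eval  => sub { eval { $n++ } },   # compiled once, only traps exceptions
    string_eval => sub { eval '$n++' },     # parsed and compiled on every call
}, 'none');

# Print one timing line per variant.
for my $name (sort keys %$results) {
    printf "%-12s %s\n", $name, Benchmark::timestr($results->{$name});
}
```

If the string form turns out to be the slow one, that only rules out building code as text at run time; calling precompiled code references and qr// objects stored in a data structure involves no eval at all.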

So eval is out of the question. As I said above, I'd personally prefer to have a hash of arrays:

$type{killed} = [$regexref,3]; # 3 to pass $1,$2, and $3
#above is not exact and has problems I'm sure.  Just for the idea.

If I can make this all one big maintainable hash, that would be far superior to repeating essentially the same code over and over.
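A minimal sketch of the hash-of-arrays idea, assuming each entry holds a precompiled qr// and the number of captures to pass on. The line types, regexes, %handler table and dispatch_line are all hypothetical names for illustration, not code from the module.

```perl
use strict;
use warnings;

# Hypothetical line types: name => [ compiled regex, number of captures to pass ].
my %type = (
    killed => [ qr/^(\w+) killed (\w+) with (\w+)$/, 3 ],
    joined => [ qr/^(\w+) joined$/,                   1 ],
);

# Handlers keyed by the same names; 'default' is the fallback.
my @seen;
my %handler = (
    killed  => sub { push @seen, "killed: @_" },
    default => sub { push @seen, "default: @_" },
);

# Returns the name of the line type that matched, or nothing.
sub dispatch_line {
    my ($line) = @_;
    for my $name (sort keys %type) {            # hash order is unspecified, so sort
        my ($re, $count) = @{ $type{$name} };
        my @caps = $line =~ $re or next;        # captures in list context
        splice @caps, $count if @caps > $count; # keep only the first $count
        my $h = $handler{$name} || $handler{default};
        $_->(@caps) for ref $h eq 'ARRAY' ? @$h : $h;
        return $name;
    }
    return;
}

dispatch_line('alice killed bob with rocket');
dispatch_line('carol joined');
print "$_\n" for @seen;
```

One design caveat: a plain hash has no defined order, so if more than one regex can match a line, or the common line types should be tried first, an ordered list of pairs is the safer container.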

Any ideas? Thanks,


Replies are listed 'Best First'.
Re: Speed vs Laziness by merlyn (Sage) on Feb 27, 2001 at 07:21 UTC
Make your basic data structure look something like this:

my @actions = (
    qr/thingy(with)parens(to)boot/ => sub { },               # do this
    qr/smile(and)frown/            => [ sub { }, sub { } ],  # do this, and this
    qr/another(one)bites(the)dust/ => undef,                 # handled by $default
);

Now the execution engine would look like:

$_ = $line;    # some line
my @tmp = @actions;
while (@tmp) {
    my ($qr, $ops) = splice @tmp, 0, 2;    # consume the copy, not @actions
    if (my @matches = /$qr/) {
        if (not defined $ops) {
            $default->(@matches);
        }
        elsif (ref $ops eq 'CODE') {
            $ops->(@matches);
        }
        elsif (ref $ops eq 'ARRAY') {
            $_->(@matches) for @$ops;
        }
        else {
            die "bad";
        }
        last;
    }
}

I'm sure you can figure out where to make these object instance variables. Hope this helps.

-- Randal L. Schwartz, Perl hacker
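One possible way to turn the pattern list and the default handler into object instance variables, as a sketch only. The class name LogParser, the method parse_line, and the sample patterns are assumptions, not part of the original module. It also walks the list by index instead of copying and splicing it for every line, which is a swapped-in variation on the engine above.

```perl
use strict;
use warnings;

package LogParser;    # hypothetical class name

sub new {
    my ($class, %args) = @_;
    return bless {
        actions => $args{actions} || [],       # flat list: qr// => handler(s)
        default => $args{default} || sub { },  # used when the handler is undef
    }, $class;
}

# Try each pattern in order and run the handler(s) of the first match.
# Returns 1 if some pattern matched, 0 otherwise.
sub parse_line {
    my ($self, $line) = @_;
    my $actions = $self->{actions};
    for (my $i = 0; $i < @$actions; $i += 2) {
        my ($qr, $ops) = @{$actions}[ $i, $i + 1 ];
        my @matches = $line =~ $qr or next;
        if    (not defined $ops)    { $self->{default}->(@matches) }
        elsif (ref $ops eq 'CODE')  { $ops->(@matches) }
        elsif (ref $ops eq 'ARRAY') { $_->(@matches) for @$ops }
        else                        { die "bad handler for $qr" }
        return 1;
    }
    return 0;
}

package main;

my @log;
my $parser = LogParser->new(
    actions => [
        qr/^(\w+) killed (\w+)$/ => sub { push @log, "killed: @_" },
        qr/^(\w+) left$/         => undef,    # goes to the default handler
    ],
    default => sub { push @log, "default: @_" },
);

$parser->parse_line($_) for 'alice killed bob', 'carol left', 'noise';
print "$_\n" for @log;
```

Because the regexes are compiled once by qr// and the handlers are ordinary code references, nothing is evaluated as a string at run time.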
Re: Speed vs Laziness by aardvark (Pilgrim) on Feb 27, 2001 at 07:30 UTC
You might want to check out Lincoln Stein's home page. He's got a section called Cute Tricks with Perl and Apache. It talks a lot about parsing log files and even gives some tips on regexes for parsing them. You may also want to check out Akira Hangai's Apache::ParseLog. It uses precompiled regexes, and there are a bunch of methods to get at your data.

Good Luck

Get Strong Together!!
Re: Speed vs Laziness by MeowChow (Vicar) on Feb 27, 2001 at 08:04 UTC
You may find some of the "OR-logic" regex optimizations discussed in this thread on SAS logfile parsing to be of interest.

MeowChow
s aamecha.s a..a\u$&owag.print