skx has asked for the wisdom of the Perl Monks concerning the following question:
Greetings, fellow monks.
I have a problem of efficiency in a little tool I'm writing.
Essentially I have two arrays, one containing lines from a logfile, as produced by syslog, and one containing regular expressions.
I wish to remove from the loglines array all lines which match any of the regular expressions in my list.
My current code looks like this:
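    # Single quotes, so that \w, \[ and \( reach the regex engine intact
    # (inside double quotes Perl would drop those backslashes).
    my @REGEXPS = (
        '^\w{3} [ :0-9]{11} [._[:alnum:]-]+ pppd\[[0-9]+\]: (sent|rcvd) \[LCP EchoReq id=[[:alnum:]]+ magic=[ [:alnum:]]+\]$',
        '^\w{3} [ :0-9]{11} [._[:alnum:]-]+ ssh\(pam_[[:alnum:]]+\)\[[0-9]+\]: session opened for user [[:alnum:]-]+ by \(uid=[0-9]+\)$',
    );

    my @KEEP = ();
    foreach my $line ( @SYSLOG ) {
        my $match = 0;
        # Test the line against every pattern in the list.
        foreach my $r ( @REGEXPS ) {
            if ( $line =~ /$r/ ) {
                $match = 1;
            }
        }
        # Keep only the lines that matched none of the patterns.
        if ( !$match ) {
            push @KEEP, $line;
        }
    }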
This is slow because I'm testing every regular expression against every line, so the number of match attempts is N x M (N lines times M expressions).
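A minimal sketch of one way to trim the work in the nested loop, assuming the patterns are compiled once with `qr//` and the inner loop is abandoned at the first hit. The sample `@SYSLOG` lines and the shortened patterns are made up for illustration; they are not the real logcheck rules.

```perl
use strict;
use warnings;

# Hypothetical sample data; real input would be read from the syslog file.
my @SYSLOG = (
    'Sep 26 13:41:00 host pppd[123]: sent [LCP EchoReq id=0x1 magic=0xabc]',
    'Sep 26 13:41:05 host sshd[456]: Failed password for root',
    'Sep 26 13:41:09 host cron[789]: session opened for user steve by (uid=0)',
);

# Compile each pattern once (shortened patterns, for illustration only).
my @REGEXPS = (
    qr/pppd\[[0-9]+\]: (?:sent|rcvd) \[LCP EchoReq /,
    qr/session opened for user [[:alnum:]-]+ by \(uid=[0-9]+\)$/,
);

my @KEEP;
LINE: foreach my $line (@SYSLOG) {
    foreach my $r (@REGEXPS) {
        # Move on to the next line as soon as any pattern matches.
        next LINE if $line =~ $r;
    }
    # Reached only when no pattern matched.
    push @KEEP, $line;
}

print "$_\n" for @KEEP;
```

The worst case is still every pattern against every line, but lines that match an early pattern no longer pay for the remaining ones, so putting the most frequently matching patterns first may help.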
What are my alternatives?
I've considered moving the lines into a hash instead, and using delete upon them - but I'm not sure how much of a gain this would be.
I can't help thinking I should be using map, or grep for this - but I'm not entirely sure how to do this.
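For what it's worth, a `grep`-based version might look like the sketch below. It assumes the pattern strings can safely be joined into a single alternation (no backreferences or numbered captures that would clash); the sample lines, the shortened patterns, and the names `@patterns` and `$any` are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample data; real input would be read from the syslog file.
my @SYSLOG = (
    'Sep 26 13:41:00 host pppd[123]: sent [LCP EchoReq id=0x1 magic=0xabc]',
    'Sep 26 13:41:05 host sshd[456]: Failed password for root',
    'Sep 26 13:41:09 host cron[789]: session opened for user steve by (uid=0)',
);

# Pattern strings in single quotes so the backslashes survive (illustrative only).
my @patterns = (
    'pppd\[[0-9]+\]: (?:sent|rcvd) \[LCP EchoReq ',
    'session opened for user [[:alnum:]-]+ by \(uid=[0-9]+\)$',
);

# Wrap each pattern in a non-capturing group and join them into one regex,
# compiled once with qr//.
my $combined = join '|', map { "(?:$_)" } @patterns;
my $any      = qr/$combined/;

# grep keeps the lines for which the block is true, i.e. lines matching no pattern.
my @KEEP = grep { $_ !~ $any } @SYSLOG;

print "$_\n" for @KEEP;
```

Whether one large alternation beats a loop over separately compiled patterns depends on the patterns and the Perl version, so it is worth timing both with the core `Benchmark` module on real log data.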
(Yes, this replicates the functionality of the logcheck tool; it's a prototype rewrite in Perl that I'm producing.)
Steve
Re: Removing all lines from an array which match against a second array
by broquaint (Abbot) on Sep 26, 2003 at 13:41 UTC
by merlyn (Sage) on Sep 26, 2003 at 14:15 UTC

Re: Removing all lines from an array which match against a second array
by tilly (Archbishop) on Sep 28, 2003 at 01:01 UTC