Chuma has asked for the wisdom of the Perl Monks concerning the following question:
Dear monks,
I'm trying to search for several regexes in some long files. To speed things up, I tried first checking a combined regex, to see if any of them matches the line. Like so:
    my $comb = join('|', @ARGV);
    while ($line = <$infile>) {
        if ($line =~ $comb) {
            for $target (@ARGV) {
                if ($line =~ $target) {
                    # do a thing
                }
            }
        }
    }
This seems to speed things up, at least when the regexes are just plain words. But if the input regexes are anchored (e.g. "^word"), the combined check suddenly becomes much slower! Is there some weirdness with alternations and anchors? Or did I make some obvious mistake?
(I could rewrite ^aaa|^bbb|^ccc as ^(aaa|bbb|ccc), but it might be that only some of the inputs are anchored.)
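For the mixed case mentioned in the parenthetical, a minimal sketch of one possible approach: split the inputs into anchored and unanchored groups and hoist the `^` out of the anchored group, so that each group is a plain alternation. The helper name `build_prefilter` and the sample patterns and lines are hypothetical, not from the original post. It assumes every input is a simple pattern whose leading `^` applies to the whole pattern (no top-level `|` inside a single input). Whether this actually runs faster depends on the Perl version and the patterns, so it is worth benchmarking.

```perl
use strict;
use warnings;

# Hypothetical helper: build one prefilter regex from pattern strings,
# hoisting a leading ^ out of the anchored ones.
# Assumption: no input contains a top-level alternation such as '^a|b',
# because hoisting the ^ would change the meaning of such a pattern.
sub build_prefilter {
    my @patterns = @_;

    # With no patterns, return a regex that can never match.
    return qr/(?!)/ unless @patterns;

    my (@anchored, @floating);
    for my $p (@patterns) {
        if ($p =~ /^\^(.*)/s) { push @anchored, $1 }   # strip the leading ^
        else                  { push @floating, $p }   # leave as-is
    }

    my @parts;
    push @parts, '^(?:' . join('|', @anchored) . ')' if @anchored;
    push @parts, '(?:'  . join('|', @floating) . ')' if @floating;

    my $combined = join '|', @parts;
    return qr/$combined/;
}

# Usage with in-memory sample data instead of @ARGV and a file handle.
my @patterns  = ('^foo', 'bar', '^baz');
my @targets   = map { [ $_, qr/$_/ ] } @patterns;   # precompile each target once
my $prefilter = build_prefilter(@patterns);

for my $line ("foo first\n", "nothing here\n", "a bar, then foo\n") {
    next unless $line =~ $prefilter;                # cheap combined check first
    for my $t (@targets) {
        my ($source, $re) = @$t;
        print "$source matched: $line" if $line =~ $re;   # "do a thing"
    }
}
```

Precompiling with `qr//` also avoids recompiling each pattern string on every line, which is a separate cost from the anchor question.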
Replies are listed 'Best First'.
Re: Alternations and anchors
by GrandFather (Saint) on Mar 30, 2025 at 23:56 UTC
Re: Alternations and anchors (trie optimization)
by LanX (Saint) on Mar 31, 2025 at 01:16 UTC
by jwkrahn (Abbot) on Mar 31, 2025 at 01:41 UTC
by LanX (Saint) on Mar 31, 2025 at 01:51 UTC
by Chuma (Scribe) on Apr 02, 2025 at 16:33 UTC
by LanX (Saint) on Apr 02, 2025 at 16:50 UTC
by cavac (Prior) on Apr 03, 2025 at 11:44 UTC
by LanX (Saint) on Apr 03, 2025 at 12:09 UTC