"but I'm concerned about speed. If it's doing this for every file on a terabyte server I'm worried about the time consumption. What do you think?"

Just the fact that you hide a loop as regexp alternatives doesn't mean it's suddenly orders of magnitude faster. In fact, it might well be that splitting the regexp into smaller chunks is faster, because the optimizer kicks in.
Here's a benchmark:
    #!/usr/bin/perl

    use strict;
    use warnings;

    use Benchmark qw /cmpthese/;

    our @regexes = (
        '.*\.jpg$',
        '.*\.png$',
        'Perl',
        '\.mozilla/abigail',
    );

    our @words = `find /home/abigail`;   # 38517 files.

    our ($c1, $c2);

    cmpthese -60 => {
        single => 'my $regex = join "|" => @regexes;
                   $c1 = 0;
                   for my $w (@words) {
                       $c1 ++ if $w =~ /$regex/
                   }',
        many   => '$c2 = 0;
                   WORD:
                   for my $w (@words) {
                       for my $r (@regexes) {
                           $c2 ++, next WORD if $w =~ /$r/
                       }
                   }',
    };

    die "Unequal\n" unless $c1 == $c2;

    __END__
             s/iter single   many
    single     4.86     --   -74%
    many       1.28   281%     --

Now, for your particular data set, results might be different. But don't assume alternatives are necessarily slower.
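Since the thread is about returning the pattern that matched, here is a minimal sketch of how the loop-over-patterns approach can hand back the matching pattern directly. It is not part of the benchmark above: the `first_match` helper and the sample paths are hypothetical, and precompiling with `qr//` is an assumption about how you might avoid recompiling each pattern on every pass through the loop.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Same pattern list as in the benchmark (assumed here for illustration).
my @patterns = ('.*\.jpg$', '.*\.png$', 'Perl', '\.mozilla/abigail');

# Compile each pattern once and keep its source string alongside it.
my @compiled = map { [ $_, qr/$_/ ] } @patterns;

# Hypothetical helper: return the source string of the first pattern
# that matches $path, or undef (in scalar context) if none match.
sub first_match {
    my ($path) = @_;
    for my $pair (@compiled) {
        my ($source, $re) = @$pair;
        return $source if $path =~ $re;
    }
    return;
}

# Made-up sample paths, only to show the call.
for my $path ('/home/abigail/pics/cat.jpg', '/home/abigail/notes.txt') {
    my $hit = first_match($path);
    print "$path: ", defined $hit ? $hit : '(no match)', "\n";
}
```

With a single joined alternation you would need extra bookkeeping to learn which branch matched; with separate patterns the answer falls out of the loop. Whether that is faster on your data is something to benchmark, as above.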
Abigail
In reply to Re: Returning regexp pattern that was used to match
by Abigail-II
in thread Returning regexp pattern that was used to match
by crabbdean