in reply to Re: Bulk Regex?

Sorry, he was already using /o. I don't think qr// buys you anything over /$var/o, but I could be wrong.

You're right, though, it _is_ a faq:

What is "/o" really for?

Using a variable in a regular expression match forces a re-evaluation (and perhaps recompilation) each time the regular expression is encountered. The "/o" modifier locks in the regex the first time it's used. This always happens in a constant regular expression, and in fact, the pattern was compiled into the internal format at the same time your entire program was.

Use of "/o" is irrelevant unless variable interpolation is used in the pattern, and if so, the regex engine will neither know nor care whether the variables change after the pattern is evaluated the *very first* time.

"/o" is often used to gain an extra measure of efficiency by not performing subsequent evaluations when you know it won't matter (because you know the variables won't change), or more rarely, when you don't want the regex to notice if they do. For example, here's a "paragrep" program:

    $/ = '';  # paragraph mode
    $pat = shift;
    while (<>) {
        print if /$pat/o;
    }
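
To make the "locks in the regex the first time it's used" part concrete, here's a small sketch. The sub names matches_once and matches_each are made up for illustration; the only difference between them is the /o.

```perl
use strict;
use warnings;

# Hypothetical helper: with /o, the pattern is interpolated and
# compiled on the first call only, so later values of $pat are ignored.
sub matches_once {
    my ($text, $pat) = @_;
    return $text =~ /$pat/o ? 1 : 0;
}

# Same helper without /o: $pat is re-interpolated on every call.
sub matches_each {
    my ($text, $pat) = @_;
    return $text =~ /$pat/ ? 1 : 0;
}

# Second call passes a different pattern ("dog") to each helper.
my @once = (matches_once("cat", "cat"), matches_once("dog", "dog"));
my @each = (matches_each("cat", "cat"), matches_each("dog", "dog"));

print "with /o:    @once\n";   # second call still tests against "cat"
print "without /o: @each\n";
```

The second matches_once call fails even though "dog" obviously matches /dog/, because the match operator is still using the pattern it saw first. That's the "won't notice if they change" behaviour the FAQ is describing.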

Edit: Time for a benchmark:

    #!/usr/bin/perl -w
    use strict;
    use Benchmark qw(cmpthese);

    # One alternation built from a list of words, plus a qr// version of it
    my $pattern=join "|",qw(a ab abc abcde ef gh ghij qrst nqmz stuv);
    my $qr=qr/$pattern/;
    my @candidates=qw(abc zzz stuv asasdfasdfaf qqqqqqqqqqqqqqqq lkasjh adsd qqqqqqqqqqqqabc);

    # Interpolated string with /o: compiled once
    sub preComp {
        my $result=scalar grep /$pattern/o,@candidates;
        return $result;
    }

    # qr// object (still with /o)
    sub preCompQR {
        my $result=scalar grep /$qr/o,@candidates;
        return $result;
    }

    # Interpolated string without /o
    sub noPrecomp {
        my $result=scalar grep /$pattern/,@candidates;
        return $result;
    }

    # Sanity check: all three should find the same number of matches
    print "Working from pattern '$pattern', qr='$qr'\n";
    print "preComp finds ",preComp(),"\n";
    print "preCompQR finds ",preCompQR(),"\n";
    print "noPrecomp finds ",noPrecomp(),"\n";

    # Run each for at least 3 CPU seconds and compare the rates
    cmpthese(-3, {
        preComp => \&preComp,
        preCompQR => \&preCompQR,
        noPrecomp => \&noPrecomp
    } );

On perl 5.6.1, here are the results:

    Working from pattern 'a|ab|abc|abcde|ef|gh|ghij|qrst|nqmz|stuv', qr='(?-xism:a|ab|abc|abcde|ef|gh|ghij|qrst|nqmz|stuv)'
    preComp finds 6
    preCompQR finds 6
    noPrecomp finds 6
    Benchmark: running noPrecomp, preComp, preCompQR, each for at least 3 CPU seconds...
     noPrecomp:  4 wallclock secs ( 3.09 usr +  0.01 sys =  3.10 CPU) @ 33414.22/s (n=103417)
       preComp:  4 wallclock secs ( 3.24 usr +  0.01 sys =  3.25 CPU) @ 35360.06/s (n=115097)
     preCompQR:  2 wallclock secs ( 3.08 usr +  0.02 sys =  3.10 CPU) @ 35094.39/s (n=108933)
                 Rate noPrecomp preCompQR   preComp
    noPrecomp 33414/s        --       -5%       -6%
    preCompQR 35094/s        5%        --       -1%
    preComp   35360/s        6%        1%        --

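
In case the comparison table isn't obvious: as far as I can tell, each cell is the row's rate relative to the column's rate, as a rounded percentage. Here's a rough sketch that recomputes the cells from the rates printed above. The percent_vs helper is my own, not something Benchmark exports.

```perl
use strict;
use warnings;

# Rates (iterations per CPU second) copied from the 5.6.1 run above.
my %rate = (
    noPrecomp => 33414.22,
    preCompQR => 35094.39,
    preComp   => 35360.06,
);

# Hypothetical helper: how much faster (positive) or slower (negative)
# $row is than $col, as a rounded percentage.
sub percent_vs {
    my ($row, $col) = @_;
    return sprintf "%.0f%%", 100 * $rate{$row} / $rate{$col} - 100;
}

# Print rows slowest first, the same order cmpthese used above.
my @names = sort { $rate{$a} <=> $rate{$b} } keys %rate;
for my $row (@names) {
    my @cells = map { $_ eq $row ? '--' : percent_vs($row, $_) } @names;
    printf "%-9s %5s %5s %5s\n", $row, @cells;
}
```

So /o is about 6% ahead of the plain interpolated match here, and qr// lands within 1% of /o, which is close enough that I wouldn't read much into the difference.
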
I'd be interested to see 5.8.0 results; the regex engine is one of the places that got tweaked quite a bit, I think.
--
Mike