kyle has asked for the wisdom of the Perl Monks concerning the following question:
I sometimes find that long-time Perl programmers sprinkle /o over every regular expression they ever write. I recently saw some newly written code that even applied the option to qr, which I thought was the new and improved way to do what /o used to do.
I had this hazy (and, I now think, faulty) memory that /o doesn't even do anything anymore: that perl now figures out for itself whether a pattern needs to be recompiled and skips the recompilation when it doesn't. To test this, I tried using Benchmark.
I used Perl 5.10.0.
use Benchmark qw( cmpthese timethese );

my @alphabet = ( 0 .. 9, 'a' .. 'z', 'A' .. 'Z' );
my $h     = horrid_rx(1_000);
my $qr_h  = qr/$h/;
my $qr_ho = qr/$h/o;
my $s = join q{}, map { $alphabet[ rand @alphabet ] } 1 .. 1_000;

my $matchiness = ( $s =~ /$h/ ) ? 'matches' : 'does not match';
print "horrid rx $matchiness string\n";

my $loops = 1_000;
cmpthese(
    -2,
    {
        '//'   => sub { $s =~ /$h/   for (1..$loops) },
        '/o'   => sub { $s =~ /$h/o  for (1..$loops) },
        'qr'   => sub { $s =~ $qr_h  for (1..$loops) },
        'qr/o' => sub { $s =~ $qr_ho for (1..$loops) },
    }
);

sub horrid_rx {
    my ($n) = @_;
    my @quant = ( '*', '?', '+', '{0,1}', );
    my $out;
    for ( 1 .. $n ) {
        $out .= $alphabet[ rand @alphabet ];
        $out .= $quant[ rand @quant ];
    }
    return $out;
}
Typical results look like this:
horrid rx does not match string
       Rate   // qr/o   qr   /o
//    129/s   -- -83% -83% -88%
qr/o  752/s 483%   --  -0% -27%
qr    753/s 483%   0%   -- -27%
/o   1037/s 704%  38%  38%   --
That seems to show that /o does actually help, even on a pattern built from a variable that never changes: at 1037/s it is about eight times as fast as the plain match (129/s). It even beats going through the interface of a Regexp object (about 750/s for both qr variants).
Then on one run, I got this:
horrid rx does not match string
       Rate  //   qr qr/o   /o
//   44.4/s  -- -29% -29% -32%
qr   62.7/s 41%   --  -1%  -4%
qr/o 63.0/s 42%   1%   --  -4%
/o   65.4/s 47%   4%   4%   --
Now I'm confused. All four variants ran significantly slower, so I'm guessing that the pattern and string it picked this time are unusually expensive to fail. In light of that, I'd expect the cost of matching to hide some of the compilation overhead, but the gap shrank more than I'd expect.
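One way to probe that guess is to time compilation and matching separately instead of together. The following is only a sketch under stated assumptions: avg_seconds and compile_and_match_times are hypothetical helpers, not part of the original script, and the sizes and repetition counts are arbitrary. The pattern builder mirrors horrid_rx above. Each compile iteration appends a different (?#...) comment so the pattern string is never identical to the previous one, in case perl skips recompiling an unchanged pattern.

```perl
use strict;
use warnings;
use Time::HiRes qw( time );

my @alphabet = ( 0 .. 9, 'a' .. 'z', 'A' .. 'Z' );
my @quant    = ( '*', '?', '+', '{0,1}' );

# Hypothetical helper: average wall-clock seconds per call over $reps calls.
sub avg_seconds {
    my ( $reps, $code ) = @_;
    my $start = time;
    $code->($_) for 1 .. $reps;
    return ( time - $start ) / $reps;
}

# Hypothetical helper: returns (compile seconds, match seconds) for one
# random pattern of $terms terms and one random string of $len characters.
sub compile_and_match_times {
    my ( $terms, $len, $reps ) = @_;
    my $h = join q{},
        map { $alphabet[ rand @alphabet ] . $quant[ rand @quant ] } 1 .. $terms;
    my $s = join q{}, map { $alphabet[ rand @alphabet ] } 1 .. $len;

    # Compile only: the trailing comment differs on every call, so each
    # qr// sees a pattern string it has not seen before.
    my $compile = avg_seconds(
        $reps,
        sub { my $p = $h . "(?#$_[0])"; my $rx = qr/$p/; return }
    );

    # Match only: the pattern is compiled once, outside the timed loop.
    my $rx    = qr/$h/;
    my $match = avg_seconds( $reps, sub { my $hit = $s =~ $rx; return } );

    return ( $compile, $match );
}

printf "compile %.6fs, match %.6fs\n",
    compile_and_match_times( 1_000, 1_000, 200 );
```

If the match time comes out far larger than the compile time for a given random pair, that would fit the "expensive to fail" explanation. The match figure still includes whatever overhead matching against a Regexp object carries, so it is an upper bound on the pure matching cost.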
So what's going on here? And what of my original question? How useful is it to put /o on a match?
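Apart from speed, /o has an effect that can be checked without any timing: the interpolated pattern is compiled the first time the match is reached and is not updated afterwards, even if the variable changes. A minimal sketch, with match_flags_plain and match_flags_once as hypothetical helper names chosen for illustration:

```perl
use strict;
use warnings;

# Hypothetical helpers: each returns one 0/1 flag per pattern in @patterns,
# telling whether $text matched while that pattern was in $p.
sub match_flags_plain {
    my ( $text, @patterns ) = @_;
    my @flags;
    for my $p (@patterns) {
        push @flags, ( $text =~ /$p/ ) ? 1 : 0;     # follows the current $p
    }
    return @flags;
}

sub match_flags_once {
    my ( $text, @patterns ) = @_;
    my @flags;
    for my $p (@patterns) {
        push @flags, ( $text =~ /$p/o ) ? 1 : 0;    # frozen at the first $p seen
    }
    return @flags;
}

print join( ',', match_flags_plain( 'bar', 'foo', 'bar' ) ), "\n";   # prints 0,1
print join( ',', match_flags_once( 'bar', 'foo', 'bar' ) ), "\n";    # prints 0,0
```

In the second call the pattern stays 'foo' for the rest of the program, so 'bar' never matches. That makes /o safe only where the interpolated variable really never changes, as with $h in the benchmark.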
Replies are listed 'Best First'.
Re: How useful is the /o regexp modifier?
by Limbic~Region (Chancellor) on Feb 03, 2009 at 18:29 UTC
by kyle (Abbot) on Feb 03, 2009 at 20:08 UTC
by JavaFan (Canon) on Feb 03, 2009 at 22:01 UTC
by hbm (Hermit) on Feb 03, 2009 at 22:31 UTC
Re: How useful is the /o regexp modifier?
by moritz (Cardinal) on Feb 03, 2009 at 20:27 UTC
by kyle (Abbot) on Feb 04, 2009 at 04:03 UTC
by jethro (Monsignor) on Feb 04, 2009 at 14:19 UTC
Re: How useful is the /o regexp modifier?
by jethro (Monsignor) on Feb 03, 2009 at 20:11 UTC