in thread First Pattern Matching

I'm glad you brought up this point. I too am a huge fan of qr; however, I think this situation is a perfect use of the /o modifier. I assumed that the snippet the author posted was but a morsel of his actual code; he probably uses dozens of patterns and thousands of lines of input. Using /o (rather than building the regex with qr) suits this situation, where a single unchanging regex is applied to huge amounts of data, because the interpolated pattern is compiled only once. In my test it gave a modest speed boost. For instance, I modified my earlier code and ran this benchmark:

    use Benchmark;

    timethese(1000, {
        Slasho => \&withslasho,
        None   => \&without,
        qr     => \&withqr,
    });

    sub withslasho {
        my $str1 = 'ABCBXBCA';
        my $str2 = 'APCBXBCAC';
        my @array = ($str1, $str2) x 500;
        my @patterns = ('B.B', 'CB') x 10;
        my $pat = join '|', @patterns;
        foreach my $string (@array) {
            if ($string =~ /($pat)/o) {
                # do a pattern lookup to see which pattern matched.
                my $matched;
                foreach my $p (@patterns) {
                    if ($1 =~ /$p/) {
                        $matched = $p;
                        last;
                    }
                }
            }
        }
    }

    sub without {
        my $str1 = 'ABCBXBCA';
        my $str2 = 'APCBXBCAC';
        my @array = ($str1, $str2) x 500;
        my @patterns = ('B.B', 'CB') x 10;
        my $pat = join '|', @patterns;
        foreach my $string (@array) {
            if ($string =~ /($pat)/) {
                # do a pattern lookup to see which pattern matched.
                my $matched;
                foreach my $p (@patterns) {
                    if ($1 =~ /$p/) {
                        $matched = $p;
                        last;
                    }
                }
            }
        }
    }

    sub withqr {
        my $str1 = 'ABCBXBCA';
        my $str2 = 'APCBXBCAC';
        my @array = ($str1, $str2) x 500;
        my @patterns = ('B.B', 'CB') x 10;
        my $pat = join '|', @patterns;
        $pat = qr/$pat/;
        foreach my $string (@array) {
            if ($string =~ /($pat)/) {
                # do a pattern lookup to see which pattern matched.
                my $matched;
                foreach my $p (@patterns) {
                    if ($1 =~ /$p/) {
                        $matched = $p;
                        last;
                    }
                }
            }
        }
    }

Which outputs:

    Benchmark: timing 1000 iterations of None, Slasho, qr...
          None: 70 wallclock secs (69.60 usr + 0.00 sys = 69.60 CPU) @ 14.37/s (n=1000)
        Slasho: 61 wallclock secs (61.24 usr + 0.00 sys = 61.24 CPU) @ 16.33/s (n=1000)
            qr: 66 wallclock secs (65.80 usr + 0.00 sys = 65.80 CPU) @ 15.20/s (n=1000)

Re: Re: Re: Re: Re: First Pattern Matching
by Anonymous Monk on Jul 12, 2002 at 02:28 UTC
    I too am a huge fan of qr;

    Ehrrmm, I see that you hardly need a lesson in how to use qr. :)

    however, I think this situation is a perfect use of the /o modifier.

    I don't, especially since it doesn't work on ActivePerl. :) But there's another reason I don't like to use it. As I've already said here, perl optimizes away recompilation in simple cases like this, and as your benchmark shows, there's not much difference with or without the o modifier. So in this case I'd prefer not to use it, and the reason is that I'm scared of myself. One day I might put the code in a subroutine. Another day I might change it to take arguments that I use in the regex. If I forget about the o (and I most probably will), the regex will silently keep using the pattern from the first call, and it might take me a while to discover the bug. If I use qr for this, or even nothing fancy at all, I won't get bitten.
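A minimal sketch of the trap described above, assuming the match ends up inside a named subroutine that takes the pattern as an argument. The helper names `count_with_o` and `count_with_qr` are hypothetical and do not come from the thread; the test strings are the ones used in the benchmark.

```perl
use strict;
use warnings;

# Hypothetical helpers (not from the thread): each counts how many
# strings match a pattern supplied by the caller.

sub count_with_o {
    my ($pat, @strings) = @_;
    my $n = 0;
    for my $s (@strings) {
        # /o: $pat is interpolated and compiled the first time this
        # match is executed, then never looked at again.
        $n++ if $s =~ /$pat/o;
    }
    return $n;
}

sub count_with_qr {
    my ($pat, @strings) = @_;
    my $re = qr/$pat/;    # compiled afresh on every call
    my $n  = 0;
    for my $s (@strings) {
        $n++ if $s =~ $re;
    }
    return $n;
}

my @data = ('ABCBXBCA', 'APCBXBCAC');

my @with_o  = (count_with_o('B.B', @data),  count_with_o('XYZ', @data));
my @with_qr = (count_with_qr('B.B', @data), count_with_qr('XYZ', @data));

print "/o: @with_o\n";     # prints "/o: 2 2"  (second call still uses B.B)
print "qr: @with_qr\n";    # prints "qr: 2 0"
```

The second call to `count_with_o` reports two matches for a pattern that cannot match either string, and nothing warns about it. The qr version pays for one compilation per call but always honours its argument.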

    I code with a lot of personal style and hints, and for me the o says "this is a dynamically set constant, and is supposed to be that way, so don't bother. It should never change. NEVER!". Often I don't mean that, and hence I use qr instead. I'm scared of my own benchmark results, though (running your code):
    Benchmark: timing 1000 iterations of None, Slasho, qr...
          None: 30 wallclock secs (30.08 usr + 0.14 sys = 30.22 CPU) @ 33.09/s
        Slasho: 29 wallclock secs (28.74 usr + 0.02 sys = 28.76 CPU) @ 34.77/s
            qr: 31 wallclock secs (30.49 usr + 0.00 sys = 30.49 CPU) @ 32.80/s
    For me qr is even worse. This does indeed surprise me.
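One possible contributor, offered as a guess rather than a diagnosis: in the benchmarked code the qr object is interpolated into a larger pattern, `/($pat)/`, so perl still has to deal with a combined pattern instead of using the precompiled object directly. The sketch below compares that form with one where the capture group is compiled into the qr object and the object is used as the whole pattern. The sub names and the iteration count are assumptions for illustration, setup is hoisted out of the timed subs, and no timings are claimed here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Same data as the thread's benchmark, built once outside the timed code.
my @strings  = ('ABCBXBCA', 'APCBXBCAC') x 500;
my @patterns = ('B.B', 'CB') x 10;
my $pat      = join '|', @patterns;

my $inner = qr/$pat/;      # as in the thread: wrapped in /(...)/ at match time
my $whole = qr/($pat)/;    # capture group compiled into the qr object itself

# Hypothetical helper: qr object interpolated into a larger pattern.
sub count_interpolated {
    my $n = 0;
    for my $s (@strings) {
        $n++ if $s =~ /($inner)/;
    }
    return $n;
}

# Hypothetical helper: qr object used directly as the whole pattern.
sub count_direct {
    my $n = 0;
    for my $s (@strings) {
        $n++ if $s =~ $whole;
    }
    return $n;
}

# Prints a rate table comparing the two; results depend on perl version
# and platform.
cmpthese(500, {
    interpolated => \&count_interpolated,
    direct       => \&count_direct,
});
```

Both subs match every string, so any difference in the table comes from how the pattern is handed to the regex engine, not from the matching work itself.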

    Cheers,
    -Anomo