I'm not convinced that so-called "compiled regular expressions" are significantly faster. Here's a benchmark that matches IP addresses, once using the 'variable' approach and once using compiled regular expressions:
#!/usr/bin/perl

use strict;
use warnings 'all';

use Benchmark;

use vars qw /$ip_v $ip_re @data/;

my $quad_v  = q  '(?:25[0-5]|2[0-4]\d|1\d\d|\d\d?)';
my $quad_re = qr '(?:25[0-5]|2[0-4]\d|1\d\d|\d\d?)';
my $sep_v   = q  '\.';
my $sep_re  = qr '\.';

$ip_v  = qq "$quad_v$sep_v$quad_v$sep_v$quad_v$sep_v$quad_v";
$ip_re = qr "$quad_re$sep_re$quad_re$sep_re$quad_re$sep_re$quad_re";

@data = map {join "." => map {int rand 1000} 1 .. 4} 1 .. 1_000;

timethese -5 => {
    var => 'for (@data) {/$ip_v/}',
    re  => 'for (@data) {/$ip_re/}',
};

__END__
Running this results in:
Benchmark: running re, var for at least 5 CPU seconds...
        re:  5 wallclock secs ( 5.25 usr +  0.00 sys =  5.25 CPU) @ 46.10/s (n=242)
       var:  5 wallclock secs ( 5.16 usr +  0.00 sys =  5.16 CPU) @ 45.93/s (n=237)
Not what I call a significant win for compiled regular expressions. Perhaps you have examples where the gain is large - I've yet to encounter them.

Abigail

Re: Re: RegEx Perl Code
by Anonymous Monk on Jul 25, 2002 at 02:21 UTC
    Perhaps you have examples where the gain is large - I've yet to encounter them.

    jryan's benchmark code mentioned above can be tweaked with qr for a significant speed optimization. In the &without subroutine, change the line
        my @patterns = ('B.B', 'CB')x10;
    to
        my @patterns = map qr/$_/, ('B.B', 'CB')x10;

    Your benchmark doesn't show any support for compiled patterns, and that's due to an optimization I've already described here.
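
A minimal sketch of the situation where qr is usually expected to help: the pattern variable changes on every inner iteration, so a plain string may have to be recompiled each time, while a qr object can be reused as is. The helper name count_matches and the sample data are made up for illustration and are not taken from jryan's code; actual timings will depend on the Perl version and the patterns.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical helper (not from the thread): count matching
# (string, pattern) pairs. The pattern changes on every inner iteration.
sub count_matches {
    my ($data, $patterns) = @_;
    my $hits = 0;
    for my $str (@$data) {
        for my $pat (@$patterns) {
            $hits++ if $str =~ /$pat/;
        }
    }
    return $hits;
}

my @data     = ('ABBA', 'BCB', 'CBX', 'none');   # made-up sample data
my @strings  = ('B.B', 'CB') x 10;               # plain strings
my @compiled = map { qr/$_/ } @strings;          # precompiled with qr

# Both forms must agree on the result; only the cost may differ.
print "same result\n"
    if count_matches(\@data, \@strings) == count_matches(\@data, \@compiled);

# Pass "bench" on the command line to time the two variants.
if (@ARGV && $ARGV[0] eq 'bench') {
    cmpthese(-2, {
        plain    => sub { count_matches(\@data, \@strings) },
        compiled => sub { count_matches(\@data, \@compiled) },
    });
}
```

Run it with the argument bench to compare the two variants on your own machine; no particular ratio is claimed here.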

    Cheers,
    -Anomo
Re: Re: RegEx Perl Code
by jryan (Vicar) on Jul 23, 2002 at 23:14 UTC

    I ran a few benchmarks here on a "contrived" large regex and large dataset; although the speed increase through qr// isn't what you'd call large, it is at least significant. It should be noted, though, that the so-called "evil" /o was the winner in this case.

    At any rate, that's not to say qr// is bad. Personally, I think it's great because it encourages modularity and readability in regular expressions.

      When I run the benchmark you are referring to, I get a slight win for "None" over "qr". Your results showed a difference of about 10% - not what I call significant.

      I never said qr// is bad. I just argued against blindly using qr// for no other reason than "it's a good thing to do". About the only benefit I've seen from qr// is that it interpolates like regular expressions, and not like double-quoted strings. But the interpolation is only slightly different.

      Abigail
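
To make the interpolation point concrete, here is a small sketch of two of the differences between a qr// fragment and a double-quoted string fragment. The helper name matches and the sample texts are made up for illustration; this is not an exhaustive list of the differences.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: 1 if $text matches $pat, else 0.
sub matches {
    my ($text, $pat) = @_;
    return $text =~ /$pat/ ? 1 : 0;
}

# In a double-quoted string "\b" is a backspace character;
# in qr// it is a word-boundary assertion.
my $str_pat = "\bcat\b";
my $re_pat  = qr/\bcat\b/;

# A qr// object also keeps its own flags when interpolated
# into a larger pattern; the outer pattern stays case-sensitive.
my $word   = qr/perl/i;
my $phrase = qr/^$word monks$/;

printf "string pattern: %d\n", matches('a cat sat',  $str_pat);
printf "qr pattern:     %d\n", matches('a cat sat',  $re_pat);
printf "flags kept:     %d\n", matches('PERL monks', $phrase);
printf "outer case:     %d\n", matches('PERL MONKS', $phrase);
```

The string version looks for literal backspace characters around "cat", so it does not match ordinary text, while the qr version does. Whether that counts as more than a slight difference is a matter of taste.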