Derek2 has asked for the wisdom of the Perl Monks concerning the following question:
Running this with (foob|) my results are:#!/usr/bin/perl $string = "foofoo catbar"; for ($i = 0; $i<5000000; $i++) { if ($string =~ m/(foob|)foofoo/) { } # if ($string =~ m/(foob)?foofoo/) { } }
12.180u 0.010s 0:12.19 100.0%Running this with (foob)? my results are:
13.090u 0.010s 0:13.09 100.0%Can someone explain technically why (foob|) gives better
edited: Sat Sep 28 02:28:25 2002 by jeffa - code tags, formatting
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re: Difference between (foo|) and (foo)?
by tadman (Prior) on Sep 28, 2002 at 05:31 UTC | |
With that in mind, the results are different: The variance changes, but the question mark wins every time. | [reply] [d/l] [select] |
by traveler (Parson) on Sep 28, 2002 at 18:28 UTC | |
This shows the ro doing better than even the qs. I suppose that (foob|) vs (|foob) depends on the data(?). --traveler | [reply] [d/l] [select] |
by theorbtwo (Prior) on Sep 29, 2002 at 01:00 UTC | |
(|foob) is the same as (foob)??; (foob|) would be (foob)?, so comparing them isn't really fair. Warning: Unless otherwise stated, code is untested. Do not use without understanding. Code is posted in the hopes it is useful, but without warranty. All copyrights are relinquished into the public domain unless otherwise stated. I am not an angel. I am capable of error, and err on a fairly regular basis. If I made a mistake, please let me know (such as by replying to this node). | [reply] [d/l] [select] |
by Anonymous Monk on Sep 30, 2002 at 19:07 UTC | |
I altered the test to benchmark each string seperately:
If we benchmark the strings seperately, we get the following:
ddouville@linuxdld:~> ./test.pl
Benchmark: timing 1500000 iterations of foo_or_0, foo_or_1, foo_or_2, foo_or_3, foo_or_4, foo_qs_0, foo_qs_1, foo_qs_2, foo_qs_3, foo_qs_4...
foo_or_0: 2 wallclock secs ( 2.21 usr + 0.00 sys = 2.21 CPU) @ 678733.03/s (n=1500000)
foo_or_1: 3 wallclock secs ( 2.13 usr + 0.00 sys = 2.13 CPU) @ 704225.35/s (n=1500000)
foo_or_2: 0 wallclock secs ( 0.72 usr + 0.00 sys = 0.72 CPU) @ 2083333.33/s (n=1500000)
foo_or_3: -1 wallclock secs ( 0.50 usr + -0.01 sys = 0.49 CPU) @ 3061224.49/s (n=1500000)
foo_or_4: 1 wallclock secs ( 1.26 usr + 0.00 sys = 1.26 CPU) @ 1190476.19/s (n=1500000)
foo_qs_0: 1 wallclock secs ( 2.07 usr + 0.00 sys = 2.07 CPU) @ 724637.68/s (n=1500000)
foo_qs_1: 2 wallclock secs ( 2.02 usr + 0.00 sys = 2.02 CPU) @ 742574.26/s (n=1500000)
foo_qs_2: 0 wallclock secs ( 0.66 usr + 0.00 sys = 0.66 CPU) @ 2272727.27/s (n=1500000)
foo_qs_3: 2 wallclock secs ( 0.49 usr + 0.00 sys = 0.49 CPU) @ 3061224.49/s (n=1500000)
foo_qs_4: 2 wallclock secs ( 1.04 usr + 0.00 sys = 1.04 CPU) @ 1442307.69/s (n=1500000)
Rate foo_or_0 foo_or_1 foo_qs_0 foo_qs_1 foo_or_4 foo_qs_4 foo_or_2 foo_qs_2 foo_qs_3 foo_or_3
foo_or_0 678733/s -- -4% -6% -9% -43% -53% -67% -70% -78% -78%
foo_or_1 704225/s 4% -- -3% -5% -41% -51% -66% -69% -77% -77%
foo_qs_0 724638/s 7% 3% -- -2% -39% -50% -65% -68% -76% -76%
foo_qs_1 742574/s 9% 5% 2% -- -38% -49% -64% -67% -76% -76%
foo_or_4 1190476/s 75% 69% 64% 60% -- -17% -43% -48% -61% -61%
foo_qs_4 1442308/s 113% 105% 99% 94% 21% -- -31% -37% -53% -53%
foo_or_2 2083333/s 207% 196% 188% 181% 75% 44% -- -8% -32% -32%
foo_qs_2 2272727/s 235% 223% 214% 206% 91% 58% 9% -- -26% -26%
foo_qs_3 3061224/s 351% 335% 322% 312% 157% 112% 47% 35% -- -0%
foo_or_3 3061224/s 351% 335% 322% 312% 157% 112% 47% 35% 0% --
Interesting results. benchmark reports the opposite for "foofoo catbar" (my original string) than UNIX 'time' command did. Am I reading the results correctly? I compared foo_or_N to foo_qs_N and this is what I have: "foofoo catbar": QS wins by -6% "foofoofoo catbar": QS wins by -5% "foo foo cat bar": QS wins by -8% "foo flew over the": OR and QS tie 0% "cufoofoo nest": QS wins by -17% That leaves a question: is there a pattern situation where OR can win? | [reply] |
|
Re: Difference between (foo|) and (foo)?
by BrowserUk (Patriarch) on Sep 28, 2002 at 06:52 UTC | |
This intigued me and I thought maybe the 7.5% difference was in the time it took to parse and/or compile the differences in the regexes. So I thought I'd benchmark the test with them pre-compiled. The results are very intriguing. Not only does the difference between the two compiled versions remain pretty much the same, if anything getting slightly bigger. The pre-compiled versions actually run substantially more slowly than their none pre-compiled counterparts? This is most extreme in the case of the (foob|) version running close to 40% faster than its precompiled counterpart. I'd like to see the explanation behind them onions? Probably my test methodology at fault, but I can't see it. It took that a stage further and applied study to the searched string. This resulted in a speed-up of the slowest (precompiled (foob)?) and the fastest (the non-precompiled (foob|)), but consistantly slowed the other two varients down. Intriguing indeed. The test code and results are below Read more... (4 kB) | [reply] [d/l] |
by Anonymous Monk on Sep 30, 2002 at 19:53 UTC | |
Here's the code:
#!/usr/bin/perl
no warnings;
use strict;
use Benchmark qw(cmpthese);
$::string = "foofoo catbar";
$::re_foobOrNowt = qr/(foob|)foofoo/o;
$::re_foob0or1 = qr/(foob)?foofoo/o;
$::foobOrNowt = qr/(foob|)foofoo/;
$::foob0or1 = qr/(foob)?foofoo/;
#study $::string; print 'After studying the searched string'.$/;
cmpthese( 1000000, {
foobOrNowt => 'if ($string =~ $::foobOrNowt) { };',
foob0or1 => 'if ($string =~ $::foob0or1) { };',
c_foobOrNowt=> 'if ($string =~ $::re_foobOrNowt) { };',
c_foob0or1 => 'if ($string =~ $::re_foob0or1 ) { };',
});
Here's the results:
ddouville@linuxdld:~> ./test2.pl
Benchmark: timing 1000000 iterations of c_foob0or1, c_foobOrNowt, foob0or1, foobOrNowt...
c_foob0or1: 3 wallclock secs ( 1.92 usr + 0.00 sys = 1.92 CPU) @ 520833.33/s (n=1000000)
c_foobOrNowt: 2 wallclock secs ( 1.73 usr + 0.00 sys = 1.73 CPU) @ 578034.68/s (n=1000000)
foob0or1: 1 wallclock secs ( 1.96 usr + 0.00 sys = 1.96 CPU) @ 510204.08/s (n=1000000)
foobOrNowt: 2 wallclock secs ( 1.99 usr + 0.02 sys = 2.01 CPU) @ 497512.44/s (n=1000000)
Rate foobOrNowt foob0or1 c_foob0or1 c_foobOrNowt
foobOrNowt 497512/s -- -2% -4% -14%
foob0or1 510204/s 3% -- -2% -12%
c_foob0or1 520833/s 5% 2% -- -10%
c_foobOrNowt 578035/s 16% 13% 11% --
Code:
#!/usr/bin/perl
no warnings;
use strict;
use Benchmark qw(cmpthese);
$::string = "foofoo catbar";
$::re_foobOrNowt = qr/(foob|)foofoo/o;
$::re_foob0or1 = qr/(foob)?foofoo/o;
$::foobOrNowt = qr/(foob|)foofoo/;
$::foob0or1 = qr/(foob)?foofoo/;
#study $::string; print 'After studying the searched string'.$/;
cmpthese( 1000, {
foobOrNowt => 'for (1..10000) { if ($string =~ $::foobOrNowt) { };}',
foob0or1 => 'for (1..10000) { if ($string =~ $::foob0or1) { };}',
c_foobOrNowt=> 'for (1..10000) { if ($string =~ $::re_foobOrNowt) { };}',
c_foob0or1 => 'for (1..10000) { if ($string =~ $::re_foob0or1 ) { };}',
});
Results
ddouville@linuxdld:~> ./test2.pl
Benchmark: timing 1000 iterations of c_foob0or1, c_foobOrNowt, foob0or1, foobOrNowt...
c_foob0or1: 16 wallclock secs (15.97 usr + 0.00 sys = 15.97 CPU) @ 62.62/s (n=1000)
c_foobOrNowt: 16 wallclock secs (16.25 usr + 0.00 sys = 16.25 CPU) @ 61.54/s (n=1000)
foob0or1: 17 wallclock secs (16.34 usr + 0.00 sys = 16.34 CPU) @ 61.20/s (n=1000)
foobOrNowt: 17 wallclock secs (17.32 usr + 0.00 sys = 17.32 CPU) @ 57.74/s (n=1000)
Rate foobOrNowt foob0or1 c_foobOrNowt c_foob0or1
foobOrNowt 57.7/s -- -6% -6% -8%
foob0or1 61.2/s 6% -- -1% -2%
c_foobOrNowt 61.5/s 7% 1% -- -2%
c_foob0or1 62.6/s 8% 2% 2% --
My tests show a performance increase in the compiled versions.
Finally, here's your initial test (unaltered), run on my own machine for base-line comparison: ddouville@linuxdld:~> ./test2.pl Benchmark: timing 1000000 iterations of c_foob0or1, c_foobOrNowt, foob0or1, foobOrNowt... c_foob0or1: 2 wallclock secs ( 1.92 usr + 0.00 sys = 1.92 CPU) @ 520833.33/s (n=1000000) c_foobOrNowt: 1 wallclock secs ( 1.78 usr + 0.00 sys = 1.78 CPU) @ 561797.75/s (n=1000000) foob0or1: 0 wallclock secs ( 1.33 usr + 0.00 sys = 1.33 CPU) @ 751879.70/s (n=1000000) foobOrNowt: 1 wallclock secs ( 1.34 usr + 0.00 sys = 1.34 CPU) @ 746268.66/s (n=1000000) Rate c_foob0or1 c_foobOrNowt foobOrNowt foob0or1 c_foob0or1 520833/s -- -7% -30% -31% c_foobOrNowt 561798/s 8% -- -25% -25% foobOrNowt 746269/s 43% 33% -- -1% foob0or1 751880/s 44% 34% 1% -- These test results agree with your test results, supporting that your results are correct for the test you performed. | [reply] |
by Anonymous Monk on Sep 30, 2002 at 19:57 UTC | |
Sorry, forgot to format that.
Finally, here's your initial test (unaltered), run on my own machine for base-line comparison:
ddouville@linuxdld:~> ./test2.pl
Benchmark: timing 1000000 iterations of c_foob0or1, c_foobOrNowt, foob0or1, foobOrNowt...
c_foob0or1: 2 wallclock secs ( 1.92 usr + 0.00 sys = 1.92 CPU) @ 520833.33/s (n=1000000)
c_foobOrNowt: 1 wallclock secs ( 1.78 usr + 0.00 sys = 1.78 CPU) @ 561797.75/s (n=1000000)
foob0or1: 0 wallclock secs ( 1.33 usr + 0.00 sys = 1.33 CPU) @ 751879.70/s (n=1000000)
foobOrNowt: 1 wallclock secs ( 1.34 usr + 0.00 sys = 1.34 CPU) @ 746268.66/s (n=1000000)
Rate c_foob0or1 c_foobOrNowt foobOrNowt foob0or1
c_foob0or1 520833/s -- -7% -30% -31%
c_foobOrNowt 561798/s 8% -- -25% -25%
foobOrNowt 746269/s 43% 33% -- -1%
foob0or1 751880/s 44% 34% 1% --
These test results agree with your test results, supporting that your results are correct for the test you performed.
| [reply] |