in reply to Why does global match run faster than none global?

Intriguing. I can confirm your findings (5.10.1):

    $str = '123456789';
    cmpthese -1, {
        a=>q[ my ($a,$b) = $str =~ m/(23)[^8]+(8)/g; ],
        b=>q[ my ($a,$b) = $str =~ m/(23)[^8]+(8)/; ],
        c=>q[ my ($a) = $str =~ m/(23)/g ],
        d=>q[ my ($a) = $str =~ m/(23)/; ],
    };;

          Rate    b    a    d    c
    b 388799/s   -- -14% -37% -52%
    a 449698/s  16%   -- -27% -44%
    d 612922/s  58%  36%   -- -24%
    c 806251/s 107%  79%  32%   --

I can't even begin to guess why it would be so. 16% and 32% is hardly noise.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: Why does global match run faster than none global?
by ikegami (Patriarch) on Aug 23, 2011 at 20:22 UTC

    Cause you're using an old(ish) version of Perl?

               Rate    b    a    c    d
    b 3778652/s   --  -1% -19% -21%
    a 3817631/s   1%   -- -18% -20%
    c 4677165/s  24%  23%   --  -2%
    d 4766254/s  26%  25%   2%   --

    This is perl 5, version 14, subversion 0 (v5.14.0) built for i686-linux-thread-multi

      Even more intriguing. The slowest has hardly changed, but the previously faster ones have slowed markedly.

      C:\test\perl-5.14.0-RC1>perl
      use Benchmark qw[ cmpthese ];;
      print $];;
      $str = '123456789';
      cmpthese -1, {
          a=>q[ my ($a,$b) = $str =~ m/(23)[^8]+(8)/g; ],
          b=>q[ my ($a,$b) = $str =~ m/(23)[^8]+(8)/; ],
          c=>q[ my ($a) = $str =~ m/(23)/g ],
          d=>q[ my ($a) = $str =~ m/(23)/; ],
      };;
      ^Z
      5.014000
            Rate    b    a    d    c
      b 363518/s   -- -17% -35% -46%
      a 435446/s  20%   -- -22% -35%
      d 555991/s  53%  28%   -- -17%
      c 668598/s  84%  54%  20%   --

      But still, 20% is not to be sneezed at.



      I'm running Strawberry v5.12.3 built for MSWin32-x86-multi-thread. What confuses me is that, by its nature, /g seems like it should run longer.

        Why would it take longer? All the /g version does in addition to the non-/g version is check if "9" is "23". Actually, it doesn't even get that far. It knows a minimum of 4 chars is needed for another match, yet there's only one char left.
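A minimal sketch of that point, using the same string and pattern as the benchmark: in list context both forms return the same captures, and the one match ends with a single character left over, so /g has almost nothing more to try. The variable names @with_g and @without_g are only for illustration.

```perl
use strict;
use warnings;

my $str = '123456789';

# List context: /g returns the captures of every match,
# the plain form returns the captures of the first match only.
my @with_g    = $str =~ m/(23)[^8]+(8)/g;
my @without_g = $str =~ m/(23)[^8]+(8)/;

print "with /g:    @with_g\n";
print "without /g: @without_g\n";

# $+[0] is the offset just past the end of the last successful match.
$str =~ m/(23)[^8]+(8)/;
my $left_over = length($str) - $+[0];
print "left over:  $left_over char(s)\n";
```

Here the pattern matches "2345678", so only the trailing "9" remains for any further attempt.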

      You did ensure that $str was declared with our, not my, didn't you?

Re^2: Why does global match run faster than none global?
by Eliya (Vicar) on Aug 23, 2011 at 21:22 UTC

    I can't replicate your results with the two versions I currently have installed on this machine:

    $ /usr/local/bin/perl5.10.1 921987.pl
           Rate    b    a    c    d
    b 4497569/s   --  -2% -21% -26%
    a 4591346/s   2%   -- -19% -25%
    c 5681139/s  26%  24%   --  -7%
    d 6116693/s  36%  33%   8%   --

    $ /usr/local/bin/perl5.10.1 -v
    This is perl, v5.10.1 (*) built for x86_64-linux-thread-multi

    $ /usr/local/bin/perl5.12.2 921987.pl
           Rate    a    b    c    d
    a 4314282/s   --  -8% -30% -36%
    b 4677165/s   8%   -- -24% -31%
    c 6168093/s  43%  32%   --  -9%
    d 6779346/s  57%  45%  10%   --

    $ /usr/local/bin/perl5.12.2 -v
    This is perl 5, version 12, subversion 2 (v5.12.2) built for x86_64-linux-thread-multi

    If there is any significant difference at all, it tends to be the other way around, i.e. /g is slower.

      No matter how long I run it for, it is remarkably consistent here with no more than 1 or 2% variation:

      C:\test\perl-5.14.0-RC1>perl
      use Benchmark qw[ cmpthese ];;
      print $];;
      $str = '123456789';
      cmpthese -10, {
          a=>q[ my ($a,$b) = $str =~ m/(23)[^8]+(8)/g; ],
          b=>q[ my ($a,$b) = $str =~ m/(23)[^8]+(8)/; ],
          c=>q[ my ($a) = $str =~ m/(23)/g ],
          d=>q[ my ($a) = $str =~ m/(23)/; ],
      };;
      ^Z
      5.014000
            Rate    b    a    d    c
      b 357543/s   -- -15% -33% -45%
      a 422192/s  18%   -- -21% -35%
      d 535621/s  50%  27%   -- -18%
      c 653518/s  83%  55%  22%   --

      One difference of note is that I'm using Windows rather than your Linux. Your results reflect ikegami's, who I believe was also using Linux. Perhaps the OP is on Windows?

      The 'usual suspect' for performance differences between those two is memory allocation, but there is none worthy of note here. Indeed, there appear (as you would expect) to be no calls at all into the OS during the benchmark.

      Since we're both on 64-bit Intel hardware, that doesn't seem likely as a cause. Which pretty much leaves only compiler differences, with the tentative conclusion that with the /g switch enabled, the Windows build takes a code path that causes (or allows) the MSC compiler to generate a particularly efficient piece of code somewhere.


        Which pretty much leaves only compiler differences...

        That was my conclusion, too.

        BTW, on second glance I noticed that your figures are about an order of magnitude slower in absolute terms, compared to what I and the others got — which made me wonder what hardware you were using.

        Don't get me wrong, I'm not into some childish "mine is bigger" kind of silly thing... Not at all, it's just that if we presume roughly comparable hardware, what might account for that order-of-magnitude difference?  Compiler differences, too?

        Just for comparison, the CPU I ran the test on is a 2.3 GHz AMD Phenom 9600 quad-core — which was already pretty "standard" at the time I bought the machine 3 years ago. (The quad-core should be irrelevant here, as the benchmark uses one core only anyway.)

      I think we (or at least I) may be chasing noise. I can't replicate the larger 20% number. Additionally, I get noise in both directions on multiple runs. My guess is that over enough iterations we'd see a slightly slower /g.

      use Benchmark qw[ cmpthese ];;
      my $str = '123456789';
      cmpthese -1, {
          a=>q[ my ($a,$b) = $str =~ m/(23)[^8]+(8)/g; ],
          b=>q[ my ($a,$b) = $str =~ m/(23)[^8]+(8)/; ],
          c=>q[ my ($a) = $str =~ m/(23)/g ],
          d=>q[ my ($a) = $str =~ m/(23)/; ],
      };

             Rate    a    b    c    d
      a 7047422/s   --  -2% -26% -29%
      b 7218432/s   2%   -- -25% -28%
      c 9578119/s  36%  33%   --  -4%
      d 9960542/s  41%  38%   4%   --

             Rate    a    b    c    d
      a 7143583/s   --   2% -24% -24%
      b 7005183/s  -2%   -- -25% -25%
      c 9378794/s  31%  34%   --  -0%
      d 9387510/s  31%  34%   0%   --

      You did ensure that $str was declared with our, not my, didn't you?
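Why that question matters, as a rough sketch: a code string is compiled wherever it is eval'd, so a file-scoped my variable in the calling script is not visible to it, and the name falls back to an (undefined) package variable. Runner below is a hypothetical stand-in for a module that evals code strings on the caller's behalf, which is, as far as I know, how Benchmark handles string arguments; it is not Benchmark's actual code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a module that evals code strings
# from inside its own sub, outside the caller's lexical scope.
package Runner;

sub run {
    my ($code) = @_;
    no strict;    # unknown names fall back to package variables
    return eval "package main; $code";
}

package main;

my  $lexical = '123456789';    # lexical: visible only from here to end of file
our $global  = '123456789';    # package variable: $main::global

my $saw_lexical = Runner::run(q[ defined $lexical ? 'defined' : 'undef' ]);
my $saw_global  = Runner::run(q[ defined $global  ? 'defined' : 'undef' ]);

print "my:  $saw_lexical\n";
print "our: $saw_global\n";
```

If the benchmark strings see an undefined $str, every variant is timing a failed match against an empty value, which would make the /g and non-/g figures look nearly identical.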

Re^2: Why does global match run faster than none global?
by will.ni (Initiate) on Aug 25, 2011 at 03:47 UTC
    windows 5.010000
          Rate    b    a    d    c
    b 299030/s   --  -7% -43% -53%
    a 320550/s   7%   -- -39% -49%
    d 525288/s  76%  64%   -- -17%
    c 631310/s 111%  97%  20%   --

    HP-UX 5.008008
          Rate    b    a    d    c
    b 225468/s   --  -7% -25% -30%
    a 243327/s   8%   -- -19% -25%
    d 300755/s  33%  24%   --  -7%
    c 322947/s  43%  33%   7%   --