in reply to Ovid, Long Live .*? (dot star question-mark)

Well, my name's in the title, I have to respond, right? :)

Are you sure that's 5.6? I was playing around with that when I saw your post on the beginner's list. I've gotten the following:

use Benchmark qw(cmpthese);

cmpthese(-3, {
    cc => sub { 'asdfasdfXasdfasdfXsdfasdfXasdfasdf111XYZ' =~ /(?:[^X]*X)+?YZ/ },
    ds => sub { 'asdfasdfXasdfasdfXsdfasdfXasdfasdf111XYZ' =~ /.*?XYZ/ },
});

Results:

Benchmark: running cc, ds, each for at least 3 CPU seconds...
        cc:  4 wallclock secs ( 3.12 usr +  0.00 sys =  3.12 CPU) @ 101652.97/s (n=316649)
        ds:  4 wallclock secs ( 3.12 usr +  0.00 sys =  3.12 CPU) @ 243115.24/s (n=759492)
       Rate   cc   ds
cc 101653/s   -- -58%
ds 243115/s 139%   --

And the version:

C:\>perl -v
This is perl, v5.6.0 built for MSWin32-x86-multi-thread
(with 1 registered patch, see perl -V for more detail)
Copyright 1987-2000, Larry Wall
Binary build 620 provided by ActiveState Tool Corp. http://www.ActiveState.com
Built 18:31:05 Oct 31 2000

Update: I upgraded to 5.6.1, build 629 and am getting similar results:

Benchmark: running cc, ds, each for at least 3 CPU seconds...
        cc:  3 wallclock secs ( 3.19 usr +  0.01 sys =  3.20 CPU) @ 116044.62/s (n=371923)
        ds:  3 wallclock secs ( 3.13 usr +  0.00 sys =  3.13 CPU) @ 250170.39/s (n=784034)
       Rate   cc   ds
cc 116045/s   -- -54%
ds 250170/s 116%   --

Just to be safe, I used your exact cmpthese code above:

Benchmark: running cc, ds, each for at least 3 CPU seconds...
        cc:  2 wallclock secs ( 3.08 usr +  0.01 sys =  3.09 CPU) @ 319288.07/s (n=987558)
        ds:  3 wallclock secs ( 3.19 usr + -0.01 sys =  3.18 CPU) @ 292251.18/s (n=930820)
       Rate   ds   cc
ds 292251/s   --  -8%
cc 319288/s   9%   --

After running it several times to get consistent results, I see that the optimization still seems to be dependent on the text one is matching against. Am I just missing something obvious here? Do the 'non-capturing' parens throw it off? My original code comes from your reply on the beginner's list.

Cheers,
Ovid

P.S.: if you're one of the three Perlmonks who doesn't "get" the reference in japhy's title, see Death to Dot Star!

Vote for paco!

Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

Re: (Ovid) Re: Ovid, Long Live .*? (dot star question-mark)
by japhy (Canon) on Sep 06, 2001 at 20:53 UTC
    Ok. Allow me to explain how the optimization works. If a quantifier is followed by an exact character, then the quantifier knows how far ahead it can safely jump, so it doesn't have to match character-by-character.

    If the quantifier is inside capturing parens, this optimization does not kick in, because the node after the quantifier is not an exact character, but the node signalling the close of the paren.

    My optimization tells perl to look beyond a closing paren. Technically, it should also look past ANY closing parens (not just one). And perhaps code evaluations as well...
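
    To see the effect on your own build, here is a minimal sketch that times the same match with the literal directly after the quantifier, with a closing paren in between, and with the character-class form from earlier in the thread. The names %pattern, matched_text and run_benchmark are made up for illustration, and the numbers will depend on the perl version and build, so none are shown here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Subject string taken from the benchmark earlier in this thread.
my $text = 'asdfasdfXasdfasdfXsdfasdfXasdfasdf111XYZ';

# Three patterns that match the same span of $text but differ in what
# follows the quantifier once the regex is compiled.
# (The hash name and keys are illustrative, not from the original posts.)
my %pattern = (
    bare      => qr/.*?XYZ/,          # quantifier followed directly by the literal X
    capturing => qr/(.*?)XYZ/,        # a closing-paren node sits in between
    grouped   => qr/(?:[^X]*X)+?YZ/,  # character-class version from the thread
);

# Return the substring a pattern matched, or undef (in scalar context)
# if it did not match.
sub matched_text {
    my ($re, $str) = @_;
    return unless $str =~ $re;
    return substr($str, $-[0], $+[0] - $-[0]);
}

# Time each pattern for about one CPU second and print a comparison table.
sub run_benchmark {
    my %timed;
    for my $name (keys %pattern) {
        my $re = $pattern{$name};
        $timed{$name} = sub { $text =~ $re };
    }
    cmpthese(-1, \%timed);
}

run_benchmark();
```

    All three patterns match the whole subject string, so any difference in the table comes from how the engine gets there, not from what it matches.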

    The optimization does depend on the frequency of that "jump ahead" character.
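
    A rough way to check that dependence, assuming nothing beyond the stock Benchmark module: match the same pattern against two subjects of equal length, one where 'X' is rare and one where it shows up every other character. make_subject and the variable names are hypothetical helpers for this sketch, and results will differ between perl versions.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Build a subject of $total + 2 characters: runs of 'a' with an 'X' after
# every $gap of them, ending in the only 'XYZ'.
# Assumes ($gap + 1) divides $total evenly.
sub make_subject {
    my ($gap, $total) = @_;
    my $chunk = ('a' x $gap) . 'X';
    my $body  = $chunk x ($total / ($gap + 1));
    return $body . 'YZ';
}

my $sparse = make_subject(99, 1000);  # 'X' appears 10 times
my $dense  = make_subject(1,  1000);  # 'X' appears 500 times

my $re = qr/.*?XYZ/;

# Time the same pattern against both subjects for about one CPU second each.
sub run_benchmark {
    cmpthese(-1, {
        sparse => sub { $sparse =~ $re },
        dense  => sub { $dense  =~ $re },
    });
}

run_benchmark();
```

    Both subjects contain 'XYZ' only at the very end, so each match has to cover the full string; only the number of false-start 'X' characters along the way differs.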

    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;