in reply to About Greedy and Non-Greedy Regular Expressions

Whew! That's hard to read. But -- hoping I've read it correctly,
#!C:/perl/bin -w
use strict;
# regextiming.pl
use vars qw( $string $ms $ms2 );

$string = "fred and barney went bowling last night";

{
    $ms = Win32::GetTickCount();
    if ( $string =~ /fred.+barney/ ) {
    } else {
        print "No match\n\n";
    }
    print "This match used ", Win32::GetTickCount() - $ms, " ticks on a w2k box\n\n";
}

# alt --OP expects this to be quicker
{
    $ms2 = Win32::GetTickCount();
    if ( $string =~ /fred.+?barney/ ) {
    } else {
        print "No match via second regex\n\n";
        exit();
    }
    print "The non-greedy match used ", Win32::GetTickCount() - $ms2, " ticks on a w2k box.\n";
}

OUTPUT:
F:\_wo\pl_test>perl regextiming.pl
This match used 0 ticks on a w2k box

The non-greedy match used 0 ticks on a w2k box

Update: fixed spelling, grammar.

Prior discussion found with a casual search suggests that the identical times, no better than one millisecond, are or were a limitation of w32. Opinion was divided as to whether Time::HiRes could do better.

So...

#!C:/perl/bin -w
use strict;
use Time::HiRes 'time';
# regextiming2.pl
use vars qw( $string $start $start2 $end $end2 );

$string = "fred and barney went bowling last night";

{
    $start = time();
    if ( $string =~ /fred.+barney/ ) {
    } else {
        print "No match\n\n";
    }
    $end = time();
    print "This match used ", $end - $start, " as measured by Time::HiRes on a w2k box\n\n";
}

{
    $start2 = time();
    if ( $string =~ /fred.+barney/ ) {
    } else {
        print "No match\n\n";
    }
    $end2 = time();
    print "The second match used ", $end2 - $start2, " as measured by Time::HiRes on a w2k box\n\n";
}

OUTPUT2:

F:\_wo\pl_test>perl regextiming2.pl
This match used 1.00135803222656e-005 as measured by Time::HiRes on a w2k box

The second match used 6.91413879394531e-006 as measured by Time::HiRes on a w2k box

QED ... I hope.
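
Since a single match on a string this short finishes in far less than one tick, one way around the timer's resolution is to time many matches and divide. What follows is only a sketch of that idea, not a rigorous benchmark: the helper name mean_match_time and the iteration count are my own assumptions, and the loop overhead is included in the figure it reports.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);

my $string     = "fred and barney went bowling last night";
my $iterations = 100_000;    # assumed count; raise it if the totals are still tiny

# Hypothetical helper: run the match $count times and return the
# mean wall-clock seconds per match (loop overhead included).
sub mean_match_time {
    my ($re, $count) = @_;
    my $matched = 0;
    my $start   = time();
    for (1 .. $count) {
        $matched++ if $string =~ $re;
    }
    my $elapsed = time() - $start;
    die "No match for $re\n" unless $matched == $count;
    return $elapsed / $count;
}

my %patterns = (
    'greedy'     => qr/fred.+barney/,
    'non-greedy' => qr/fred.+?barney/,
);

for my $name (sort keys %patterns) {
    printf "%-10s %.3e s per match\n",
        $name, mean_match_time($patterns{$name}, $iterations);
}
```

On a string this short, any difference between the two patterns may well be lost in the noise, so the numbers should be read as a rough indication only.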

Re^2: About Greedy and Non-Greedy Regular Expressions
by RMGir (Prior) on Apr 25, 2007 at 14:36 UTC
    A test with longer strings shows that the non-greedy regex can be about 25% faster (with perl 5.8.7). My test case regex is designed to match starting near the beginning of the string and ending in the middle.
    $ perl -MBenchmark=cmpthese -ne'chomp; $allwords.="/$_"; END{ cmpthese(-3, { "Greedy"=>sub { $allwords=~m{/abbreviate.+/initial/}; }, "Nongreedy"=>sub { $allwords=~m{/abbreviate.+?/initial/}; } } ); }' /usr/dict/words
    The result is:
               Rate    Greedy Nongreedy
    Greedy    439/s        --      -22%
    Nongreedy 563/s       28%        --
    
    Of course, to repeat this test you'll need /usr/dict/words, and it has to contain abbreviate and initial.
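
For anyone without a suitable word list, the same shape of test can be approximated with generated data. This is a sketch under assumptions, not the test run above: the filler words are made up, the word counts are arbitrary, and the rates it prints will not be comparable to the table.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumption: generated filler words stand in for /usr/dict/words.
my @before = map { "filler$_" } 1 .. 10_000;
my @after  = map { "filler$_" } 10_001 .. 20_000;

# Same layout the one-liner builds: every word is prefixed with "/",
# with "abbreviate" at the start and "initial" in the middle.
my $allwords = join '', map { "/$_" } 'abbreviate', @before, 'initial', @after;

# -1 means "run each sub for at least 1 CPU second".
cmpthese(-1, {
    Greedy    => sub { $allwords =~ m{/abbreviate.+/initial/} },
    Nongreedy => sub { $allwords =~ m{/abbreviate.+?/initial/} },
});
```

The words placed after "initial" are what separate the two patterns: the greedy `.+` first consumes the rest of the string and then has to back up to "/initial/", while the non-greedy `.+?` extends forward from "/abbreviate". How much that matters will depend on the perl version and on where the end marker sits.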

    These results are with perl 5.8.7 - I'd be curious to see what happens with 5.8.10, since RE improvements could happen any time.


    Mike