Re: About Greedy and Non-Greedy Regular Expressions

Whew! That's hard to read. But -- hoping I've read it correctly,

#!C:/perl/bin -w
use strict;

# regextiming.pl

use vars qw( $string $ms $ms2 );

$string = "fred and barney went bowling last night";

{
$ms = Win32::GetTickCount();

   if ( $string =~ /fred.+barney/) {
   } else {
      print "No match\n\n";
   }
print "This match used ",Win32::GetTickCount() - $ms, " ticks on a w2k box\n\n";
}

# alt --OP expects this to be quicker

{
$ms2 = Win32::GetTickCount();

   if ( $string =~ /fred.+?barney/) {
   } else {
      print "No match via second regex\n\n";
      exit();
   }
print "The non-greedy match used ",Win32::GetTickCount() - $ms2, " ticks on a w2k box.\n";
}

OUTPUT:
F:\_wo\pl_test>perl regextiming.pl
This match used 0 ticks on a w2k box

The non-greedy match used 0 ticks on a w2k box

Update: fixed spelling, grammar.

Prior discussion found with a casual search suggests that the identical times, to no better than one millisecond, are or were a limitation of Win32. Opinion was divided as to whether Time::HiRes could do better.

So...

#!C:/perl/bin -w
use strict;
use Time::HiRes 'time';

# regextiming2.pl

use vars qw( $string $start $start2 $end $end2 );

$string = "fred and barney went bowling last night";

{
$start = time();

   if ( $string =~ /fred.+barney/) {
   } else {
      print "No match\n\n";
   }
$end = time(); 
print "This match used ",$end - $start, " as measured by Time::HiRes on a w2k box\n\n";
}


{
$start2 = time();

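   # NOTE: this is the same greedy pattern as in the first block;
   # /fred.+?barney/ would be the non-greedy form.
   if ( $string =~ /fred.+barney/) {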
   } else {
      print "No match\n\n";
   }
$end2 = time(); 
print "The second match used ",$end2 - $start2, " as measured by Time::HiRes on a w2k box\n\n";
}

OUTPUT2:

F:\_wo\pl_test>perl regextiming2.pl
This match used 1.00135803222656e-005 as measured by Time::HiRes on a w2k box

The second match used 6.91413879394531e-006 as measured by Time::HiRes on a w2k box

QED ... I hope.
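
A single match on a string this short finishes in a few microseconds, which is below the tick resolution and close to the noise floor of Time::HiRes, so a one-shot difference says little. A minimal sketch of one way around that: repeat each match many times and time the whole loop. The helper name time_loop and the repetition count are my own assumptions, not part of the scripts above, and the loop overhead is included in both totals.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

my $string = "fred and barney went bowling last night";

# Hypothetical helper: call $code $count times, return elapsed wall-clock seconds.
sub time_loop {
    my ($count, $code) = @_;
    my $start = time();
    $code->() for 1 .. $count;
    return time() - $start;
}

# Assumed repetition count; raise it if the totals are still tiny.
my $count = 200_000;

my $greedy     = time_loop($count, sub { $string =~ /fred.+barney/  });
my $non_greedy = time_loop($count, sub { $string =~ /fred.+?barney/ });

printf "greedy:     %.6f s for %d matches\n", $greedy,     $count;
printf "non-greedy: %.6f s for %d matches\n", $non_greedy, $count;
```

Running it several times and comparing the spread between runs gives a rough idea of whether any difference is real or just jitter.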


Replies are listed 'Best First'.
Re^2: About Greedy and Non-Greedy Regular Expressions
by RMGir (Prior) on Apr 25, 2007 at 14:36 UTC

A test with longer strings shows that the non-greedy regex can be about 25% faster (with perl 5.8.7). My test case regex is designed to match starting near the beginning of the string and ending in the middle.

$ perl -MBenchmark=cmpthese -ne'chomp; $allwords.="/$_"; 
END{
    cmpthese(-3,
       {
         "Greedy"=>sub { 
           $allwords=~m{/abbreviate.+/initial/};
         }, 
         "Nongreedy"=>sub {
           $allwords=~m{/abbreviate.+?/initial/};
         }
       }
    );
}' /usr/dict/words

The result is:

           Rate    Greedy Nongreedy
Greedy    439/s        --      -22%
Nongreedy 563/s       28%        --

Of course, to repeat this test you'll need /usr/dict/words, and it has to contain abbreviate and initial.

These results are with perl 5.8.7 - I'd be curious to see what happens with 5.8.10, since RE improvements could happen any time.

Mike
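
For anyone without /usr/dict/words, here is a rough sketch of the same comparison on a synthetic string. It assumes that a made-up word list with "abbreviate" near the start and "initial" in the middle is a fair stand-in for the dictionary. The helper build_words and the filler size are hypothetical, and the rates will not match the figures quoted above.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical stand-in for /usr/dict/words: "abbreviate" near the start,
# "initial" in the middle, $filler made-up words before and after it.
sub build_words {
    my ($filler) = @_;
    my @words = (
        'aardvark', 'abbreviate',
        (map { "word$_" } 1 .. $filler),
        'initial',
        (map { "zword$_" } 1 .. $filler),
    );
    # Same layout as the one-liner: every word is prefixed with a slash.
    return join '', map { "/$_" } @words;
}

my $allwords = build_words(20_000);    # assumed size; adjust to taste

# Run each sub for at least one CPU second and print a comparison table.
cmpthese(-1, {
    Greedy    => sub { $allwords =~ m{/abbreviate.+/initial/} },
    Nongreedy => sub { $allwords =~ m{/abbreviate.+?/initial/} },
});
```

Changing the filler size, or moving "initial" closer to either end of the list, is an easy way to see how much the result depends on where the match ends.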
