And the result of the experiment is as follows:

    use strict;
    use Benchmark qw/timethese cmpthese/;
    # use re 'debug';

    chomp(my @lines = <DATA>);
    my $target = shift || 'word';

    # Both patterns interpolate $target, so a command-line argument affects both.
    my $re_positive_lookahead = qr/^a=<(?=.*?$target)/;
    my $re_loose = qr/a=<(.*?$target.*?)>/;

    cmpthese( timethese(1000000, {
        'Match_L' => '&Match_Loose',
        'Match_P' => '&Match_PLAhead_plus_Substr',
    }) );

    # Straight regexp with a capture group.
    sub Match_Loose {
        foreach (@lines) {
            if (/$re_loose/) {
                my $word = $1;
            }
        }
    }

    # Lookahead only tests the line; substr pulls out the text between < and >.
    sub Match_PLAhead_plus_Substr {
        foreach (@lines) {
            if (/$re_positive_lookahead/) {
                my $word = substr($_, 3, length($_)-4);
            }
        }
    }

    __DATA__
    a=<swords>
    a=<wordy>
    b=<rappinghood>
    a=<thisword>
    b=<thatword>
    a=<foreword>
    b=<junk>
    a=<nothing>
    b=<word>
    b=<wordplay>
    b=<end>
The observation is that positive lookahead combined with substr is about 13% faster than a straight regexp with capture (36179/s versus 32092/s), which is not an insignificant speed improvement.

    Benchmark: timing 1000000 iterations of Match_L, Match_P...
       Match_L: 31 wclock secs (31.16 usr +  0.00 sys = 31.16 CPU) @ 32092.43/s
       Match_P: 27 wclock secs (27.64 usr +  0.00 sys = 27.64 CPU) @ 36179.45/s
                Rate Match_L Match_P
    Match_L 32092/s      --    -11%
    Match_P 36179/s     13%      --
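Before trusting the timing, it is worth checking that the two approaches really extract the same string. The following is only a minimal sketch of such a check: the helper names extract_capture and extract_substr are made up for illustration and are not part of the benchmark, and the sample lines are a subset of the __DATA__ section.

```perl
use strict;
use warnings;

# A few lines copied from the benchmark's __DATA__ section.
my @lines = qw(a=<swords> a=<wordy> b=<rappinghood> a=<thisword> a=<nothing> b=<word>);

my $target       = 'word';
my $re_loose     = qr/a=<(.*?$target.*?)>/;
my $re_lookahead = qr/^a=<(?=.*?$target)/;

# Capture approach: the regex engine hands back the text in $1.
# (hypothetical helper, for illustration only)
sub extract_capture {
    my ($line) = @_;
    return $line =~ $re_loose ? $1 : undef;
}

# Lookahead approach: the regex only tests the line; substr extracts.
# Assumes the line is exactly "a=<...>" with nothing after the ">".
# (hypothetical helper, for illustration only)
sub extract_substr {
    my ($line) = @_;
    return $line =~ $re_lookahead
        ? substr($line, 3, length($line) - 4)
        : undef;
}

for my $line (@lines) {
    my $c = extract_capture($line);
    my $s = extract_substr($line);
    $c = '-' unless defined $c;
    $s = '-' unless defined $s;
    printf "%-16s %-10s %-10s %s\n", $line, $c, $s,
        ($c eq $s ? 'same' : 'DIFFERENT');
}
```

Note that the substr version depends on the fixed layout of the line: if anything follows the closing ">", the two approaches stop agreeing, so the speed gain only applies when the input format is that strict.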
In reply to Re: Removing backtracking from a .*? regexp
by Roger
in thread Removing backtracking from a .*? regexp
by grinder