
Here's my output (from bleadperl):
Benchmark: running F_plus, F_sexeger, F_while, P_plus, P_sexeger, P_while, each for at least 5 CPU seconds...
    F_plus:  6 wallclock secs ( 5.38 usr +  0.02 sys =  5.40 CPU) @ 38010.19/s (n=205255)
 F_sexeger:  6 wallclock secs ( 5.19 usr +  0.00 sys =  5.19 CPU) @ 82085.16/s (n=426022)
   F_while:  5 wallclock secs ( 5.23 usr +  0.00 sys =  5.23 CPU) @ 99934.23/s (n=522656)
    P_plus:  6 wallclock secs ( 5.22 usr +  0.00 sys =  5.22 CPU) @ 34659.58/s (n=180923)
 P_sexeger:  7 wallclock secs ( 5.11 usr +  0.00 sys =  5.11 CPU) @ 54039.53/s (n=276142)
   P_while:  6 wallclock secs ( 5.14 usr +  0.00 sys =  5.14 CPU) @ 58260.31/s (n=299458)

             Rate P_plus F_plus P_sexeger P_while F_sexeger F_while
P_plus    34660/s     --    -9%      -36%    -41%      -58%    -65%
F_plus    38010/s    10%     --      -30%    -35%      -54%    -62%
P_sexeger 54040/s    56%    42%        --     -7%      -34%    -46%
P_while   58260/s    68%    53%        8%      --      -29%    -42%
F_sexeger 82085/s   137%   116%       52%     41%        --    -18%
F_while   99934/s   188%   163%       85%     72%       22%      --

The F stands for "fail", and the P stands for "pass". For me, the while-approach fails AND succeeds faster than the sexeger- and plus-approaches, and sexeger fails AND succeeds faster than the plus-approach.

And here's the code I ran.

#!/usr/bin/perl
use Benchmark 'cmpthese';

my $X = "a b c d e f g h i j k l ";   # one trailing space: the regex "passes"
my $Y = "a b c d e f g h i j k l";    # no trailing space: the regex "fails"

cmpthese(-5, {
  P_while   => sub { my $x = $X; 1 while $x =~ s/\s$//; },
  P_plus    => sub { my $x = $X; $x =~ s/\s+$//; },
  P_sexeger => sub { my $x = reverse $X; $x =~ s/^\s+//; $x = reverse $x; },
  F_while   => sub { my $x = $Y; 1 while $x =~ s/\s$//; },
  F_plus    => sub { my $x = $Y; $x =~ s/\s+$//; },
  F_sexeger => sub { my $x = reverse $Y; $x =~ s/^\s+//; $x = reverse $x; },
});
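
Before reading too much into the timings, it may be worth confirming that the three approaches return the same string. This is only a minimal sketch: the helper names trim_while, trim_plus and trim_sexeger are made up for illustration and are not part of the benchmark above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrappers around the three techniques being benchmarked.
sub trim_while {
    my $x = shift;
    1 while $x =~ s/\s$//;          # strip one trailing whitespace char per pass
    return $x;
}

sub trim_plus {
    my $x = shift;
    $x =~ s/\s+$//;                 # strip all trailing whitespace in one pass
    return $x;
}

sub trim_sexeger {
    my $x = scalar reverse $_[0];   # reverse the string ...
    $x =~ s/^\s+//;                 # ... so the regex can anchor at the start
    return scalar reverse $x;       # and reverse it back
}

# Compare the three results on a few sample inputs.
for my $s ("a b c ", "a b c", "a b c   ", "") {
    my @out = (trim_while($s), trim_plus($s), trim_sexeger($s));
    my $same = ($out[0] eq $out[1] && $out[1] eq $out[2]) ? "agree" : "DIFFER";
    print "[$s] -> [$out[0]] ($same)\n";
}
```

The explicit `scalar` is there because `reverse` reverses a list, not a string, when it is called in list context.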

_____________________________________________________
Jeff japhy Pinyan: Perl, regex, and perl hacker.
s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Re: Re: Re: japhy blabs about regexes (again)
by runrig (Abbot) on Jul 16, 2001 at 23:28 UTC
    That explains it. If you add a few spaces to the end of your 'passing' string, then P_while will come in last. I suppose that's because of the cost of executing the regex a few more times. So if you expect that there are likely to be a few spaces to truncate, better not to use the while :)

    But it's still probably a good case for optimizing regexes anchored at the end of a string.
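
To make the "executing the regex a few more times" point concrete, here is a rough sketch that counts how often the while approach runs its substitution as trailing spaces are added. The function name count_while_attempts is an invention for this example, and the count covers only substitution attempts, not the work done inside the regex engine.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: how many times does "1 while s/\s$//" run the
# substitution for a given string?
sub count_while_attempts {
    my $x = shift;
    my $attempts = 1;                     # the last attempt fails and ends the loop
    $attempts++ while $x =~ s/\s$//;      # each success removes one character
    return $attempts;
}

for my $n (0, 1, 4) {
    my $s = "a b c d e f g h i j k l" . (" " x $n);
    printf "%d trailing space(s): %d substitution attempt(s)\n",
        $n, count_while_attempts($s);
}
```

With n trailing whitespace characters the loop makes n + 1 attempts, while s/\s+$// always makes one, which would fit with P_while falling behind as the trailing run grows.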