Re^3: Splitting on escapable delimiter

Replies are listed 'Best First'.
Re^4: Splitting on escapable delimiter by mobiusinversion (Beadle) on Mar 28, 2008 at 23:40 UTC
Wow, I totally should have seen that! Okay so now that mutual correctness has been established, it is time for optimality checking. It turns out that your method is about 10% faster. I called my method unroll, and your method rollahead. Here are the benchtests: Benchmark: timing 100000 iterations of rollahead, unroll... rollahead: 18.3956 wallclock secs (17.94 usr + 0.00 sys = 17.94 CPU) @ 5575.07/s (n=100000) unroll: 22.1357 wallclock secs (20.58 usr + 0.00 sys = 20.58 CPU) @ 4859.56/s (n=100000) Rate unroll rollahead unroll 4860/s -- -13% rollahead 5575/s 15% -- and the code: use strict; use Benchmark ':all', ':hireswallclock'; my $x = "#@##@###@####@#####@"; my $y = reverse $x; my $z = "$x$x$x$y$y$x$x$y$y$y$y$y$x$x$x$x"; my $r = timethese( 100000, { unroll => sub { my @x = ([unroll($x)],[unroll($y)],[unroll($z)]) }, rollahead => sub { my @x = ([rollahead($x)],[rollahead($y)],[rollahead($z)]) }, } ); cmpthese($r); sub unroll { my @x = $_[0] =~ /(?:^\|@)((?:##\|#@\|[^#@]))/g; for(@x){ $_ =~ s/##/#/g; $_ =~ s/#@/@/g; } @x } sub rollahead { my $x = shift; $x = reverse $x; my @x = reverse(split/\@(?=(?:##)(?!#))/,$x,-1); for(@x){ $_ = reverse; $_ =~ s/##/#/g; $_ =~ s/#@/@/g; } @x } [download] Is there a monk who could explain why rollahead is faster? I was surprised considering the number of calls to reverse. I can only guess that somewhere deep inside the guts of the Perl-Regex-Beasty, that the optimizer droids are nasty hardcore with lookaround automata but can't be bothered with alternation.	[reply] [d/l]
Re^5: Splitting on escapable delimiter by Anonymous Monk on Mar 29, 2008 at 01:30 UTC
I get `Benchmark: timing 100000 iterations of rollahead, unroll, unroll2... rollahead: 22.9776 wallclock secs (22.93 usr + 0.02 sys = 22.95 CPU) + @ 4357.30/s (n=100000) unroll: 23.0715 wallclock secs (23.03 usr + 0.02 sys = 23.05 CPU) + @ 4338.39/s (n=100000) unroll2: 22.3888 wallclock secs (22.35 usr + 0.03 sys = 22.38 CPU) + @ 4468.28/s (n=100000) Rate unroll rollahead unroll2 unroll 4338/s -- -0% -3% rollahead 4357/s 0% -- -2% unroll2 4468/s 3% 3% --` [download] which still rates rollahead as faster than unroll, but barely. Interesting that runs on different machines can produce a 15% change in relative performance. (I am running perl 5.8.6.) unroll2 is unroll with a more efficient regexp (one fewer alternation). That it is slightly faster suggests that the alternation is indeed slowing things down. `sub unroll2 { my @x = $_[0] =~ /(?:^\|@)((?:#[#@]\|[^#@])*)/g; for(@x){ $_ =~ s/##/#/g; $_ =~ s/#@/@/g; } @x }` [download]	[reply] [d/l] [select]
Re^6: Splitting on escapable delimiter by mobiusinversion (Beadle) on Mar 29, 2008 at 07:14 UTC
Entirely excellent. Wow, I have enjoyed this thread very much. Thank you.	[reply]