in reply to Seeking Audience with the Perl Regex Droids...

Firstly, you omitted the other contender:

... sub lookbehind { my $x = shift; split '(?:(?<=[^#]####)|(?<=[^#]##)|(?<!#))[@]', $x; } C:\test>junk8 Subroutine Benchmark::mytime redefined at c:/Perl/lib/Benchmark.pm lin +e 459. Benchmark: timing 10000 iterations of lookbehind, rollahead, unroll ... lookbehind: 2.34183 wallclock secs ( 2.19 usr + 0.00 sys = 2.19 CPU) + @ 4570.38/s (n=10000) rollahead: 3.46343 wallclock secs ( 3.23 usr + 0.00 sys = 3.23 CPU) + @ 3092.15/s (n=10000) unroll: 3.51775 wallclock secs ( 3.33 usr + 0.02 sys = 3.34 CPU) + @ 2990.43/s (n=10000) Rate unroll rollahead lookbehind unroll 2990/s -- -3% -35% rollahead 3092/s 3% -- -32% lookbehind 4570/s 53% 48% --

Secondly, you didn't check your results. unroll and rollahead produce different results:

rollahead => @# #@## ##@ ##@## #@# # @# #@## ##@@# #@## ##@ +@# #@ ... unroll => @# #@## ##@ ##@## #@# @# #@## ##@@# #@## ##@@ +# #@# ...

And both sets are wrong! Any output that has an even number of #s preceding a @ (as with the 3rd, 4th etc. outputs above) is wrong. If you look back at the op, you'll see that any @s that persist into the results set are preceded by an odd number of ##s.

The results set from the fastest algorithm:

lookbehind => #@## ###@#### #####@ #####@#### ###@## # #@## + ###@#### ...

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

Replies are listed 'Best First'.
Re^2: Seeking Audience with the Perl Regex Droids...
by mobiusinversion (Beadle) on Mar 29, 2008 at 05:13 UTC
    Your solution is actually incorrect on parsing any string which has a consecutive stretch of pound signs (ie '#') that is a multiple of 4 and greater than or equal to 8.

    Try this:
    my $x = '@########@';
    it should parse as:
    '', '####', '',
    Your solution (as you've hard coded it for one example, and admittedly so in your post), returns:
    '', '###@'
    which would have been delimited as:
    '@#######@'
    That is, seven pound signs...

    I ask you to please consider the original post as I dont think you are making your assertions with an understanding of what was asked. For example, the escape character itself must be escaped when created a delimited line. Hence the list:
    my @x = qw(## @ #);
    would be delimited as:
    '####@#@@##'
    Finally, I would be glad to verify your second claim if you can provide the input string upon which rollahead and unroll return different results. I believe such a string does not exist.
      Finally, I would be glad to verify your second claim if you can provide the input string upon which rollahead and unroll return different results. I believe such a string does not exist.

      Actually, it was your own benchmark:

      #! perl -slw use strict; use Benchmark ':all', ':hireswallclock'; my $x = "#@##@###@####@#####@"; my $y = reverse $x; my $z = "$x$x$x$y$y$x$x$y$y$y$y$y$x$x$x$x"; our %results; cmpthese -3, { unroll => sub { my @x = ([unroll($x)],[unroll($y)],[unroll($z)]); $results{ unroll } = \@x; }, rollahead => sub { my @x = ([rollahead($x)],[rollahead($y)],[rollahead($z)]); $results{ rollahead } = \@x; }, lookbehind => sub { my @x = ([lookbehind($x)],[lookbehind($y)],[lookbehind($z)]); $results{ lookbehind } = \@x; }, }; printf "%20s => %s\n", $_, "@{[ map{ qq[@$_] } @{ $results{ $_ } } ]} +" for keys %results; sub unroll { my @x = $_[0] =~ /(?:^|@)((?:##|#@|[^#@])*)/g; for(@x){ $_ =~ s/##/#/g; $_ =~ s/#@/@/g; } @x } sub rollahead { my $x = shift; $x = reverse $x; my @x = reverse(split/\@(?=(?:##)*(?!#))/,$x,-1); for(@x){ $_ = reverse; $_ =~ s/##/#/g; $_ =~ s/#@/@/g; } @x } sub lookbehind { my $x = shift; split m[ (?: (?<=[^#]\#\#\#\#) | (?<=[^#]\#\#) | (?<!\#) ) [@] ]x, + $x; }

      Output (read'em and weep:):

      C:\test>junk8 Rate unroll rollahead lookbehind unroll 3096/s -- -3% -33% rollahead 3175/s 3% -- -32% lookbehind 4639/s 50% 46% -- rollahead => @# #@## ##@ ##@## #@# # @# #@## ##@@# #@## ## +@@# #@## ##@ ##@## #@# @##@## #@# # # #@## ##@@# #@## ##@ ##@## #@# @ +##@## #@# @##@## #@# @##@## #@# @##@## #@# # # #@## ##@@# #@## ##@@# +#@## ##@@# #@## ##@ unroll => @# #@## ##@ ##@## #@# @# #@## ##@@# #@## ##@ +@# #@## ##@ ##@## #@# @##@## #@# # # #@## ##@@# #@## ##@ ##@## #@# @# +#@## #@# @##@## #@# @##@## #@# @##@## #@# # # #@## ##@@# #@## ##@@# # +@## ##@@# #@## ##@ lookbehind => #@## ###@#### #####@ #####@#### ###@## # #@## + ###@#### #####@#@## ###@#### #####@#@## ###@#### #####@ #####@#### # +##@## #@#####@#### ###@## ## ## ###@#### #####@#@## ###@#### #####@ # +####@#### ###@## #@#####@#### ###@## #@#####@#### ###@## #@#####@#### + ###@## #@#####@#### ###@## ## ## ###@#### #####@#@## ###@#### #####@ +#@## ###@#### #####@#@## ###@#### #####@

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.