in reply to Convert string to array - performance challenge

Update: Don't consider or use this. It is broken per the replies below!

If you really feel the need for speed (at the expense of a little clarity), add this to your benchmark:

$begin = time();
for( 0 .. 20 ) {
    my @a = reverse map chop( $buffer), 0 .. length( $buffer ) -1;
}
printf( "reverse map chop consumed %.3f seconds\n", time() - $begin );
__END__
C:\test>junk47
split consumed 2.368 second(s)
pack consumed 0.347 seconds
pack and chr in map consumed 4.041 seconds
unpack '{a1)* consumed 2.507 seconds
substr for consumed 1.652 seconds
reverse map chop consumed 0.477 seconds

Re^2: Convert string to array - performance challenge
by almut (Canon) on Apr 07, 2010 at 18:26 UTC

    Your benchmark is broken, because after the first iteration of for( 0 .. 20 ), $buffer will have been reduced to length zero (because of the chops).

    With that corrected, "reverse map chop" isn't any faster than the other solutions.
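To make the point above concrete, here is a minimal sketch, assuming a short stand-in string instead of the real benchmark data. The names $victim, $copy and @lengths are made up for the illustration. It records the string length at the start of each pass, then shows one way to keep the passes comparable by chopping a fresh copy each time.

```perl
use strict;
use warnings;

my $buffer = "abcdef";              # stand-in for the real benchmark data

# Destructive version: chop eats the string it is given.
my $victim = $buffer;
my @lengths;
for ( 0 .. 2 ) {
    push @lengths, length $victim;  # length seen at the start of this pass
    my @chars = reverse map chop( $victim ), 1 .. length( $victim );
}
# Only the first pass had anything to split: @lengths is (6, 0, 0).

# Non-destructive version: chop a fresh copy on every pass.
my @a;
for ( 0 .. 2 ) {
    my $copy = $buffer;
    @a = reverse map chop( $copy ), 1 .. length( $copy );
}
# $buffer is untouched and @a holds its characters in order.
```

The extra string copy costs time too, so a fair comparison would presumably apply the same copy to every candidate, which is what the `my $buf = our $buffer;` line in the benchmark further down does.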

      Acknowledged.

Re^2: Convert string to array - performance challenge
by ikegami (Patriarch) on Apr 07, 2010 at 18:31 UTC
    You didn't take into account chop()'s destructive behaviour. For 20 of the 21 passes, you're splitting a zero-length string.
               Rate rev_chop    regex unpack_C unpack_a    split
    rev_chop 10.0/s       --      -1%      -3%     -44%     -50%
    regex    10.1/s       1%       --      -2%     -44%     -50%
    unpack_C 10.3/s       3%       2%       --     -43%     -49%
    unpack_a 18.0/s      80%      77%      74%       --     -11%
    split    20.2/s     102%      99%      95%      12%       --
    use strict;
    use warnings;
    use Benchmark qw( cmpthese );

    my %tests = (
        split    => q{ my @a = split //, $buf; },
        regex    => q{ my @a = $buf =~ /./sg; },
        unpack_C => q{ my @a = map chr, unpack 'C*', $buf; },
        unpack_a => q{ my @a = unpack '(a)*', $buf; },
        rev_chop => q{ my @a = reverse map chop($buf), 1..length($buf); },
    );

    $_ = "use strict; use warnings; my \$buf = our \$buffer; $_ 1"
        for values(%tests);

    local our $buffer = "abcdef\x00ghik" x 10_000;

    cmpthese(-2, \%tests);
      Added one more test case
      my %tests = (
          split    => q{ my @a = split //, $buf; },
          regex    => q{ my @a = $buf =~ /./sg; },
          unpack_C => q{ my @a = map chr, unpack 'C*', $buf; },
          unpack_a => q{ my @a = unpack '(a)*', $buf; },
          rev_chop => q{ my @a = reverse map chop($buf), 1..length($buf); },
          chop     => q{ my @a; $a[ $_ ] = chop $buffer for length($buffer) .. 0; },
      );
      And result was
                  Rate rev_chop unpack_C   split   regex unpack_a   chop
      rev_chop  17.6/s       --      -6%    -13%    -20%     -30%  -100%
      unpack_C  18.7/s       6%       --     -7%    -15%     -25%  -100%
      split     20.2/s      15%       8%      --     -8%     -20%  -100%
      regex     22.0/s      25%      17%      9%      --     -13%  -100%
      unpack_a  25.1/s      43%      34%     24%     14%       --  -100%
      chop     68478/s  388556%  365714% 339175% 311691%  272542%     --
      However, I have yet to read the Benchmark documentation, so could someone please explain this output? :)
      UPDATE: Sorry! That was a silly mistake. Thanks, almut.
         my @a; $a[ $_ ] = chop $buffer for length($buffer) .. 0;

        Unfortunately, "reversed" ranges like length($buffer) .. 0  don't work...

        print for 1..3; # 123
        print for 3..1; # no output
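A small sketch of the same point, with made-up variable names: a range whose left end is larger than its right end is simply an empty list, so a loop over it never runs its body. That would also explain the implausible rate of the chop case in the table above, since the timed code did no chopping at all. Wrapping an ascending range in reverse is one way to count down.

```perl
use strict;
use warnings;

my @up    = ( 1 .. 3 );         # (1, 2, 3)
my @empty = ( 3 .. 1 );         # empty list: ranges only count upwards
my @down  = reverse 1 .. 3;     # (3, 2, 1)

my $runs = 0;
$runs++ for 3 .. 1;             # body never executes, so $runs stays 0
```

Note that reverse builds the whole list in memory first, which matters for a buffer of this size; the C-style loop avoids that.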

        However, this would work, and is actually pretty fast (about the same as substr):

        my @a;
        for (my $i=length($buffer)-1; $i>=0; $i--) {
            $a[$i] = chop $buffer;
        }

        # or

        my @a;
        my $i = length($buffer);
        $a[--$i] = chop $buffer while $i>0;

        my @a; $a[ $_ ] = chop $buffer for length($buffer) .. 0;

        You have an off-by-one error. If you could do that, it should be

        my @a; $a[ $_ ] = chop $buffer for length($buffer)-1 .. 0;

        A simple fix for the already-described problem is to use negative indexing:

        my @a; $#a = length($buffer) - 1; $a[ -$_ ] = chop $buffer for 1..length($buffer);
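A hedged sketch to check the negative-indexing idea against split, using a short stand-in string and the made-up names $copy and @expected. One caveat it relies on: assigning through a negative index only works if the array already has that many elements (on an empty array Perl dies with "Modification of non-creatable array value attempted"), so the sketch pre-extends @a via $#a first.

```perl
use strict;
use warnings;

my $buffer   = "abcdef";
my @expected = split //, $buffer;   # reference result

my $copy = $buffer;                 # chop is destructive, so work on a copy
my @a;
$#a = length( $copy ) - 1;          # pre-extend so negative indices are valid

# chop returns the last character first; -1 is the last slot, -2 the one before it, ...
$a[ -$_ ] = chop $copy for 1 .. length( $copy );
```

The range bounds are evaluated once, before the first chop, so the loop still runs once per character even though $copy shrinks as it goes.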

      Acknowledged.