in reply to Convert string to array - performance challenge

For that input, split is the fastest for me.
           Rate    regex unpack_C unpack_a    split
regex    9.95/s       --      -2%     -46%     -50%
unpack_C 10.1/s       2%       --     -45%     -49%
unpack_a 18.3/s      84%      80%       --      -9%
split    20.0/s     101%      97%       9%       --
use strict;
use warnings;

use Benchmark qw( cmpthese );

my %tests = (
   split    => q{ my @a = split //, $buf; },
   regex    => q{ my @a = $buf =~ /./sg; },
   unpack_C => q{ my @a = map chr, unpack 'C*', $buf; },
   unpack_a => q{ my @a = unpack '(a)*', $buf; },
);

$_ = "use strict; use warnings; our \$buf; $_ 1"
   for values(%tests);

local our $buf = "abcdef\x00ghik" x 10_000;

cmpthese(-2, \%tests);

Update: Changed (A)* to (a)* in response to BrowserUk's reply.
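
A benchmark like this only means something if all four snippets build the same array. The following is a minimal sketch of such a check, assuming a plain byte string like the one in the benchmark. The helper `same_list` is a hypothetical name introduced here for illustration, not part of the original post.

```perl
use strict;
use warnings;

# Same test string as the benchmark, without the x 10_000 repetition.
my $buf = "abcdef\x00ghik";

# One result list per approach, keyed by the benchmark's test names.
my %results = (
   split    => [ split //, $buf ],
   regex    => [ $buf =~ /./sg ],
   unpack_C => [ map chr, unpack 'C*', $buf ],
   unpack_a => [ unpack '(a)*', $buf ],
);

# Hypothetical helper: true if two array refs hold the same strings in order.
sub same_list {
   my ($x, $y) = @_;
   return 0 if @$x != @$y;
   for my $i (0 .. $#$x) {
      return 0 if $x->[$i] ne $y->[$i];
   }
   return 1;
}

# Compare every approach against the split result.
for my $name (sort keys %results) {
   my $ok = same_list($results{$name}, $results{split});
   printf "%-8s %2d elements, %s\n",
      $name, scalar @{ $results{$name} }, $ok ? 'same as split' : 'DIFFERS';
}
```

The check deliberately uses the unrepeated string so that a mismatch is easy to inspect. It says nothing about strings containing characters above 255, where `unpack 'C*'` may not round-trip.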

Re^2: Convert string to array - performance challenge
by BrowserUk (Patriarch) on Apr 07, 2010 at 17:55 UTC

    There is a problem with using unpack '(A)*':

    [0] Perl> $buf = "abcdef\x00ghik";;
    [0] Perl> @a = unpack '(A)*', $buf;;
    [0] Perl> print length for @a;;
    1 1 1 1 1 1 0 1 1 1 1
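
The zero-length element comes from the `A` template, which strips trailing spaces and NULs when unpacking, while `a` returns the data unchanged. A small side-by-side sketch, using the same test string; the variable names `@with_A` and `@with_a` are made up for this illustration:

```perl
use strict;
use warnings;

my $buf = "abcdef\x00ghik";

# 'A' strips trailing spaces and NULs from each field, so a field that
# consists of a single NUL comes back as the empty string.
my @with_A = unpack '(A)*', $buf;

# 'a' returns each field verbatim, so the NUL survives as a 1-char string.
my @with_a = unpack '(a)*', $buf;

print "A: @{[ map length, @with_A ]}\n";   # A: 1 1 1 1 1 1 0 1 1 1 1
print "a: @{[ map length, @with_a ]}\n";   # a: 1 1 1 1 1 1 1 1 1 1 1

# Only the 'a' version reassembles into the original string.
print "A round-trips: ", (join('', @with_A) eq $buf ? "yes" : "no"), "\n";
print "a round-trips: ", (join('', @with_a) eq $buf ? "yes" : "no"), "\n";
```

The same stripping would apply to a space character in the input, so `(A)*` is only safe for data known to contain neither.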
Re^2: Convert string to array - performance challenge
by Marshall (Canon) on Apr 08, 2010 at 06:52 UTC
    Very interesting! With the EXACT same code as Ikegami (with the (a)* mod), I get:
               Rate    regex    split unpack_C unpack_a
    regex    3.00/s       --      -3%      -9%     -19%
    split    3.09/s       3%       --      -6%     -17%
    unpack_C 3.29/s      10%       7%       --     -11%
    unpack_a 3.71/s      24%      20%      13%       --
    I have seen before that Ikegami's machine is faster than mine on certain benchmarks. But I am unable to understand how split() can be the fastest. I am running a Prescott processor, an old 3 GHz hyper-threaded critter, with ActiveState Perl 5.10.1.
      I think it depends a lot on the system and the C runtime.
      ActivePerl v5.8.9 built for MSWin32-x86-multi-thread, Binary build 825 [288577]

                 Rate unpack_C unpack_a    split    regex
      unpack_C 3.48/s       --      -5%      -7%      -9%
      unpack_a 3.66/s       5%       --      -3%      -4%
      split    3.76/s       8%       3%       --      -1%
      regex    3.82/s      10%       4%       1%       --

                 Rate unpack_C    regex unpack_a    split
      unpack_C 3.56/s       --      -5%      -5%      -6%
      regex    3.74/s       5%       --      -0%      -1%
      unpack_a 3.74/s       5%       0%       --      -1%
      split    3.79/s       7%       2%       1%       --

                 Rate    split unpack_C    regex unpack_a
      split    3.45/s       --      -4%      -9%     -25%
      unpack_C 3.58/s       4%       --      -6%     -22%
      regex    3.79/s      10%       6%       --     -18%
      unpack_a 4.60/s      34%      29%      21%       --
      The newer versions (built using MinGW) are more consistent between runs:
      C:\perl\5.10.1\bin\MSWin32-x86-multi-thread\perl.exe
                 Rate unpack_C    regex    split unpack_a
      unpack_C 3.66/s       --      -2%     -15%     -22%
      regex    3.74/s       2%       --     -14%     -21%
      split    4.33/s      18%      16%       --      -8%
      unpack_a 4.70/s      28%      26%       9%       --

      C:\perl\5.11.1\bin\MSWin32-x86-multi-thread\perl.exe
                 Rate    regex unpack_C    split unpack_a
      regex    2.74/s       --     -28%     -38%     -39%
      unpack_C 3.82/s      39%       --     -13%     -14%
      split    4.40/s      60%      15%       --      -2%
      unpack_a 4.47/s      63%      17%       2%       --
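
Since the ranking seems to shift with the build, it may help to post the build details next to each set of numbers. A rough sketch using the core Config module; which fields are worth reporting is an assumption here, and the `%build` hash is just an illustrative name:

```perl
use strict;
use warnings;
use Config;

# Collect a few build details that plausibly affect these timings.
# The choice of fields is a guess, not an established checklist.
my %build = (
   version  => $Config{version},    # perl version, e.g. 5.10.1
   archname => $Config{archname},   # e.g. MSWin32-x86-multi-thread
   osname   => $Config{osname},
   cc       => $Config{cc},         # compiler used to build perl
   optimize => $Config{optimize},   # optimizer flags used for the build
);

# Print one "name value" line per field; avoid // so this runs on 5.8 too.
printf "%-9s %s\n", $_, (defined $build{$_} ? $build{$_} : '(unknown)')
   for sort keys %build;
```

Running this with each perl.exe before the benchmark would make it easier to compare results across posts.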

        I see some variability, but nothing like those levels using AS perls.

        AS1007 64-bit:

        c:\test>junk
                   Rate    regex unpack_C unpack_a    split
        regex    7.95/s       --      -1%     -10%     -14%
        unpack_C 8.01/s       1%       --     -10%     -14%
        unpack_a 8.88/s      12%      11%       --      -4%
        split    9.30/s      17%      16%       5%       --

        c:\test>junk
                   Rate    regex unpack_C unpack_a    split
        regex    7.95/s       --      -0%      -8%     -12%
        unpack_C 7.95/s       0%       --      -8%     -12%
        unpack_a 8.68/s       9%       9%       --      -4%
        split    9.09/s      14%      14%       5%       --

        c:\test>junk
                   Rate unpack_C    regex unpack_a    split
        unpack_C 8.02/s       --      -0%      -9%     -11%
        regex    8.02/s       0%       --      -9%     -11%
        unpack_a 8.81/s      10%      10%       --      -2%
        split    9.02/s      13%      13%       2%       --

        c:\test>junk
                   Rate unpack_C    regex unpack_a    split
        unpack_C 7.95/s       --      -0%      -9%     -13%
        regex    7.95/s       0%       --      -9%     -13%
        unpack_a 8.74/s      10%      10%       --      -5%
        split    9.16/s      15%      15%       5%       --

        AS826 32-bit:

        c:\test>\perl32\bin\perl junk.pl
                   Rate unpack_C    split    regex unpack_a
        unpack_C 8.38/s       --      -5%      -9%     -15%
        split    8.81/s       5%       --      -5%     -11%
        regex    9.23/s      10%       5%       --      -6%
        unpack_a 9.86/s      18%      12%       7%       --

        c:\test>\perl32\bin\perl junk.pl
                   Rate unpack_C    split    regex unpack_a
        unpack_C 8.45/s       --      -5%      -8%     -15%
        split    8.87/s       5%       --      -3%     -11%
        regex    9.16/s       8%       3%       --      -8%
        unpack_a 9.94/s      18%      12%       9%       --

        c:\test>\perl32\bin\perl junk.pl
                   Rate unpack_C    split    regex unpack_a
        unpack_C 8.38/s       --      -5%      -8%     -14%
        split    8.81/s       5%       --      -4%     -10%
        regex    9.16/s       9%       4%       --      -6%
        unpack_a 9.78/s      17%      11%       7%       --

        c:\test>\perl32\bin\perl junk.pl
                   Rate unpack_C    split    regex unpack_a
        unpack_C 8.38/s       --      -5%      -8%     -14%
        split    8.81/s       5%       --      -4%     -10%
        regex    9.16/s       9%       4%       --      -6%
        unpack_a 9.78/s      17%      11%       7%       --
