rrwo has asked for the wisdom of the Perl Monks concerning the following question:

Here's a confusing bit of code (below). I'm trying to figure out why it's faster to use split on a string than to pass an anonymous array to a function. (At least that's the case with ActivePerl Build 623.)

Is dereferencing that much of a slowdown?!?


use strict;

use Time::HiRes qw(time);

{

  my $mark = time();

  sub mark_time() { $mark = time(); }

  sub get_time() {
    return time() - $mark;
  }

  sub show_time() {
    print "Time elapsed since last mark: ",
      sprintf('%1.4f', get_time() ),
      " seconds.\n";
  }

}

use constant COUNT => 1000000;

my $Regexp = qr/,/;

sub try_split
  {
    my ($arg) = @_;
    return split $Regexp, $arg;
  }

sub try_arrayref
  {
    my ($arg) = @_;
    return @$arg;
  }

print "split: ";

mark_time();

for (my $i=0; $i<COUNT; $i++)
  {
    my ($X, $Y) = try_split 'A,B';
  }

show_time();

print "array reference: ";

mark_time();

for (my $i=0; $i<COUNT; $i++)
  {
    my ($X, $Y) = try_arrayref [ qw(A B) ];
  }

show_time();


Re: Why is split faster than anonymous arrays?
by extremely (Priest) on Jan 21, 2001 at 03:00 UTC
    Since tye didn't say anything about it in his post, I'd like to point out that you might wish to look at his use of Benchmark. Honestly, if you are going to do 100000 of something or so, Time::HiRes doesn't buy you much in accuracy.

    If you are going to start timing stuff on a regular basis, use Benchmark and enjoy the benefits of others' hard labor. Since I love these sorts of comparisons, I keep a "testit" file around that looks like this:

    #!/usr/bin/perl -w
    use strict;
    use Benchmark qw(cmpthese);  # cmpthese gives a nice comparison table
    use vars qw($s @a %h);       # so that data is visible in evals/subs for sure

    # $s = shift;                       # test scalar (ARGV)
    # @a = <DATA>;                      # test array
    # %h = map { split/=>/ } <DATA>;    # test hash

    cmpthese ( -10,  # ten seconds rather than X number of loops
      {
        'one' => '',      # eval string style
        'two' => sub {},  # sub ref style
      });

    __DATA__

    With the above I can cut-n-paste in code from various places and give it a whirl in no-time. I do recommend that you stick with all sub references or all evals tho, just to keep your sanity.

    --
    $you = new YOU;
    honk() if $you->love(perl)

      (Somehow my code was cut off...)

      The issue isn't so much how to benchmark it as that, in "real world testing", using array references seems to be over 10% slower than using split in Perl 5.6 (ActiveState's Perl for Windows).

      This does not make sense to me. Wouldn't it be more work to split a string than to dereference an array?

        From what I see of the code, in one you are moving a string in and splitting and returning a list; and in the other you are dereferencing an array ref and making a new list from that array's elements.

        Breaking it down further, I'd say what you are really benchmarking is the speed difference between creating a new set of scalars and upping the refcount on an existing set of scalars.

        Both move a single scalar, both create and return a list, but one creates new scalars and the other must run down an array and return all its guts after changing each one. Running the benchmark below, I get split losing just barely on Linux, but the difference varied from 1-5%, so I'd say that they are basically equivalent Benchmark-wise. I'd say the difference you see is either a poorer split implementation or a variation in how the OSes deal with memory allocation and paging. And I'd stop worrying about it and start coding the way you like.

        #!/usr/bin/perl -w
        use strict;
        use Benchmark qw(cmpthese);
        use vars qw($d $e @t);

        $d = <DATA>;
        my @e = split " ", $d;
        $e= \@e;

        sub sss { my $q=shift; return (split m/ /, $q) }
        sub aaa {my $q=shift; return @$q}

        print $d,$/,sss($d),$/,aaa($e),$/;

        cmpthese ( -10,
          {
            split=> '@t = sss($d)',
            arref=> '@t = aaa($e)',
          });

        __DATA__
        a b c d e f g h i j k l m n o p q r s t u v w x y z

        --
        $you = new YOU;
        honk() if $you->love(perl)

(tye)Re: Why is split faster than anonymous arrays?
by tye (Sage) on Jan 21, 2001 at 01:46 UTC

    I get the array ref faster than the split, though sometimes only marginally so. It appears that your code is incomplete. Here is the code I used:

    #!/usr/bin/perl -w
    use strict;
    use Benchmark qw(timethese);

    my $Regexp = qr/,/;

    sub try_split {
        my ($arg) = @_;
        return split $Regexp, $arg;
    }

    sub try_arrayref {
        my ($arg) = @_;
        return @$arg;
    }

    my @arr= '00'..'99';
    #my @arr= '0'..'9';
    #my @arr= '0'..'3';
    my $str= join ",", @arr;
    my @x;

    timethese( -3, {
        split=>sub{@x= try_split($str)},
        aref=>sub{@x= try_arrayref(\@arr)}
    } );
            - tye (but my friends call me "Tye")
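
The two benchmarks in the replies pass a reference to an array that already exists, whereas the question's title is about an anonymous array built at the call site. One possible factor, which nobody in the thread confirms, is the cost of constructing that anonymous array on every iteration. The sketch below is a rough way to separate the two cases using Benchmark's cmpthese. The `@prebuilt` array and the three labels are made up for this illustration, and the timings will vary by platform and Perl build.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Same helpers as in the thread.
sub try_split    { my ($arg) = @_; return split /,/, $arg; }
sub try_arrayref { my ($arg) = @_; return @$arg; }

# Hypothetical addition: an array built once, outside the timed code.
my @prebuilt = qw(A B);

# Sanity check: all three calls hand back the same two-element list.
my @from_split = try_split('A,B');
my @from_anon  = try_arrayref([ qw(A B) ]);
my @from_ref   = try_arrayref(\@prebuilt);
print "@from_split | @from_anon | @from_ref\n";

# Run each variant for about one CPU second and print a comparison table.
cmpthese( -1, {
    split    => sub { my ($x, $y) = try_split('A,B') },
    anon     => sub { my ($x, $y) = try_arrayref([ qw(A B) ]) },  # new array per call
    prebuilt => sub { my ($x, $y) = try_arrayref(\@prebuilt) },   # existing array
});
```

If `anon` comes out noticeably slower than `prebuilt`, the array construction rather than the dereference would be the more likely place to look. If the two are close, the difference reported in the question probably lies elsewhere.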