Emptying (and refilling) an array.

Recently I was looking for the fastest way to clear an array I was using as a buffer. It has a fixed length, its 'empty' slots need to be 'undef', it may or may not already hold values, and it is immediately refilled with 'real' values.

This isn't something that often matters: any approach is fast. But I was trying to squeeze every last bit of performance out of this code, so it mattered a bit.

Anyway, asking on the chatterbox got me several answers: Everyone seems to have an opinion, no one seems to know.

So I benchmarked. Here are the results for your trivia pleasure:

#!/usr/bin/perl

use strict;
use warnings;

use Benchmark qw(cmpthese);

my @working;
$#working = 100;

cmpthese( -10, {
    assign          => '@working = (); $working[$_] = $_ foreach ( 0...100 );',
    assign_undef    => 'undef @working; $working[$_] = $_ foreach ( 0...100 );',
    assign_length   => '@working = ()[0...100]; $working[$_] = $_ foreach ( 0...100 );',
    using_x         => '@working = (undef) x 101; $working[$_] = $_ foreach ( 0...100 );',
    x_with_value    => '@working = 1 x 101; $working[$_] = $_ foreach ( 0...100 );',
    } );

(The last is just to see if pre-filling with something approaching valid values makes a difference.)

Results:

Perl 5.10

                 Rate  using_x assign_undef x_with_value assign_length    assign
using_x       21456/s       --          -5%          -7%          -12%      -13%
assign_undef  22605/s       5%           --          -2%           -8%       -8%
x_with_value  23108/s       8%           2%           --           -5%       -6%
assign_length 24448/s      14%           8%           6%            --       -1%
assign        24637/s      15%           9%           7%            1%        --

Perl 5.8.5 (Different box.)

                 Rate  using_x assign_undef assign_length x_with_value    assign
using_x       35870/s       --          -1%           -1%          -2%       -4%
assign_undef  36133/s       1%           --           -0%          -1%       -4%
assign_length 36298/s       1%           0%            --          -1%       -3%
x_with_value  36540/s       2%           1%            1%           --       -3%
assign        37497/s       5%           4%            3%           3%        --

Note that these results are about as stable as you would expect from numbers this close: that is, not very. 'Assign' and 'assign_undef' are generally the fastest, but expect any pair to swap places on any run (with the occasional swap between entries that are far apart).

I went with 'assign'.

Edit: Updated to fix some testing methodology problems noted by ikegami below. Thanks.


Replies are listed 'Best First'.

Re: Emptying (and refilling) an array.
by Jenda (Abbot) on Jan 12, 2009 at 16:24 UTC

I think you are benchmarking something other than what you said you are interested in. You said you have an array of some size and want to empty it somehow and later assign as many values as there were originally. But you're benchmarking creating a brand new array, assigning it some initial value and then assigning the values. That looks like a very different thing. Plus, 100 is a fairly small number.

The attached benchmark suggests that it's best to just empty the array by assigning an empty list and if the number of elements is big, resize the array afterwards.

use Benchmark qw(cmpthese);

my $size = $ARGV[0] || 1000;

my @array = (1..$size);

sub undef_it {
    undef @array;
    $array[$_]=$_ foreach (0..$size-1);
    return;
}

sub empty_it {
    @array = ();
    $array[$_]=$_ foreach (0..$size-1);
    return;
}

sub empty_elements {
    undef $_ for @array;
    $array[$_]=$_ foreach (0..$size-1);
    return;
}

sub assign_list_of_undefs {
    @array = (undef) x $size;
    $array[$_]=$_ foreach (0..$size-1);
    return;
}

sub undef_resize {
    undef @array; $#array = $size;
    $array[$_]=$_ foreach (0..$size-1);
    return;
}

sub empty_resize {
    @array = (); $#array = $size;
    $array[$_]=$_ foreach (0..$size-1);
    return;
}

cmpthese( -10, {
    undef_it => \&undef_it,
    empty_it => \&empty_it,
    empty_elements => \&empty_elements,
    assign_list_of_undefs => \&assign_list_of_undefs,
    undef_resize => \&undef_resize,
    empty_resize => \&empty_resize,
});

BTW, the ()[0...100] doesn't seem to work the way you seem to expect it. It doesn't seem to generate a list of 101 undefs, rather it generates an empty list:

@a = ()[0..50];
print $#a;
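
To make the contrast concrete, here is a minimal sketch (the variable names are made up for illustration) comparing the empty-list slice with the (undef) x 101 form from the original benchmark. It relies on the rule documented in perldata that a slice of an empty list is still an empty list:

```perl
use strict;
use warnings;

# A slice of an empty list is itself an empty list, whatever the indices.
my @from_slice = ()[0 .. 100];

# The repetition operator applied to a parenthesised list builds 101 undefs.
my @from_x = (undef) x 101;

printf "slice: %d elements, last index %d\n", scalar @from_slice, $#from_slice;
printf "x:     %d elements, last index %d\n", scalar @from_x,     $#from_x;
```

So the 'assign_length' case in the original benchmark is most likely just another spelling of 'assign'.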

Jenda
Support Denmark!
Defend the free world!


Re: Emptying (and refilling) an array.
by ikegami (Patriarch) on Jan 12, 2009 at 15:51 UTC

It looks like you're trying to benchmark preallocation, but you failed at doing that. The array already has at least 101 elements allocated before you attempt to initialize it. The array is not freed from pass to pass. That means you're basically benchmarking the fastest way to do nothing. Obviously, actually doing nothing ("my @working;") will be at least as fast as everything else.

"my @working = undef" doesn't make sense. It assigns one element to the array. Maybe you were thinking of "undef @working" which frees the internal buffer of the array?

It's curious that assign_length and x_with_value preallocate 100 elements then proceed to store 101 elements. Not that it matters, because the array already has at least 101 elements from the previous pass of the loop.

It's curious that you didn't test "$#working = 100;" to preallocate the array. Not that it matters, because the array already has at least 101 elements from the previous pass of the loop.

"my @working = ();" can be written as just "my @working;"
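
A small sketch of what the three forms mentioned above leave behind, in terms of element counts only (it says nothing about the internal buffer, and the array names are invented for the example):

```perl
use strict;
use warnings;

# Assigning undef stores a single element whose value is undef.
my @one = undef;

# undef applied to the array itself leaves it with no elements.
my @cleared = (1 .. 5);
undef @cleared;

# Setting the last index extends the array to 101 slots.
my @sized;
$#sized = 100;

printf "one=%d cleared=%d sized=%d\n",
    scalar @one, scalar @cleared, scalar @sized;
```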


Re^2: Emptying (and refilling) an array.

by DStaal (Chaplain) on Jan 12, 2009 at 16:22 UTC

Thanks for the critique: I've updated my benchmarks with something hopefully a little better.

Yes, I was thinking of undef @working. Fixed.

I kept meaning to get those off-by-one errors matched up, but I knew it didn't actually make a difference... My bad.

$#working = 100; actually isn't part of what I was looking at: I was specifically looking at an array that (usually) already has a length.

my @working; is the same as my @working = (); when @working doesn't exist. Once it does, the latter does seem to actually clear the array. (As far as I can tell: my code should throw errors if it didn't.)


Re^3: Emptying (and refilling) an array.

by ikegami (Patriarch) on Jan 12, 2009 at 16:38 UTC

my @working; is the same as my @working = (); when @working doesn't exist. Once it does, the latter does seem to actually clear the array.

Not at all. It's already cleared (but not deallocated) at the end of the last pass by the run-time effect of my. my @working = (); will never result in something different than my @working;.

(my @working = () if $x; is not necessarily the same as my @working if $x;, but that code is buggy to begin with.)
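
The claim that the two unconditional forms never differ can be checked with a short loop. This is only a sketch with made-up variable names. It records how many elements each array holds after one push per pass:

```perl
use strict;
use warnings;

my (@plain_counts, @assigned_counts);

for my $pass (1 .. 3) {
    my @plain;             # fresh and empty on every pass
    my @assigned = ();     # same result, with a redundant assignment

    push @plain,    $pass;
    push @assigned, $pass;

    push @plain_counts,    scalar @plain;
    push @assigned_counts, scalar @assigned;
}

print "plain:    @plain_counts\n";
print "assigned: @assigned_counts\n";
```

If values leaked from one pass into the next, the counts would grow. Both lists stay at one element per pass.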

Are you confusing my @working; with just @working;? I noticed you removed the "my". Removing the "my" is silly. It's not something you should do. At least add this test, please:

    'my'            => 'my @working; $working[$_] = $_ foreach ( 0...100 );',


Re^3: Emptying (and refilling) an array.

by JavaFan (Canon) on Jan 12, 2009 at 16:43 UTC

our @working;
cmpthese -20, {
   one => '@working = (); $working[$_] = $_ foreach 0 .. 100',
   ...
};
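
One reading of the our @working; above, offered as an assumption since the explanation around the snippet is not shown: Benchmark compiles code strings outside the calling script's lexical scope, so a string that mentions an array sees the package variable of that name rather than a my variable declared in the script. The sketch below tries to make that visible. The names @lexical, @shared, $seen_lexical and $seen_shared are invented for the example:

```perl
use strict;
use warnings;
use Benchmark qw(timeit);

my  @lexical = (1 .. 101);   # file-scoped lexical
our @shared  = (1 .. 101);   # package variable, i.e. @main::shared
our ($seen_lexical, $seen_shared);

# Each string is compiled by Benchmark in package main, so an
# unqualified array name resolves to the package variable of that name.
timeit( 1, '$main::seen_lexical = scalar @lexical' );
timeit( 1, '$main::seen_shared  = scalar @shared'  );

print "string code saw $seen_lexical elements in \@lexical\n";
print "string code saw $seen_shared elements in \@shared\n";
```

Passing code references instead of strings, as in Jenda's benchmark, sidesteps the question, because a closure captures the lexical it was compiled with.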


Re: Emptying (and refilling) an array.
by Zen (Deacon) on Jan 12, 2009 at 19:46 UTC


Re: Emptying (and refilling) an array.
by JavaFan (Canon) on Jan 12, 2009 at 15:28 UTC

0...100

0..100


Re^2: Emptying (and refilling) an array.

by DStaal (Chaplain) on Jan 12, 2009 at 15:46 UTC

Mostly because I like the looks of it better. The difference in this case is immaterial. (One element.)


Re^3: Emptying (and refilling) an array.

by JavaFan (Canon) on Jan 12, 2009 at 15:57 UTC

...

..

... is only different from .. in scalar context. (Or, if you want to be a doc lawyer, you could point out that perlop doesn't actually define the behaviour of ... in list context - but in such a case, you shouldn't use it either ;-))
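
A quick check of the list-context behaviour, as a minimal sketch with illustrative variable names:

```perl
use strict;
use warnings;

# In list context both spellings act as the range operator.
my @two_dots   = (0 .. 100);
my @three_dots = (0 ... 100);

printf ".. gives %d elements, ... gives %d elements\n",
    scalar @two_dots, scalar @three_dots;
```

Both lists run from 0 to 100 inclusive, so there is no one-element difference between them.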


Re^3: Emptying (and refilling) an array.

by massa (Hermit) on Jan 12, 2009 at 16:01 UTC

...

"range" flip-flop operator

[]s, HTH, Massa (κς,πμ,πλ)
