Recently I was looking for the fastest way to clear an array I was using as a buffer. It had a specific length, it needed 'undef' for 'empty' values, it might or might not hold current values, and it was going to be refilled immediately with 'real' values.

This isn't something that often matters: any approach is fast. But I was trying to squeeze every last bit of performance out of this code, so it mattered a bit.

Anyway, asking on the chatterbox got me several answers: Everyone seems to have an opinion, no one seems to know.

So I benchmarked. Here are the results for your trivia pleasure:

#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @working;
$#working = 100;

cmpthese( -10, {
    assign        => '@working = (); $working[$_] = $_ foreach ( 0...100 );',
    assign_undef  => 'undef @working; $working[$_] = $_ foreach ( 0...100 );',
    assign_length => '@working = ()[0...100]; $working[$_] = $_ foreach ( 0...100 );',
    using_x       => '@working = (undef) x 101; $working[$_] = $_ foreach ( 0...100 );',
    # Parentheses make x repeat a list; a bare 1 x 101 would build a single string.
    x_with_value  => '@working = (1) x 101; $working[$_] = $_ foreach ( 0...100 );',
} );

(The last is just to see if pre-filling with something approaching valid values makes a difference.)
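A small sketch of why the parentheses around the left operand of x matter when pre-filling. The variable names here are made up for illustration; only the behaviour of the x operator is the point.

```perl
use strict;
use warnings;

my @list   = (1) x 3;   # list repetition: three elements, each 1
my @string = 1 x 3;     # string repetition: one element, the string "111"

printf "list:   %d element(s)\n", scalar @list;
printf "string: %d element(s), first is '%s'\n", scalar @string, $string[0];
```

So a pre-fill written without the parentheses would assign a single long string rather than one value per slot.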

Results:

Note that these results are about as stable as you would expect from numbers this close: that is, not very. 'assign' and 'assign_undef' are generally the fastest, but expect any two entries to swap places on any given run, with the occasional larger jump.

I went with 'assign'.

Edit: Updated to fix some testing methodology problems noted by ikegami below. Thanks.

Re: Emptying (and refilling) an array.
by Jenda (Abbot) on Jan 12, 2009 at 16:24 UTC

    I think you are benchmarking something other than what you said you are interested in. You said you have an array of some size and want to empty it somehow and later assign as many values as there were originally. But you're benchmarking creating a brand new array, assigning it some initial value and then assigning the values. That looks like a very different thing. Plus 100 is a fairly small number.

    The attached benchmark suggests that it's best to just empty the array by assigning an empty list and if the number of elements is big, resize the array afterwards.

    use Benchmark qw(cmpthese);

    my $size  = $ARGV[0] || 1000;
    my @array = (1..$size);

    # Free the array's storage, then refill.
    sub undef_it {
        undef @array;
        $array[$_] = $_ foreach (0..$size-1);
        return;
    }

    # Assign an empty list, then refill.
    sub empty_it {
        @array = ();
        $array[$_] = $_ foreach (0..$size-1);
        return;
    }

    # Undefine each element in place, then refill.
    sub empty_elements {
        undef $_ for @array;
        $array[$_] = $_ foreach (0..$size-1);
        return;
    }

    # Overwrite with a list of undefs, then refill.
    sub assign_list_of_undefs {
        @array = (undef) x $size;
        $array[$_] = $_ foreach (0..$size-1);
        return;
    }

    # Free the storage, resize, then refill.
    sub undef_resize {
        undef @array;
        $#array = $size - 1;   # $#array is the last index, so this gives $size slots
        $array[$_] = $_ foreach (0..$size-1);
        return;
    }

    # Assign an empty list, resize, then refill.
    sub empty_resize {
        @array = ();
        $#array = $size - 1;   # $#array is the last index, so this gives $size slots
        $array[$_] = $_ foreach (0..$size-1);
        return;
    }

    cmpthese( -10, {
        undef_it              => \&undef_it,
        empty_it              => \&empty_it,
        empty_elements        => \&empty_elements,
        assign_list_of_undefs => \&assign_list_of_undefs,
        undef_resize          => \&undef_resize,
        empty_resize          => \&empty_resize,
    });
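One detail worth keeping in mind when resizing through $#array: it sets the last index, not the element count. A minimal sketch, using a made-up size of 5:

```perl
use strict;
use warnings;

my $size = 5;
my @array;

$#array = $size;                 # last index is 5, so the array holds 6 slots
my $with_size = scalar @array;

@array = ();
$#array = $size - 1;             # last index is 4, so the array holds 5 slots
my $with_size_minus_one = scalar @array;

print "$with_size $with_size_minus_one\n";
```

Resizing to $size rather than $size - 1 leaves one extra undef element at the end after the refill.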

    BTW, the ()[0...100] doesn't work the way you seem to expect it to. It doesn't generate a list of 101 undefs; it generates an empty list:

    @a = ()[0..50]; print $#a;
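A slightly expanded sketch of the same point, contrasting the empty-list slice with the (undef) x N form. The variable names are illustrative only.

```perl
use strict;
use warnings;

my @slice  = ()[0..50];       # a slice of an empty list is still an empty list
my @undefs = (undef) x 51;    # 51 elements, every one undef

printf "slice:  last index %d, %d element(s)\n", $#slice,  scalar @slice;
printf "undefs: last index %d, %d element(s)\n", $#undefs, scalar @undefs;
```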

Re: Emptying (and refilling) an array.
by ikegami (Patriarch) on Jan 12, 2009 at 15:51 UTC

    It looks like you're trying to benchmark preallocation, but you failed at doing that. The array already has at least 101 elements allocated before you attempt to initialize it. The array is not freed from pass to pass. That means you're basically benchmarking the fastest way to do nothing. Obviously, actually doing nothing ("my @working;") will be at least as fast as everything else.

    "my @working = undef" doesn't make sense. It assigns one element to the array. Maybe you were thinking of "undef @working" which frees the internal buffer of the array?

    It's curious that assign_length and x_with_value preallocate 100 elements then proceed to store 101 elements. Not that it matters, because the array already has at least 101 elements from the previous pass of the loop.

    It's curious that you didn't test "$#working = 100;" to preallocate the array. Not that it matters, because the array already has at least 101 elements from the previous pass of the loop.

    "my @working = ();" can be written as just "my @working;"

      Thanks for the critique: I've updated my benchmarks with something hopefully a little better.

      Yes, I was thinking of undef @working. Fixed.
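For anyone skimming, a minimal sketch of the difference between the two spellings discussed here. The array names are made up for illustration.

```perl
use strict;
use warnings;

my @one = undef;     # assigns a one-element list: the array now holds a single undef
my @buf = (1..5);
undef @buf;          # empties the array and releases its storage

printf "\@one has %d element(s)\n", scalar @one;
printf "\@buf has %d element(s)\n", scalar @buf;
```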

      I kept meaning to get those off-by-one errors matched up, but I knew it didn't actually make a difference... My bad.

      $#working = 100; actually isn't part of what I was looking at: I was specifically looking at an array that (usually) already has a length.

      my @working; is the same as my @working = (); when @working doesn't exist. Once it does, the latter does seem to actually clear the array. (As far as I can tell: my code should throw errors if it didn't.)

        my @working; is the same as my @working = (); when @working doesn't exist. Once it does, the latter does seem to actually clear the array.

        Not at all. It's already cleared (but not deallocated) at the end of the last pass by the run-time effect of my. my @working = (); will never result in something different than my @working;.

        (my @working = () if $x; is not necessarily the same as my @working if $x;, but that code is buggy to begin with.)

        Are you confusing my @working; with just @working;? I noticed you removed the "my". Removing the "my" is silly. It's not something you should do. At least add this test, please:

        'my' => 'my @working; $working[$_] = $_ foreach ( 0...100 );',

        If you want to benchmark explicitly clearing the array, you should do that, and not start with a "new" array (although, behind the scenes, the array isn't quite new) each time. Move the declaration of the array outside of the loops. Something like:

        our @working;
        cmpthese -20, {
            one => '@working = (); $working[$_] = $_ foreach 0 .. 100',
            ...
        };

        Of course, if you're going to fill in all the elements anyway, why bother clearing them?
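A quick sketch of that last point, assuming the buffer keeps the same fixed length (101 slots here) and every slot is rewritten on each refill. The "stale" values and variable names are invented for the illustration.

```perl
use strict;
use warnings;

our @working = map { "stale$_" } 0 .. 100;   # buffer holding old contents

# Refill without clearing first: every slot is simply overwritten.
$working[$_] = $_ foreach 0 .. 100;
my @no_clear = @working;

# Put the old contents back, clear explicitly, then refill.
@working = map { "stale$_" } 0 .. 100;
@working = ();
$working[$_] = $_ foreach 0 .. 100;
my @cleared = @working;

print "same contents either way\n" if "@no_clear" eq "@cleared";
```

Under those assumptions the explicit clear changes nothing observable; it only matters if some slots might be left unwritten.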
Re: Emptying (and refilling) an array.
by Zen (Deacon) on Jan 12, 2009 at 19:46 UTC
    One more minor nit about benchmarking; using the same interpreter instance to run the code that benchmarks the interpreter isn't believable to me. Someone might tell me that behind the scenes these different array variables are using different data structures, but to the observer this is not clear. I'd rather see the interpreter invoked 5 times and run a piece of code than wonder if there is some sharing, collision, some optimization, or some other hijinx going on in the back room, so to speak.

    That being said, I love posts like these. Thanks.
Re: Emptying (and refilling) an array.
by JavaFan (Canon) on Jan 12, 2009 at 15:28 UTC
    I wonder, why 0...100, and not 0..100.

      Mostly because I like the looks of it better. The difference in this case is immaterial. (One element.)

        Maybe you shouldn't use ... as you seem to be thinking it behaves differently from .., when it doesn't.

        ... is only different from .. in scalar context. (Or, if you want to be a doc lawyer, you could point out that perlop doesn't actually define the behaviour of ... in list context - but in such a case, you shouldn't use it either ;-))

        The difference is not immaterial, it's nonexistent. (The one-element difference applies only when ... is used as the "range" flip-flop operator.)
        []s, HTH, Massa (κς,πμ,πλ)
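A minimal check of the list-context behaviour described in this sub-thread, with illustrative variable names:

```perl
use strict;
use warnings;

my @two_dots   = (0..100);     # range operator in list context
my @three_dots = (0...100);    # same range; the extra dot changes nothing here

printf "..  gives %d elements\n", scalar @two_dots;
printf "... gives %d elements\n", scalar @three_dots;
print  "identical lists\n" if "@two_dots" eq "@three_dots";
```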