http://qs1969.pair.com?node_id=277786


in reply to append to a string or make an array? (memory and time efficiency)

I'm not sure that this is exactly a great benchmark, but it seems fairly clear that appending to a string is a generally a bit quicker than pushing to an array.

#! perl -slw use strict; use Benchmark qw[cmpthese]; cmpthese( -3, { append1k => q[ my $buffer .= 'x' x 1024 for 0..5; ], push1k => q[ my @buffer; push @buffer, 'X' x 1024 for 0..5; ], }); cmpthese( -3, { append8k => q[ my $buffer .= 'x' x 8192 for 0..5; ], push8k => q[ my @buffer; push @buffer, 'X' x 8192 for 0..5; ], }); cmpthese( -3, { append64k => q[ my $buffer .= 'x' x 65536 for 0..5; ], push64k => q[ my @buffer; push @buffer, 'X' x 65536 for 0..5; ], });
P:\test>test Rate push1k append1k push1k 4816/s -- -26% append1k 6472/s 34% -- Rate push8k append8k push8k 982/s -- -9% append8k 1078/s 10% -- Rate push64k append64k push64k 43.4/s -- -33% append64k 65.3/s 50% --

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller

Replies are listed 'Best First'.
Re: append to a string or make an array? (memory and time efficiency)
by Abigail-II (Bishop) on Jul 25, 2003 at 08:06 UTC
    I think your benchmark is flawed. I don't think your append code is doing what you think it's doing. Witness:
    my $buffer .= "x" x 1024 for 0 .. 5; print length $buffer;

    $buffer is uninitialized by the time it reached the print. Moving the declaration away from the assignment fixes this.

    Furthermore, both fragments don't produce the same result. The 'push' variant keeps everything in an array - it ought to have final join, IMO.

    Here's a somewhat different benchmark. It also shows the append method to be faster, and much faster than I expected it to be.

    #!/usr/bin/perl use strict; use warnings; use Benchmark qw /cmpthese/; our $str; our ($a, $b); for our $t (5, 10, 100) { for my $n (1, 8, 64) { $str = "x" x ($n * 1024); cmpthese -3, { "append${n}k_$t" => '$::a = ""; $::a .= $::str for 1 .. $: +:t;', "push${n}k_$t" => 'my @s; push @s => $::str for 1 .. $: +:t; $::b = join "" => @s', }; die "Benchmark is wrong\n" unless length ($a) == $t * length $ +str && length ($b) == $t * length $ +str; } } __END__ Benchmark: running append1k_5, push1k_5 for at least 3 CPU seconds... Rate push1k_5 append1k_5 push1k_5 91330/s -- -58% append1k_5 219880/s 141% -- Benchmark: running append8k_5, push8k_5 for at least 3 CPU seconds... Rate push8k_5 append8k_5 push8k_5 33834/s -- -50% append8k_5 67846/s 101% -- Benchmark: running append64k_5, push64k_5 for at least 3 CPU seconds.. +. Rate push64k_5 append64k_5 push64k_5 460/s -- -92% append64k_5 5705/s 1139% -- Benchmark: running append1k_10, push1k_10 for at least 3 CPU seconds.. +. Rate push1k_10 append1k_10 push1k_10 53627/s -- -60% append1k_10 135692/s 153% -- Benchmark: running append8k_10, push8k_10 for at least 3 CPU seconds.. +. Rate push8k_10 append8k_10 push8k_10 16317/s -- -67% append8k_10 48795/s 199% -- Benchmark: running append64k_10, push64k_10 for at least 3 CPU seconds +... Rate push64k_10 append64k_10 push64k_10 221/s -- -85% append64k_10 1510/s 583% -- Benchmark: running append1k_100, push1k_100 for at least 3 CPU seconds +... Rate push1k_100 append1k_100 push1k_100 5730/s -- -60% append1k_100 14175/s 147% -- Benchmark: running append8k_100, push8k_100 for at least 3 CPU seconds +... Rate push8k_100 append8k_100 push8k_100 185/s -- -84% append8k_100 1181/s 537% -- Benchmark: running append64k_100, push64k_100 for at least 3 CPU secon +ds... Rate push64k_100 append64k_100 push64k_100 22.5/s -- -76% append64k_100 94.7/s 320% --

    Abigail

      Good catch++.

      I'm not sure I agree with you about the final join though. The OP just needs to keep the 1 to 5 "few k" chunks somewhere until he has the set. At that point he is going to transmit them.

      He can just as easily write them as 5 seperate strings as 1, and avoid the need to concatenate them?

      I think the most efficient way would be to pre-extend to the array to max_chanks, and then pass and store references to the chunks avoiding any need to copy or concatenate. In isolation from the code that generates the chunks, passes them to the buffering code, and finally transmits them, the whole thing is an academic exercise.

      Even without my gaff, the benchmark was at best iffy.


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller

        Thanks for your help Broswer and Abigail.

        You are correct, I don't need to re-join the strings at the end, I would just modify my server to transmit each chunk in the array. However, I have no idea how long it'll take to send 5 chunks vs. 1, so the join may or may not be fair.

        Since my server is single threaded, I'll modify it to have the current user's output buffer a global variable. Any methods which produce output can just tack their output onto there (without any need for passing strings or even string references among subroutines).

Re: Re: append to a string or make an array? (memory and time efficiency)
by diotalevi (Canon) on Jul 25, 2003 at 04:08 UTC

    I would guess any difference is about copying - the more you can do to prevent perl from having to copy strings around the better off you're likely to have. My nearly published ESPPlus::Storage is an example of that, in as many places as possible it avoids making copies and also is sure to pass references around instead of plain strings.