Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change

Re: append to a string or make an array? (memory and time efficiency)

by Abigail-II (Bishop)
on Jul 25, 2003 at 08:06 UTC ( [id://277814]=note: print w/replies, xml ) Need Help??

in reply to Re: append to a string or make an array? (memory and time efficiency)
in thread append to a string or make an array? (memory and time efficiency)

I think your benchmark is flawed. I don't think your append code is doing what you think it's doing. Witness:
my $buffer .= "x" x 1024 for 0 .. 5; print length $buffer;

$buffer is uninitialized by the time it reached the print. Moving the declaration away from the assignment fixes this.

Furthermore, both fragments don't produce the same result. The 'push' variant keeps everything in an array - it ought to have final join, IMO.

Here's a somewhat different benchmark. It also shows the append method to be faster, and much faster than I expected it to be.

#!/usr/bin/perl use strict; use warnings; use Benchmark qw /cmpthese/; our $str; our ($a, $b); for our $t (5, 10, 100) { for my $n (1, 8, 64) { $str = "x" x ($n * 1024); cmpthese -3, { "append${n}k_$t" => '$::a = ""; $::a .= $::str for 1 .. $: +:t;', "push${n}k_$t" => 'my @s; push @s => $::str for 1 .. $: +:t; $::b = join "" => @s', }; die "Benchmark is wrong\n" unless length ($a) == $t * length $ +str && length ($b) == $t * length $ +str; } } __END__ Benchmark: running append1k_5, push1k_5 for at least 3 CPU seconds... Rate push1k_5 append1k_5 push1k_5 91330/s -- -58% append1k_5 219880/s 141% -- Benchmark: running append8k_5, push8k_5 for at least 3 CPU seconds... Rate push8k_5 append8k_5 push8k_5 33834/s -- -50% append8k_5 67846/s 101% -- Benchmark: running append64k_5, push64k_5 for at least 3 CPU seconds.. +. Rate push64k_5 append64k_5 push64k_5 460/s -- -92% append64k_5 5705/s 1139% -- Benchmark: running append1k_10, push1k_10 for at least 3 CPU seconds.. +. Rate push1k_10 append1k_10 push1k_10 53627/s -- -60% append1k_10 135692/s 153% -- Benchmark: running append8k_10, push8k_10 for at least 3 CPU seconds.. +. Rate push8k_10 append8k_10 push8k_10 16317/s -- -67% append8k_10 48795/s 199% -- Benchmark: running append64k_10, push64k_10 for at least 3 CPU seconds +... Rate push64k_10 append64k_10 push64k_10 221/s -- -85% append64k_10 1510/s 583% -- Benchmark: running append1k_100, push1k_100 for at least 3 CPU seconds +... Rate push1k_100 append1k_100 push1k_100 5730/s -- -60% append1k_100 14175/s 147% -- Benchmark: running append8k_100, push8k_100 for at least 3 CPU seconds +... Rate push8k_100 append8k_100 push8k_100 185/s -- -84% append8k_100 1181/s 537% -- Benchmark: running append64k_100, push64k_100 for at least 3 CPU secon +ds... Rate push64k_100 append64k_100 push64k_100 22.5/s -- -76% append64k_100 94.7/s 320% --


Replies are listed 'Best First'.
Re: Re: append to a string or make an array? (memory and time efficiency)
by BrowserUk (Patriarch) on Jul 25, 2003 at 10:25 UTC

    Good catch++.

    I'm not sure I agree with you about the final join though. The OP just needs to keep the 1 to 5 "few k" chunks somewhere until he has the set. At that point he is going to transmit them.

    He can just as easily write them as 5 seperate strings as 1, and avoid the need to concatenate them?

    I think the most efficient way would be to pre-extend to the array to max_chanks, and then pass and store references to the chunks avoiding any need to copy or concatenate. In isolation from the code that generates the chunks, passes them to the buffering code, and finally transmits them, the whole thing is an academic exercise.

    Even without my gaff, the benchmark was at best iffy.

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller

      Thanks for your help Broswer and Abigail.

      You are correct, I don't need to re-join the strings at the end, I would just modify my server to transmit each chunk in the array. However, I have no idea how long it'll take to send 5 chunks vs. 1, so the join may or may not be fair.

      Since my server is single threaded, I'll modify it to have the current user's output buffer a global variable. Any methods which produce output can just tack their output onto there (without any need for passing strings or even string references among subroutines).

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://277814]
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others browsing the Monastery: (4)
As of 2024-04-25 14:42 GMT
Find Nodes?
    Voting Booth?

    No recent polls found