My day job involved lots of Java and C# for a long time. One of the things that was drilled into me was to avoid simple string concatenation (+=) within a loop -- always use some kind of stream or buffer. I found myself searching for some kind of fast string builder on CPAN, but I didn't find much. IO::String led me to the fact that I could pass a scalar ref instead of a filename to open. That sounded a lot like a familiar MemoryStream. :-) I ran some quick benchmarks, but lo and behold, the .= operator appears to be magically quick.

Google and friends have failed me in finding a discussion of this surprising (to me) speed. There's lots of discussion of concatenation versus interpolation, but not very much about potentially faster alternatives to concatenation. I guess because it's so fast!

I've included my test at the bottom. If my methodology is incorrect, please let me know. When the string to be concatenated was clearly constant, I think the compiler was having a little fun with me. Once I wrapped it in a function call I came up with these results:

               Rate
filehandle_OO 484/s
pushing       531/s
filehandle    785/s
concatenation 929/s

Based on the emphasis on Stream-based processing in the languages of my prior experience, my intuition trembles to discover that .= outperforms the filehandle (and, of course, destroys the OO filehandle).

What's going on? Why is string concatenation the fastest thing I've found? Is it somehow avoiding the making-repeated-copies-of-the-same-data trap that Java and C#'s String += operators have?

#!/usr/bin/perl
use strict;
use warnings;
use Benchmark ':all';
use English '-no-match-vars';
use IO::Handle;

my $long_string = '.' x 200;

sub get_long_string {
    return $long_string;
}

sub concatenation {
    my $output = q{};
    $output .= get_long_string() for ( 1 .. 1000 );
    return $output;
}

sub filehandle {
    my $output = q{};
    open my $fh, '>', \$output or die "Huh?! $OS_ERROR";
    print $fh get_long_string() for ( 1 .. 1000 );
    close $fh;
    return $output;
}

sub filehandle_OO {
    my $output = q{};
    open my $fh, '>', \$output or die "Huh?! $OS_ERROR";
    $fh->print( get_long_string() ) for ( 1 .. 1000 );
    $fh->close();
    return $output;
}

sub pushing {
    my @output;
    push @output, get_long_string() for ( 1 .. 1000 );
    return join( '', @output );
}

cmpthese(
    10_000,
    {
        'concatenation' => \&concatenation,
        'filehandle'    => \&filehandle,
        'filehandle_OO' => \&filehandle_OO,
        'pushing'       => \&pushing
    }
);
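
Since I asked about methodology: one check the benchmark itself doesn't make is that every approach really builds the same string. Here is a minimal sketch of that check. The names get_chunk, by_concatenation, by_filehandle and by_join are made up for this sketch (they mirror the subs above, minus the OO variant), and it says nothing about speed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper names for this sketch; they mirror the benchmark subs.
my $chunk = '.' x 200;
sub get_chunk { return $chunk }

# Append with .= in a loop.
sub by_concatenation {
    my $output = q{};
    $output .= get_chunk() for 1 .. 1000;
    return $output;
}

# Print to a filehandle opened on a scalar reference (in-memory file).
sub by_filehandle {
    my $output = q{};
    open my $fh, '>', \$output or die "Cannot open in-memory handle: $!";
    print {$fh} get_chunk() for 1 .. 1000;
    close $fh or die "Cannot close in-memory handle: $!";
    return $output;
}

# Collect the pieces in an array and join once at the end.
sub by_join {
    my @parts;
    push @parts, get_chunk() for 1 .. 1000;
    return join q{}, @parts;
}

my %builders = (
    concatenation => \&by_concatenation,
    filehandle    => \&by_filehandle,
    pushing       => \&by_join,
);

# Every builder should return 1000 copies of the 200-character chunk.
my $expected = $chunk x 1000;
for my $name ( sort keys %builders ) {
    my $result = $builders{$name}->();
    printf "%-13s %s\n", $name, $result eq $expected ? 'ok' : 'MISMATCH';
}
```

If any line reports MISMATCH, the timings above would be comparing different work, so I'd fix that before trusting the rates.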

In reply to Is there something faster than string concatenation? by rdj
