I'm writing string-manipulation code that needs to be fast. Several functions look like this:

$string_out = foo($string_in);

While benchmarking my code to speed it up, I noticed that a significant amount of time is spent copying strings. Using references improves speed.

Here is what I tried:

use strict;
use warnings;
use Benchmark qw(:all);
use constant COUNT => 200000;

my($str, $tmp, $len);

sub in1 ($) {
    $tmp = length($_[0]);
}

sub in2 ($) {
    $tmp = length(${$_[0]});
}

sub out1 () {
    return($str);
}

sub out2 () {
    return(\$str);
}

sub fun1 ($) {
    my $new = $_[0] . "x";
    return($new);
}

sub fun2 ($) {
    my $new = ${ $_[0] } . "x";
    return(\$new);
}

foreach $len (1, 10, 100) {
    $str = "A" x ($len * 1000);
    timethese(10, {
        "in1x$len"  => sub { for (1 .. COUNT) { in1($str) } },
        "in2x$len"  => sub { for (1 .. COUNT) { in2(\$str) } },
        "out1x$len" => sub { for (1 .. COUNT) { $tmp = out1() } },
        "out2x$len" => sub { for (1 .. COUNT) { $tmp = out2() } },
        "fun1x$len" => sub { for (1 .. COUNT) { $tmp = fun1($str) } },
        "fun2x$len" => sub { for (1 .. COUNT) { $tmp = fun2(\$str) } },
    });
}

Here are the results:

Benchmark: timing 10 iterations of fun1x1, fun2x1, in1x1, in2x1, out1x1, out2x1...
    fun1x1:  4 wallclock secs ( 4.31 usr +  0.01 sys =  4.32 CPU) @  2.31/s (n=10)
    fun2x1:  5 wallclock secs ( 4.92 usr +  0.00 sys =  4.92 CPU) @  2.03/s (n=10)
     in1x1:  2 wallclock secs ( 1.08 usr +  0.00 sys =  1.08 CPU) @  9.26/s (n=10)
     in2x1:  1 wallclock secs ( 1.48 usr +  0.00 sys =  1.48 CPU) @  6.76/s (n=10)
    out1x1:  3 wallclock secs ( 2.40 usr +  0.00 sys =  2.40 CPU) @  4.17/s (n=10)
    out2x1:  1 wallclock secs ( 1.58 usr +  0.00 sys =  1.58 CPU) @  6.33/s (n=10)
Benchmark: timing 10 iterations of fun1x10, fun2x10, in1x10, in2x10, out1x10, out2x10...
   fun1x10: 16 wallclock secs (15.72 usr +  0.04 sys = 15.76 CPU) @  0.63/s (n=10)
   fun2x10: 12 wallclock secs (12.33 usr +  0.01 sys = 12.34 CPU) @  0.81/s (n=10)
    in1x10:  1 wallclock secs ( 1.08 usr +  0.00 sys =  1.08 CPU) @  9.26/s (n=10)
    in2x10:  2 wallclock secs ( 1.47 usr +  0.00 sys =  1.47 CPU) @  6.80/s (n=10)
   out1x10:  6 wallclock secs ( 6.62 usr +  0.00 sys =  6.62 CPU) @  1.51/s (n=10)
   out2x10:  2 wallclock secs ( 1.59 usr +  0.00 sys =  1.59 CPU) @  6.29/s (n=10)
Benchmark: timing 10 iterations of fun1x100, fun2x100, in1x100, in2x100, out1x100, out2x100...
  fun1x100: 119 wallclock secs (118.82 usr +  0.03 sys = 118.85 CPU) @  0.08/s (n=10)
  fun2x100: 89 wallclock secs (87.76 usr +  0.05 sys = 87.81 CPU) @  0.11/s (n=10)
   in1x100:  1 wallclock secs ( 1.10 usr +  0.00 sys =  1.10 CPU) @  9.09/s (n=10)
   in2x100:  1 wallclock secs ( 1.49 usr +  0.01 sys =  1.50 CPU) @  6.67/s (n=10)
  out1x100: 43 wallclock secs (43.27 usr +  0.04 sys = 43.31 CPU) @  0.23/s (n=10)
  out2x100:  2 wallclock secs ( 1.57 usr +  0.00 sys =  1.57 CPU) @  6.37/s (n=10)

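As you can see, passing a string by reference seems to slow things down (in2 is slower than in1 at every size), while returning it by reference (out2 versus out1) gives a big boost, especially with large strings: at 100,000 characters, out2 takes 1.57 CPU seconds against 43.31 for out1. When both are combined, most of the time goes into the string modification itself, but the by-reference version (fun2) is significantly faster than the direct one (fun1) for the larger strings (87.81 versus 118.85 CPU seconds at 100,000 characters), although it is slightly slower for the 1,000-character case.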
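A likely explanation for in1 being cheap regardless of string size is that Perl passes arguments by alias: the elements of @_ refer to the caller's own variables, so reading $_[0] does not copy the string. The following is only a minimal sketch of that behaviour; the sub names append_x and append_x_copy are made up for illustration and are not part of the benchmark above.

```perl
use strict;
use warnings;

# Hypothetical helper: appends to the caller's string through the @_ alias.
# No reference is taken and the argument is not copied on the way in.
sub append_x {
    $_[0] .= "x";
    return;
}

# Hypothetical helper: copies the argument into a lexical first.
# The assignment to $copy is where the string gets duplicated.
sub append_x_copy {
    my ($copy) = @_;
    $copy .= "x";
    return $copy;
}

my $str = "A" x 5;
append_x($str);                  # modifies $str in place
my $new = append_x_copy($str);   # $str is left untouched by this call

print "$str\n";   # AAAAAx
print "$new\n";   # AAAAAxx
```

If this reading is right, the extra cost of in2 would come from creating and dereferencing the reference, not from any copy avoided.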

I can choose whatever API I want for my own code, but I also use other modules, and they do not seem to offer a way to avoid string copies. The modules I use are Encode, MIME::Base64 and Compress::Zlib. The first two only work on strings, while the last one accepts a reference as input but provides no way to get the output by reference.
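To make the Compress::Zlib case concrete, here is a rough sketch of a reference-in, reference-out wrapper. The name compress_ref is hypothetical. The sketch relies on compress() and uncompress() accepting either a scalar or a scalar reference as their source; the module still returns its result as a plain string, so whether a copy happens on that return is outside the wrapper's control.

```perl
use strict;
use warnings;
use Compress::Zlib qw(compress uncompress);

# Hypothetical wrapper: takes a reference, returns a reference.
# compress() accepts a scalar or a scalar reference as its source.
sub compress_ref {
    my ($in_ref) = @_;
    my $out = compress($in_ref);
    die "compress failed" unless defined $out;
    return \$out;   # the caller gets a reference, not another copy
}

my $str    = "A" x 100_000;
my $packed = compress_ref(\$str);
my $plain  = uncompress($packed);   # uncompress() also accepts a reference
die "uncompress failed" unless defined $plain;

printf "%d bytes in, %d bytes compressed\n", length($str), length($$packed);
```

This only hides the module's by-value return behind a reference; it does not change what happens inside Compress::Zlib.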

Hence my questions:

Are there other ways to avoid string copies besides using references?
Does it make sense to submit enhancement requests for these standard modules so that the result can be obtained by reference?

In reply to How to avoid string copies in function calls? by Anonymous Monk