http://qs1969.pair.com?node_id=1143930

capfan has asked for the wisdom of the Perl Monks concerning the following question:

Dear all,

While meditating on the vast amount of time Text::Overlaps needs to calculate some sort of longest common prefixes, I encountered a strange pattern.

A variable is instantiated using the string concatenation operator: my $str1 .= $self->sanitizeString ($input1);

Is it some sort of typo that simply works because Perl can handle it?
Or is there some hidden speedup magic behind this kind of variable instantiation?

The speed difference seems to be within measurement noise:

#!perl
use strict;
use warnings;
use 5.020;
use Benchmark;

my $s1 .= 'Peter geht nach Hause geht';
my $s2 = 'Peter nach Hause geht';

## Method number one - initialize with the concatenation operator
sub test_concat {
    my $string = shift;
    my $s1x .= $string;
    return $s1x;
}

## Method number two - initialize with plain assignment
sub test_normal {
    my $string = shift;
    my $s1x = $string;
    return $s1x;
}

## We'll test each one, with simple labels
my $count = 1000000;
timethese (
    $count,
    {
        'Method One' => sub{ test_concat($s1); },
        'Method Two' => sub{ test_normal($s1); },
    }
);

exit(0);

Result:

Benchmark: timing 1000000 iterations of Method One, Method Two...
Method One: 4 wallclock secs ( 4.13 usr + 0.00 sys = 4.13 CPU) @ 242424.24/s (n=1000000)
Method Two: 5 wallclock secs ( 4.39 usr + 0.00 sys = 4.39 CPU) @ 227738.56/s (n=1000000)

Peeking at what Perl does when instantiating a variable one way or another:

>perl -MO=Concise script-concat.pl
6  <@> leave[1 ref] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 4 script-concat.pl:7) v:*,&,{,x*,x&,x$,$,201328640 ->3
5     <2> concat[t2] vKS/2 ->6
3        <0> padsv[$s1:4,5] sRM/LVINTRO ->4
4        <$> const[PV "Peter geht nach Hause geht"] s ->5
script-concat.pl syntax OK

>perl -MO=Concise script-assign.pl
6  <@> leave[1 ref] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 4 script-assign.pl:7) v:*,&,{,x*,x&,x$,$,201328640 ->3
5     <2> sassign vKS/2 ->6
3        <$> const[PV "Peter geht nach Hause geht"] s ->4
4        <0> padsv[$s1:4,5] sRM*/LVINTRO ->5
script-assign.pl syntax OK

there is a tiny difference (concat vs. sassign).

Is it worth submitting a patch to change the operator, or can it be ignored?

Re: Typo or on purpose? Variable instantiation with string concatenation operator
by ikegami (Patriarch) on Oct 06, 2015 at 15:23 UTC
    my $str1 .= $self->sanitizeString($input1);

    will do the same thing as

    my $str1 = $self->sanitizeString($input1);

    but it makes no sense to use the former.

    The benchmarks show no difference in speed. (Anything under 1% is definitely meaningless. I question anything under 5%. I used kennethk's code, but changed timethese to cmpthese to produce more useful output, and changed the label to something meaningful.)

               Rate concat normal
    concat 771935/s     --    -1%
    normal 779433/s     1%     --

      Not relevant for this specific case, but there is a difference if the thing being appended is a number:
      $ perl -e'use Devel::Peek; my $x = 5; Dump($x) ; my $y .= 5; Dump($y);'
      SV = IV(0x2fc84) at 0x2fc88
        REFCNT = 1
        FLAGS = (PADMY,IOK,pIOK)
        IV = 5
      SV = PV(0x13838) at 0x2fcd8
        REFCNT = 1
        FLAGS = (PADMY,POK,pPOK)
        PV = 0x2ac18 "5"\0
        CUR = 1
        LEN = 12

      i.e. .= will stringify on assignment.

        Good point. I'd use the following if I wanted to force stringification:
        my $str1 = "".$self->sanitizeString($input1);
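The stringification point above can also be checked without reading a full Devel::Peek dump. The following is only a sketch: `flags_of` is a hypothetical helper written for this illustration, and it relies on the core B module to inspect the scalar's flags.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: report whether a scalar currently holds
# an integer (IOK) and/or a string (POK).
sub flags_of {
    my ($ref) = @_;
    my $flags = B::svref_2object($ref)->FLAGS;
    my @set;
    push @set, 'IOK' if $flags & B::SVf_IOK();
    push @set, 'POK' if $flags & B::SVf_POK();
    return join ',', @set;
}

my $assigned  = 5;        # plain assignment keeps the integer
my $appended .= 5;        # .= stores the string "5"
my $forced    = "" . 5;   # explicit stringification, as suggested above

printf "%-8s %s\n", 'assigned', flags_of(\$assigned);
printf "%-8s %s\n", 'appended', flags_of(\$appended);
printf "%-8s %s\n", 'forced',   flags_of(\$forced);
```

If the goal really is a string, the explicit `"" . ...` form states that intent, whereas `my $x .= ...` leaves the reader guessing.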
Re: Typo or on purpose? Variable instantiation with string concatenation operator
by kennethk (Abbot) on Oct 06, 2015 at 15:41 UTC
    An undefined value is converted to an empty string in string context. So the two are functionally equivalent, even if one is potentially confusing.
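A small sketch of that equivalence, assuming a plain string as input: both variables end up with the same value, and `.=` on a fresh variable raises no "uninitialized" warning, because perlsyn documents `.=` as exempt from it. The variable names are made up for the illustration.

```perl
use strict;
use warnings;

# Collect any runtime warnings so we can confirm that none are raised.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $input = 'Peter geht nach Hause geht';

my $via_concat .= $input;   # appends to a fresh, undefined variable
my $via_assign  = $input;   # plain assignment

print "same value\n"  if $via_concat eq $via_assign;
print "no warnings\n" if !@warnings;
```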

    The benchmark you wrote is not a good test because the concatenation time will be lost in the overhead for the function call. To some degree, you could argue that means, by definition, that it doesn't matter which you pick; however, if you are going to test, you should test the right thing.

    #!perl
    use strict;
    use warnings;
    use Benchmark;

    my $s1 .= 'Peter geht nach Hause geht';
    my $s2 = 'Peter nach Hause geht';

    ## Method number one - append to an undefined variable
    sub test_concat {
        my $string = shift;
        my $s1x;
        for (1 .. 100000) {
            $s1x .= $string;
            undef $s1x;
        }
        return $s1x;
    }

    ## Method number two - plain assignment
    sub test_normal {
        my $string = shift;
        my $s1x;
        for (1 .. 100000) {
            $s1x = $string;
            undef $s1x;
        }
        return $s1x;
    }

    ## We'll test each one, with simple labels
    my $count = 100;
    timethese (
        $count,
        {
            'Method One' => sub{ test_concat($s1); },
            'Method Two' => sub{ test_normal($s1); },
        }
    );

    exit(0);

    which yields
    Benchmark: timing 100 iterations of Method One, Method Two...
    Method One: 4 wallclock secs ( 3.40 usr + 0.00 sys = 3.40 CPU) @ 29.40/s (n=100)
    Method Two: 3 wallclock secs ( 3.59 usr + 0.00 sys = 3.59 CPU) @ 27.87/s (n=100)
    or
    Benchmark: timing 100 iterations of Method One, Method Two...
    Method One: 3 wallclock secs ( 3.37 usr + 0.00 sys = 3.37 CPU) @ 29.68/s (n=100)
    Method Two: 4 wallclock secs ( 3.49 usr + 0.00 sys = 3.49 CPU) @ 28.61/s (n=100)
    or
    Benchmark: timing 100 iterations of Method One, Method Two...
    Method One: 4 wallclock secs ( 3.45 usr + 0.00 sys = 3.45 CPU) @ 29.00/s (n=100)
    Method Two: 3 wallclock secs ( 3.49 usr + 0.00 sys = 3.49 CPU) @ 28.62/s (n=100)
    So I would conclude that the choice of operator changes the run time by only a fraction of a second (0.04 to 0.19 s in the runs above) for every 10 million assignments. Whether this matters for your use case depends on your use case. It would not matter for any of mine.
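For anyone who wants to repeat the comparison with `cmpthese` instead of `timethese`, a rough sketch follows. The labels and the inner loop size are arbitrary choices made for this illustration, not the exact code that produced any of the numbers in this thread.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $string = 'Peter geht nach Hause geht';

# A negative count asks Benchmark to run each sub for at least that many
# CPU seconds. The inner loops keep sub-call overhead from dominating.
my $rows = cmpthese(-1, {
    concat => sub {
        for (1 .. 10_000) { my $s .= $string; }
    },
    normal => sub {
        for (1 .. 10_000) { my $s = $string; }
    },
});
```

`cmpthese` prints a rate table with relative percentages and returns the same rows as an array reference, so the result can also be inspected programmatically. Expect the percentages to move by a few points from run to run.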

    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

      Thank you for the insights. In this case, I will "correct" it in my own code, so that the next time I read it I won't have to think about this particular line.