jf1 has asked for the wisdom of the Perl Monks concerning the following question:

Dear Perl Monks, the other day I stumbled upon the issue that, in a particular program, declaring a variable with "my" actually slowed down program execution (for comparison I simply repeated the very same instruction several times):
while (chomp ($line = <$gsr>)) {
    my @l1;
    @l1 = split $tab, $line;
    @l1 = split $tab, $line;
    @l1 = split $tab, $line;
    print(join($tab, @l1[@okc]));
    print(join($tab, @l1[@okc]));
    print(join($tab, @l1[@okc]));
}
this leads to following timings:
 73µs    my @l1;
1.71s    @l1 = split $tab, $line;
2.40s    @l1 = split $tab, $line;
2.33s    @l1 = split $tab, $line;
805ms    print(join($tab,@l1[@okc]));
798ms    print(join($tab,@l1[@okc]));
1.66s    print(join($tab,@l1[@okc]));
i.e. the 2nd and following assignments seem to take more time than the 1st one. Could that be attributed to some additional overhead? It doesn't happen when I omit the "my" declaration or use "our" instead (see below). This is consistent across runs. Introducing "my" in the middle of the block has the same effect (see at the bottom). Using "local" instead of "my" seems not quite as expensive, but both "my" and "local" seem to slow down the last write operation (is that some artifact? All measurements were taken on Windows 7 with output redirected to a file).
235µs    local @l1;
1.18s    @l1 = split $tab, $line;
1.79s    @l1 = split $tab, $line;
1.78s    @l1 = split $tab, $line;
824ms    print(join($tab,@l1[@okc]));
806ms    print(join($tab,@l1[@okc]));
1.61s    print(join($tab,@l1[@okc]));

 92µs    our @l1;
1.77s    @l1 = split $tab, $line;
1.77s    @l1 = split $tab, $line;
1.77s    @l1 = split $tab, $line;
829ms    print(join($tab,@l1[@okc]));
820ms    print(join($tab,@l1[@okc]));
1.03s    print(join($tab,@l1[@okc]));

         #my @l1;
1.78s    @l1 = split $tab, $line;
1.77s    @l1 = split $tab, $line;
1.77s    @l1 = split $tab, $line;
810ms    print(join($tab,@l1[@okc]));
805ms    print(join($tab,@l1[@okc]));
1.01s    print(join($tab,@l1[@okc]));

1.80s    @l1 = split $tab, $line;
1.70s    my @l1 = split $tab, $line;
2.41s    @l1 = split $tab, $line;
807ms    print(join($tab,@l1[@okc]));
794ms    print(join($tab,@l1[@okc]));
1.65s    print(join($tab,@l1[@okc]));

Replies are listed 'Best First'.
Re: "my" slowing down programs?
by dave_the_m (Monsignor) on Aug 17, 2015 at 07:15 UTC
    You're probably seeing the combination of two effects.

    First, a split to a non-lexical array is optimised. Rather than pushing all the split elements onto the stack and then letting the assign operator assign them to the array, split just adds its results directly to the array on the fly, bypassing the assign altogether. This optimisation was only extended to lexical arrays in 5.20.0 (corrected below: 5.22.0).

    Secondly, when assigning values to an array, any previous contents have to be freed first. This can also take time. By using a 'my' variable, the array is initially empty (which is why the first assign is faster), but you pay for that on scope exit, when the array is emptied while going out of scope (which is why the last print is slower). Essentially you are shifting the cost from the first statement to the last statement.
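    A minimal sketch of a Benchmark comparison that makes both effects visible (the field count, separator and repetition count are made up; on 5.22 and later the three cases may come out nearly identical, since the cost is only being moved around, not removed):

    use strict;
    use warnings;
    use Benchmark qw(cmpthese :hireswallclock);

    # Synthetic wide record; 30_000 fields is only a guess at the poster's data shape.
    my $line = join "\t", 1 .. 30_000;

    our @pkg;     # package array: before 5.22 only this form gets the direct
                  # split-to-array optimisation
    my  @outer;   # lexical declared once, outside the timed code

    cmpthese( 200, {
        pkg_array => sub { @pkg   = split /\t/, $line },
        outer_my  => sub { @outer = split /\t/, $line },
        inner_my  => sub { my @l1 = split /\t/, $line },  # empty on entry, cleared
                                                          # on scope exit: the cost
                                                          # moves from the assign to
                                                          # the end of the block
    } );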

    Dave.

      Used version is Strawberry Perl 5.20.2.1 (on Windows 7; that's why I suspected Windows memory management), so I'd expect it to have this optimization built in.

      Your 2nd point is indeed very intriguing, as this always happens in the last step of the block, no matter whether that statement appears once or is repeated several times.
        Used version is strawberry perl 5.20.2.1
        Correction - the lexical optimisation was introduced in 5.22.0.

        Dave.

Re: "my" slowing down programs?
by 1nickt (Canon) on Aug 17, 2015 at 06:02 UTC

    First, if it's taking one to two seconds to split a line into an array, you have other problems! Also, you'll have to run your tests more than three times each to really benchmark things.

    Maybe look at Devel::NYTProf and/or Benchmark.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Benchmark qw[ timethese :hireswallclock ];

    sub with_my {
        while (<DATA>) {
            my @array = split;
        }
    }

    my @array2;
    sub without_my {
        while (<DATA>) {
            @array2 = split;
        }
    }

    for (1 .. 5) {
        timethese( 5000000, { with_my => 'with_my()', without_my => 'without_my()' } );
    }

    __DATA__
    The quick brown fox jumps over the lazy dog.

    The results seem to show that the difference is negligible, IMO.

    Benchmark: timing 5000000 iterations of with_my, without_my...
       with_my: 11.0969 wallclock secs ( 6.75 usr +  3.42 sys = 10.17 CPU) @ 491642.08/s (n=5000000)
    without_my: 10.6874 wallclock secs ( 6.84 usr +  3.41 sys = 10.25 CPU) @ 487804.88/s (n=5000000)
    Benchmark: timing 5000000 iterations of with_my, without_my...
       with_my: 10.5256 wallclock secs ( 6.83 usr +  3.44 sys = 10.27 CPU) @ 486854.92/s (n=5000000)
    without_my: 10.759 wallclock secs ( 7.02 usr +  3.52 sys = 10.54 CPU) @ 474383.30/s (n=5000000)
    Benchmark: timing 5000000 iterations of with_my, without_my...
       with_my: 11.1877 wallclock secs ( 7.40 usr +  3.72 sys = 11.12 CPU) @ 449640.29/s (n=5000000)
    without_my: 10.8896 wallclock secs ( 7.20 usr +  3.63 sys = 10.83 CPU) @ 461680.52/s (n=5000000)
    Benchmark: timing 5000000 iterations of with_my, without_my...
       with_my: 11.0768 wallclock secs ( 7.33 usr +  3.69 sys = 11.02 CPU) @ 453720.51/s (n=5000000)
    without_my: 10.9737 wallclock secs ( 7.28 usr +  3.65 sys = 10.93 CPU) @ 457456.54/s (n=5000000)
    Benchmark: timing 5000000 iterations of with_my, without_my...
       with_my: 11.0938 wallclock secs ( 7.35 usr +  3.70 sys = 11.05 CPU) @ 452488.69/s (n=5000000)
    without_my: 11.018 wallclock secs ( 7.29 usr +  3.69 sys = 10.98 CPU) @ 455373.41/s (n=5000000)
    The way forward always starts with a minimal test.

      1nickt:

      I somehow doubt that you're testing what you think you're testing ... unless you snipped about 10,000,000 lines out of your __DATA__ section...

      ...roboticus

      When your only tool is a hammer, all problems look like your thumb.
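      (For what it's worth, a minimal sketch of how that benchmark could be reworked so that every iteration really does split a line: read the record once, then split the saved scalar. Everything else is kept from 1nickt's post.)

      #!/usr/bin/perl
      use strict;
      use warnings;
      use Benchmark qw[ timethese :hireswallclock ];

      # Read the sample record once; <DATA> would be empty on every call after the first.
      my $line = <DATA>;
      chomp $line;

      my @array2;
      sub with_my    { my @array  = split ' ', $line }
      sub without_my {    @array2 = split ' ', $line }

      for (1 .. 5) {
          timethese( 5_000_000, { with_my => \&with_my, without_my => \&without_my } );
      }

      __DATA__
      The quick brown fox jumps over the lazy dog.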

      Maybe I should have mentioned several facts that I omitted in the original post, sorry. Here is some background information:
      1. The numbers reported were not single measurements, but resulted from an NYTProf run with the script processing the first 100 lines of a template of the file under consideration. I consider this enough to get reasonably stable results.
      2. Each of the lines processed contains about 30,000 fields. At that size the difference does matter, believe me, in particular as the regular input file in production has between 1e4 and 1e5 fields per line and around 1e6 lines. Thus even saving 10% or 20% of runtime is worth the effort.
      3. Thanks for suggesting the Benchmark module. Nonetheless I'd rather not repeat the processing 500,000 times, as that would mean about 12 days just for line splitting, with limited gain in knowledge ;-)
      In production, at least going by external time measurements, processing does seem faster with global than with lexical variables. BTW: using C or Pascal instead and hand-coding the splitting, filtering and joining does not seem to gain much here (to be honest: almost nothing), not to mention using the provided libraries, which seem to be even slower as they favour generic solutions. It only costs more programmer time.
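      For reference, a per-statement profile like the timings quoted at the top is typically collected along these lines (the script and output names here are only placeholders):

      # Devel::NYTProf is a CPAN module; profile the run, then render the report.
      perl -d:NYTProf extract_columns.pl > out.tsv
      nytprofhtml       # writes an HTML report with per-statement timings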
Re: "my" slowing down programs?
by BrowserUk (Patriarch) on Aug 17, 2015 at 04:41 UTC

    Try declaring the array before the loop:

    my @l1;
    while (chomp ($line = <$gsr>)) {
        @l1 = split $tab, $line;
        @l1 = split $tab, $line;
        @l1 = split $tab, $line;
        print(join($tab,@l1[@okc]));
        print(join($tab,@l1[@okc]));
        print(join($tab,@l1[@okc]));
    }

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.
    I'm with torvalds on this Agile (and TDD) debunked I told'em LLVM was the way to go. But did they listen!
      Yeah, that had been the original approach; alas, it makes matters worse: doing that, the 1st assignment within the loop is already slowed down.
Re: "my" slowing down programs?
by ikegami (Patriarch) on Aug 17, 2015 at 12:21 UTC

    But both "my" and "local" seem to slow down last write operation

    Well yeah. my creates a variable that only lasts until the end of the block. (Actually, nothing is deallocated; it's just marked as empty.) local backs up the variable and restores it at the end of the block.
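    A small sketch of that difference in behaviour, and of where the end-of-block work happens:

    use strict;
    use warnings;

    our @a = (1 .. 3);
    {
        local @a = (4 .. 9);      # current contents backed up, new ones installed
        print scalar(@a), "\n";   # 6
    }                             # backup restored here
    print scalar(@a), "\n";       # 3 again

    {
        my @b = (4 .. 9);         # fresh lexical, guaranteed empty on entry
        print scalar(@b), "\n";   # 6
    }                             # @b is cleared here, on scope exit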

    the 2nd and following assignments seem to take more time than the 1st one. Could that be attributed to some additional overhead?

    Accessing lexicals has less overhead, actually. It's a single op, whereas two are needed for package variables. Also, looking up a lexical variable uses an index, while looking up a package variable requires locating a symbol by name.
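    One way to see that op difference for yourself, assuming B::Concise from the core distribution (exact listings vary by perl version):

    perl -MO=Concise,-exec -e "my @l; @l = (1, 2)"    # lexical:  a single padav op
    perl -MO=Concise,-exec -e "our @g; @g = (1, 2)"   # package:  gv + rv2av ops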

    As you can see, the time needed to print is the same regardless of the type of variable, which indicates there's something wonky going on here. If you performed a proper benchmark, I bet you'd find the difference is smaller than the noise. [See dave_the_m's answer]

Re: "my" slowing down programs?
by soonix (Chancellor) on Aug 17, 2015 at 17:48 UTC
    dave_the_m's comment raises the question: which perl version are you using? Some of the optimizations in newer versions might speed things up, especially with such an amount of data.
Re: "my" slowing down programs?
by sundialsvc4 (Abbot) on Aug 17, 2015 at 12:08 UTC

    Your shop has created such an edge-case ... such an enormous volume of both variables and data that a single program is expected to process ... that I think your only realistic alternative (as a shop ...) is to:   (a) in the short run, “throw silicon at it.”   Then, (b) in some way, start fundamentally re-defining the problem and therefore its approach ... splitting or sharing the file such that multiple blade processors (not cores, not threads, not processes...) can tackle it in parallel, reducing the number of variables from 30-god-thousand to only what is required, and so on.

    The trouble with “removing my” is that it really is a fundamental logic-change to the program.   No matter how many years this piece of source-code has been buying your groceries, errors will be introduced, and when they are, how-the-hell will you know?   Data structures that used to be known-empty no longer are, and so on.   The business consequences could be disastrous indeed.   (And I don’t mean to under-state the business risk of “redefining the problem,” but it is a Hobson’s Choice by now.)

    Basically ... and this needs to be raised right-now as a senior management issue ... “we are running out of track.   We have been putting this thing off and trying different languages and so forth, but our data volume is catching up with us.”

      Completely agree with you here. This is an edge case. Indeed, the job of the script is to extract a subset of the columns from the original data set and reduce the amount of data.

      Presently the desire to preserve the original ordering, together with some infrastructural limitations, hinders migration to Hadoop-like platforms.

      Nonetheless the version presently in use is already quite fast compared to the one used before (I could speed up the Python version I inherited to about 1/2 of its original runtime and at the same time still improve stability). Ideas of migrating to other languages (such as Perl, ...) were born out of academic interest (after seeing the improvements possible just within the legacy code) but haven't resulted in much progress yet. (Another step forward finally came with a multithreaded Tcl implementation.)

      During the process I at least learned to admire the guys who created the string splitting and joining routines in the various scripting languages. These are enlightening examples of highly optimized code.

      P.S.: the Perl version of the script is embarrassingly short, and the use of global variables is definitely not an issue; I was only amazed to see that using lexical variables resulted in a performance penalty.