{}think has asked for the wisdom of the Perl Monks concerning the following question:

This question regards reusing an array as I spin through a loop, and creating references to that array inside the loop. My intuition tells me this can be fraught with unintentional side effects, and I have designed a solution that works. I have also created a snippet that does things the 'wrong' way. So I've got my solution and should be happy... problem is, I don't understand what the bad code is doing, and worse, I don't get WHY the good code works! Intuition has served me well, but now I seek knowledge!!!
My goal is to put information in an array "@a" every time I iterate through a loop, and then push a reference to that array onto an outside array called "@container". I want to clear out @a each time. My concern is that if I do this wrong, all the entries in @container will point to the same location in memory, which will be a nasty mishmash.

So here's the technique that I knew was wrong:

#!/usr/bin/perl -w
use strict;
use Data::Dumper;

#---- The wrong way
my @container=();
my @a=();
for (my $i=0; $i <=1; $i++){
    @a=();
    $a[$i] = 99;
    push @container, \@a;
}
print Dumper(\@container);

And here is the output:

$VAR1 = [
          [
            undef,
            99
          ],
          $VAR1->[0]
        ];

Problem 1: I don't really understand why it resulted in that.

So, here is the solution that I designed which does work the right way. This version differs in that it declares the array inside of the scope of the loop. It works.

#!/usr/bin/perl -w
use strict;
use Data::Dumper;

#By using careful scoping, you can control allocation of an array that you intend to reuse.
#---- The right way
{
    my @container=();
    for (my $i=0; $i <=1; $i++){
        my @a=();    #@a is declared INSIDE the scope of the loop
        $a[$i] = 99;
        push @container, \@a;
    }
    print Dumper(\@container);
}

The output is as I desire. There are two arrays, and their contents are independent of one another. The first array has its first element set to 99, and the second array has its second element (but not its first element) set to 99. Problem #2: I don't understand why my solution works!

$VAR1 = [
          [
            99
          ],
          [
            undef,
            99
          ]
        ];

{}think; #Think outside of the brackets

Re: Scope and references
by BrowserUk (Patriarch) on Jun 19, 2011 at 15:59 UTC

    With the "wrong" version, @a is defined outside the loop, so each time you take its address, you get the same value.

    With the "right" version, @a is defined inside the loop, and so a new array is created each time you iterate the loop. Hence you get a different address each time.
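One way to check this for yourself is to compare the addresses directly. The following is only a minimal sketch, assuming the core module Scalar::Util is available; the helper names refs_shared and refs_fresh are made up for illustration and do not come from the original post. The references are kept in @container in both cases, as in the original code, so both arrays are alive at the same time when the addresses are compared.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical helper: @a is declared once, outside the loop.
sub refs_shared {
    my ($n) = @_;
    my (@container, @a);
    for my $i (0 .. $n - 1) {
        @a = ();                  # empties the same array; no new array is made
        $a[$i] = 99;
        push @container, \@a;     # always a reference to that one array
    }
    return @container;
}

# Hypothetical helper: @a is declared inside the loop body.
sub refs_fresh {
    my ($n) = @_;
    my @container;
    for my $i (0 .. $n - 1) {
        my @a;                    # executed on every pass
        $a[$i] = 99;
        push @container, \@a;     # reference to this pass's array
    }
    return @container;
}

my @shared = refs_shared(2);
my @fresh  = refs_fresh(2);

printf "outside the loop: %s\n",
    refaddr($shared[0]) == refaddr($shared[1]) ? "same array" : "different arrays";
printf "inside the loop:  %s\n",
    refaddr($fresh[0]) == refaddr($fresh[1]) ? "same array" : "different arrays";
```

Run as is, this should report "same array" for the first case and "different arrays" for the second.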


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Thanks. But that's what I don't get. Here's my confusion. I never realized that a simple loop like this (an even simpler illustration this time) ...

      for (my $i=0; $i <=1; $i++){
          my @a;
          $a[$i] = 99;
          print Dumper(\@a);
      }

      ... which results in ...

      $VAR1 = [
                99
              ];
      $VAR1 = [
                undef,
                99
              ];
      
      

      ...actually allocated and destroyed a new array every time through the loop. Since you go around the loop twice, it seems like you don't "leave" the scope in which @a was declared, so @a should still "be there" during the second iteration. I still don't get that.

      Also, I still don't understand why the 'wrong' version creates the data structure that it does. Any thoughts on that one?
      {}think; #Think outside of the brackets
        I never realized that a simple loop like this ... actually allocated and destroyed a new array every time through the loop. Since you go around the loop twice, it seems like you don't "leave" the scope in which @a was declared, so @a should still "be there" during the second iteration. I still don't get that.

        In for(...) { ... } the curly braces define a block. Each time the loop iterates you re-enter that block. And every time you enter a block, you enter a new scope.

        That is just the way it is, and the way it is meant to be.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

        Actually, for creates two lexical scopes: one for the entire statement, and one for the block body.

        You've already demonstrated the second. Here's a demonstration of the first:

        >perl -e"use strict; for (my @x) { } @x"
        Global symbol "@x" requires explicit package name at -e line 1.
        Execution of -e aborted due to compilation errors.

        it seems like you don't "leave" the scope in which @a was declared,

        It's pretty clear to me the curly is the end of the scope, and you do indeed reach it.
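To watch the end of that scope being reached on every pass, you can put something with a destructor into a lexical declared in the loop body. This is only an illustrative sketch: the Tracer class, the @log array and run_loop are hypothetical names invented here, not anything from the thread. It also shows the other half of the story: if a reference escapes the block, the value outlives the scope.

```perl
use strict;
use warnings;

my @log;    # records events in the order they happen

{
    # Hypothetical class whose destructor records when an object is freed.
    package Tracer;
    sub new     { my ($class, $id) = @_; return bless { id => $id }, $class }
    sub DESTROY { my ($self) = @_; push @log, "destroy $self->{id}" }
}

# Hypothetical helper: run a two-pass loop, optionally stashing each
# object in the array referenced by $keep.
sub run_loop {
    my ($keep) = @_;
    @log = ();
    for my $i (0 .. 1) {
        my $obj = Tracer->new($i);      # a new lexical on every pass
        push @log, "body $i";
        push @$keep, $obj if $keep;     # an escaping reference keeps it alive
    }                                   # $obj's scope ends here, every pass
    push @log, "after loop";
    return @log;
}

print join(", ", run_loop()), "\n";
# prints: body 0, destroy 0, body 1, destroy 1, after loop

my @kept;
print join(", ", run_loop(\@kept)), "\n";
# prints: body 0, body 1, after loop
```

In the second call nothing is destroyed inside the loop, because @kept still holds each object when the block ends. That is the same situation as pushing \@a onto @container.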

        Also, I still don't understand why the 'wrong' version creates the data structure that it does. Any thoughts on that one?

        my @a;
        push @container, \@a;
        push @container, \@a;

        Isn't it clear that it puts two references to the same variable into @container? Why do you think your loop is different?
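If you do want to keep a single @a declared outside the loop, one alternative (not proposed in this thread, so treat it as a sketch rather than anyone's recommended fix) is to push a reference to a copy of the array instead of a reference to the array itself. The helper name collect_copies is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: reuse one outer @a, but store a copy on each pass.
sub collect_copies {
    my ($n) = @_;
    my (@container, @a);
    for my $i (0 .. $n - 1) {
        @a = ();
        $a[$i] = 99;
        push @container, [ @a ];    # new anonymous array holding @a's current elements
    }
    return \@container;
}

my $rows = collect_copies(2);
print scalar(@{ $rows->[0] }), " ", scalar(@{ $rows->[1] }), "\n";   # prints "1 2"
```

Note that [ @a ] is a shallow copy: it copies the elements of @a, so if those elements were themselves references, the copies would still share whatever they point to.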

        It might (or might not) be helpful to realize that my is an executable statement, not "just" a declaration.

        Update: the following is more misleading than helpful. Please see the follow-up "puzzle" Re^4: Scope and references to see why. It (my) produces a new instance of its argument(s), effectively unrelated (except by name) to previous instances created in that scope, and any previous instance has its reference count reduced by 1. If you have tucked a reference to a previous instance into an array, as you did in your first example, that keeps the reference count positive, so the instance does not get garbage collected. If there is no other reference to a previous instance, it ceases to exist as far as you are concerned.

        This isn't simple stuff; most of us have been tripped up by something similar when we first started using lexical variables.

Re: Scope and references
by AnomalousMonk (Archbishop) on Jun 19, 2011 at 19:59 UTC
Re: Scope and references
by 7stud (Deacon) on Jun 19, 2011 at 20:59 UTC
    Also, I still don't understand why the 'wrong' version creates the data structure that it does. Any thoughts on that one?
    use strict;
    use warnings;
    use 5.010;
    use Data::Dumper;

    my @arr = ();
    $arr[3] = 40;
    say Dumper(\@arr);

    --output:--
    $VAR1 = [
              undef,
              undef,
              undef,
              40
            ];

    And from the Data::Dumper docs:

    ... duplicate references to substructures within $VARn will be appropriately labeled using arrow notation.

    The second element of your $VAR1 array is a duplicate reference, and instead of bothering to print it out again, Data::Dumper essentially says, "If you want to see what the second element looks like, go look at the first element, $VAR1->[0], because I won't be bothered trying to format that pretty output again."

      It's not a question of not being bothered. It prints it that way to indicate that both references refer to the same variable.
        Touche! In other words, if Dumper just displayed the same thing for the second element as the first element, you wouldn't know whether they were the same array or whether they were different arrays with identical content. ++
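A small sketch of that difference, using only Data::Dumper with its default settings; the sub names dump_shared and dump_separate are invented for this illustration:

```perl
use strict;
use warnings;
use Data::Dumper;

my @a = (undef, 99);

# Hypothetical helper: two references to the very same array.
sub dump_shared   { return Dumper([ \@a, \@a ]) }

# Hypothetical helper: two different arrays that happen to hold equal contents.
sub dump_separate { return Dumper([ [@a], [@a] ]) }

print "same array twice:\n",    dump_shared();
print "two separate arrays:\n", dump_separate();
```

The first dump should show $VAR1->[0] as its second element, while the second dump spells out both inner arrays in full, even though their contents are identical.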
      Thanks to all of you for the GREAT explanations. I have learned much!
      {}think; #Think outside of the brackets