Scope and references

{}think has asked for the wisdom of the Perl Monks concerning the following question:

This question regards reusing an array as I spin through a loop, and creating references to that array inside the loop. My intuition tells me this can be fraught with unintentional side effects, and I have designed a solution that works. I have also created a snippet that does things the 'wrong' way. So I've got my solution and should be happy... problem is, I don't understand what the bad code is doing, and worse, I don't get WHY the good code works! Intuition has served me well, but now I seek knowledge!!!
My goal is to put information in an array "@a" every time I iterate through a loop, and then push a reference to that array onto an outside array called "@container". I want to clear out @a each time. My concern is that if I do this wrong, all the entries in @container will point to the same location in memory, which will be a nasty mishmash.

So here's the technique that I knew was wrong:

#!/usr/bin/perl -w

use strict;
use Data::Dumper;

#---- The wrong way

    my @container=();
    my @a=();
    for (my $i=0; $i <=1; $i++){
        @a=();
        $a[$i] = 99;

        push @container, \@a;
    }
    print Dumper(\@container);

And here is the output:

$VAR1 = [
          [
            undef,
            99
          ],
          $VAR1->[0]
        ];

Problem 1: I don't really understand why it resulted in that.

So, here is the solution that I designed which does work the right way. This version differs in that it declares the array inside of the scope of the loop. It works.


#!/usr/bin/perl -w

use strict;
use Data::Dumper;

#By using careful scoping, you can control allocation of an array that you intend to reuse.

#---- The right way
{
    my @container=();
    for (my $i=0; $i <=1; $i++){
        my @a=();           #@a is declared INSIDE the scope of the loop
        $a[$i] = 99;

        push @container, \@a;
    }
    print Dumper(\@container);
}

The output is as I desire. There are two arrays, and their contents are independent of one another. The first array has its first element set to 99, and the second array has its second element (but not its first element) set to 99. Problem #2: I don't understand why my solution works!

{}think; #Think outside of the brackets


Replies are listed 'Best First'.
Re: Scope and references by BrowserUk (Patriarch) on Jun 19, 2011 at 15:59 UTC
With the "wrong" version, @a is defined outside the loop, so each time you take its address, you get the same value. With the "right" version, @a is defined inside the loop, so a new array is created each time you iterate the loop. Hence you get different addresses each time.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.
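A minimal sketch of the address point above, under the assumption that comparing two references with `==` is an acceptable way to ask "is this the same array?" (references compare by address in numeric context). The names @shared, @fresh, @outer_refs and @inner_refs are made up for this illustration and do not appear in the original posts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical names throughout; this only illustrates the same-address
# versus different-address behaviour described in the reply above.

# Array declared once, outside the loop: every \@shared is the same reference.
my @outer_refs;
my @shared;
for my $i (0 .. 1) {
    @shared = ();              # empties the one array; does not create a new one
    $shared[$i] = 99;
    push @outer_refs, \@shared;
}

# Array declared inside the loop: each iteration gets its own array.
my @inner_refs;
for my $i (0 .. 1) {
    my @fresh;
    $fresh[$i] = 99;
    push @inner_refs, \@fresh;
}

# In numeric context a reference evaluates to its address.
my $outer_same = ($outer_refs[0] == $outer_refs[1]) ? 1 : 0;
my $inner_same = ($inner_refs[0] == $inner_refs[1]) ? 1 : 0;

print "declared outside the loop: ", ($outer_same ? "same array" : "different arrays"), "\n";
print "declared inside the loop:  ", ($inner_same ? "same array" : "different arrays"), "\n";
```

Because both entries of @outer_refs point at the single @shared, only the last iteration's contents survive, which matches the output in the original question.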
Re^2: Scope and references by {}think (Sexton) on Jun 19, 2011 at 16:52 UTC
Thanks. But that's what I don't get. Here's my confusion. I never realized that a simple loop like this (an even simpler illustration this time) ...

    for (my $i=0; $i <=1; $i++){
        my @a;
        $a[$i] = 99;
        print Dumper(\@a);
    }

... which results in ...

    $VAR1 = [
              99
            ];
    $VAR1 = [
              undef,
              99
            ];

... actually allocated and destroyed a new array every time through the loop. Since you go around the loop twice, it seems like you don't "leave" the scope in which @a was declared, so @a should still "be there" during the second iteration. I still don't get that.

Also, I still don't understand why the 'wrong' version creates the data structure that it does. Any thoughts on that one?

{}think; #Think outside of the brackets
Re^3: Scope and references by BrowserUk (Patriarch) on Jun 19, 2011 at 17:41 UTC
> I never realized that a simple loop like this ... actually allocated and destroyed a new array every time through the loop. Since you go around the loop twice, it seems like you don't "leave" the scope in which @a was declared, so @a should still "be there" during the second iteration. I still don't get that.

In `for(...) { ... }` the curly braces define a block. Each time the loop iterates you re-enter that block. And every time you enter a block, you enter a new scope. That is just the way it is, and the way it is meant to be.
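One way to see the "new scope on every iteration" behaviour directly, as a small sketch rather than anything from the original posts: record how many elements the array holds at the top of each pass. The name @sizes_on_entry is invented for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# @sizes_on_entry is a hypothetical helper used only to record what each
# iteration sees when it starts.
my @sizes_on_entry;

for my $i (0 .. 2) {
    my @a;                              # executed again on every pass
    push @sizes_on_entry, scalar @a;    # element count before anything is stored
    $a[$i] = 99;                        # this value is not visible next time round
}

print "@sizes_on_entry\n";              # prints "0 0 0"
```

If @a "were still there" on the second pass, the recorded sizes would grow; they stay at zero because each pass starts with an empty @a.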
Re^4: Scope and references by {}think (Sexton) on Jun 19, 2011 at 18:44 UTC
Re^4: Scope and references by {}think (Sexton) on Jun 22, 2011 at 02:33 UTC
Re^5: Scope and references by Anonymous Monk on Jun 23, 2011 at 13:05 UTC
Re^3: Scope and references by ikegami (Patriarch) on Jun 19, 2011 at 19:35 UTC
Actually, `for` creates two lexical scopes: one for the entire statement, and one for the block body. You've already demonstrated the second. Here's a demonstration of the first:

    >perl -e"use strict; for (my @x) { } @x"
    Global symbol "@x" requires explicit package name at -e line 1.
    Execution of -e aborted due to compilation errors.

> it seems like you don't "leave" the scope in which @a was declared,

It's pretty clear to me the curly is the end of the scope, and you do indeed reach it.

> Also, I still don't understand why the 'wrong' version creates the data structure that it does. Any thoughts on that one?

    my @a;
    push @container, \@a;
    push @container, \@a;

Isn't it clear that it puts two references to the same variable into @container? Why do you think your loop is different?
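The same statement-level scope can be checked from inside a script. This is a sketch that swaps the one-liner for a string `eval`, and uses the C-style loop from the original question instead of `for (my @x)`; $code, $ok and $error are names chosen for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical variation on the one-liner: compile a snippet with a string
# eval so the compile-time failure can be captured instead of aborting.
# Single quotes keep $i from being interpolated here.
my $code = 'use strict; for (my $i = 0; $i < 2; $i++) { } $i; 1';

my $ok    = eval $code;    # undef if the snippet fails to compile
my $error = $@;

print $ok ? "compiled\n" : "failed to compile: $error";
```

The `$i` declared in the loop header is visible in the header and the body, but not in the statement that follows the loop, so under `strict` the trailing `$i` is rejected at compile time.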
Re^4: Scope and references by 7stud (Deacon) on Jun 20, 2011 at 08:01 UTC
Re^5: Scope and references by ikegami (Patriarch) on Jun 20, 2011 at 17:03 UTC
Re^3: Scope and references by jpl (Monk) on Jun 19, 2011 at 18:31 UTC
It might (or might not) be helpful to realize that `my` is an executable statement, not "just" a declaration.

Update: The following is more misleading than helpful. Please see the followup "puzzle" Re^4: Scope and references to see why.

It produces a new instance of its argument(s), effectively unrelated (except by name) to previous instances created in that scope, and any previous instance has its reference count reduced by 1. If you have tucked a reference to a previous instance into an array, as you did in your first example, that keeps the reference count positive, so the instance does not get garbage collected. If there is no other reference to a previous instance, it ceases to exist as far as you are concerned.

This isn't simple stuff; most of us have been tripped up by something similar when we first started using lexical variables.
Re^4: Scope and references by jpl (Monk) on Jun 20, 2011 at 10:51 UTC
Re: Scope and references by AnomalousMonk (Archbishop) on Jun 19, 2011 at 19:59 UTC
It might also help to read some of the Variables and Scoping tutorials, perhaps especially Coping with Scoping as it has a couple of examples that touch (albeit lightly) upon {}think's for-loop concerns.
Re: Scope and references by 7stud (Deacon) on Jun 19, 2011 at 20:59 UTC
> Also, I still don't understand why the 'wrong' version creates the data structure that it does. Any thoughts on that one?

    use strict;
    use warnings;
    use 5.010;
    use Data::Dumper;

    my @arr = ();
    $arr[3] = 40;
    say Dumper(\@arr);

    --output:--
    $VAR1 = [
              undef,
              undef,
              undef,
              40
            ];

And from the Data::Dumper docs:

> ... duplicate references to substructures within $VARn will be appropriately labeled using arrow notation.

The second element of your $VAR1 array is a duplicate reference, and instead of bothering to print it out again, Data::Dumper essentially says, "If you want to see what the second element looks like, go look at the first element, $VAR1->[0], because I won't be bothered trying to format that pretty output again."
Re^2: Scope and references by ikegami (Patriarch) on Jun 20, 2011 at 03:48 UTC
It's not a question of not being bothered. It does it that way to indicate that both references refer to the same variable.
Re^3: Scope and references by 7stud (Deacon) on Jun 20, 2011 at 08:15 UTC
Touche! In other words, if Dumper just displayed the same thing for the second element as the first element, you wouldn't know whether they were the same array or whether they were different arrays with identical content. ++
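To make that distinction concrete, here is a small sketch that dumps one array referenced twice next to two separate arrays with identical content. The names @one, @two, $shared and $distinct are made up for this example, and `$Data::Dumper::Indent = 0` is set only to keep each dump on a single line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Indent = 0;    # compact output, purely for easier comparison

# Hypothetical data: two separate arrays that happen to hold the same values.
my @one = (undef, 99);
my @two = (undef, 99);

my $shared   = Dumper([ \@one, \@one ]);    # the same array, referenced twice
my $distinct = Dumper([ \@one, \@two ]);    # two arrays with identical content

print "$shared\n";      # second element is shown as a back-reference
print "$distinct\n";    # both elements are written out in full
```

Only the first dump contains `$VAR1->[0]`, which is how the original "wrong" output reveals that both entries of @container are the same @a.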
Re^2: Scope and references by {}think (Sexton) on Jun 20, 2011 at 00:03 UTC
Thanks to all of you for the GREAT explanations. I have learned much!

{}think; #Think outside of the brackets