Reference of constants and literals

LanX has asked for the wisdom of the Perl Monks concerning the following question:

I noticed a fundamental difference between literal scalars and hashes/arrays.

use strict;
use warnings;


sub pr {
    print \$_[0],"\t";      # print a reference to the first argument
}


check(qw/ Scalar 1 /);
check(qw/ Array [1,2] /);
check(qw/ Hash  {1,2} /);


sub check {
    my ($_title_,$type) = @_;
    my $_call_ ="pr $type;";
    my $code= <<"__EOC";
    for (1..3) {
    $_call_
        print "\\n\\t"; #UPDATE
    for (1..3) {
        $_call_
    }
    print "\\n";
    }
__EOC
    print "\n--- $_title_ \n";
    print $code;
    eval $code;
}

output:

--- Scalar 
    for (1..3) {
    pr 1;
        print "\n\t";
    for (1..3) {
        pr 1;
    }
    print "\n";
    }
SCALAR(0x81953b0)    
    SCALAR(0x8195398)    SCALAR(0x8195374)    SCALAR(0x8195398)    
SCALAR(0x8195374)    
    SCALAR(0x819538c)    SCALAR(0x8195398)    SCALAR(0x819538c)    
SCALAR(0x8195398)    
    SCALAR(0x81953b0)    SCALAR(0x819538c)    SCALAR(0x81953b0)    

--- Array 
    for (1..3) {
    pr [1,2];
        print "\n\t";
    for (1..3) {
        pr [1,2];
    }
    print "\n";
    }
REF(0x819073c)    
    REF(0x8190724)    REF(0x8190724)    REF(0x8190724)    
REF(0x819073c)    
    REF(0x8190724)    REF(0x8190724)    REF(0x8190724)    
REF(0x819073c)    
    REF(0x8190724)    REF(0x8190724)    REF(0x8190724)    

--- Hash 
    for (1..3) {
    pr {1,2};
        print "\n\t";
    for (1..3) {
        pr {1,2};
    }
    print "\n";
    }
REF(0x8195374)    
    REF(0x8190748)    REF(0x8190748)    REF(0x8190748)    
REF(0x81953c8)    
    REF(0x8190748)    REF(0x8190748)    REF(0x8190748)    
REF(0x8195374)    
    REF(0x8190748)    REF(0x8190748)    REF(0x8190748)

As you can see, the refs of the arrays and hashes depend on the code position, while the refs of the scalars do not ...

My motivation is to find a way to distinguish different calls to the same sub by their code position.

(Please note that caller gives only the line number!)
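
A minimal sketch of that limitation, using a hypothetical helper named `where` (not part of the code above): two call sites on the same source line cannot be told apart by the line number that `caller` reports.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the line number of its call site.
# In list context, caller returns (package, filename, line).
sub where {
    my (undef, undef, $line) = caller;
    return $line;
}

# Two distinct call sites on one source line report the same line number.
my @lines = (where(), where());
print "same line\n" if $lines[0] == $lines[1];
```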

QUESTION: Is this defined behaviour, and why does the compiler not just send one stable ref to a scalar?

Cheers Rolf

UPDATE: changed indentation of inner loop and added question!

UPDATE: OK, there is no need to discuss this further.

As an outcome, it is clear that perl may reallocate memory for data that is obviously constant at compile time. In other words, the reference to a constant or literal at a particular code position may change at any time during runtime and is not fixed at compile time.

Re: Reference of constants and literals by moritz (Cardinal) on Nov 24, 2008 at 10:09 UTC
I don't quite understand your question. When you have a number and take a reference to it, it will appear as `SCALAR(0x...)` in the output. If you have an array reference (or a reference to any data structure) and take a reference to it, it will appear as `REF(0x...)`. I don't see any relation to "code position", whatever you mean by that.

    $ # also works for scalars this way, when you take a ref to a ref:
    $ perl -wle 'print \\4'
    REF(0x5043b0)

(And if you use `<code>...</code>` tags to delimit your code, the square brackets will display correctly and won't turn into links.)
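
To make the `SCALAR` versus `REF` distinction concrete, here is a small sketch using the built-in `ref` function, which returns the type name that appears before the address when a reference is printed. The variable names are made up for illustration.

```perl
use strict;
use warnings;

# ref() reports what kind of thing a reference points to.
my $to_number   = \4;        # reference to a plain scalar
my $to_arrayref = \[1, 2];   # reference to a scalar that holds an array reference
my $arrayref    = [1, 2];    # direct reference to an anonymous array
my $hashref     = {1, 2};    # direct reference to an anonymous hash

print ref($to_number),   "\n";   # SCALAR
print ref($to_arrayref), "\n";   # REF
print ref($arrayref),    "\n";   # ARRAY
print ref($hashref),     "\n";   # HASH
```

The second case is what `pr [1,2]` produces in the original code, because `pr` prints `\$_[0]` and `$_[0]` already holds a reference.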
Re^2: Reference of constants and literals by Corion (Patriarch) on Nov 24, 2008 at 10:19 UTC
I think it's the symptom of an idea I planted in LanX's head yesterday: the idea of determining that the code is getting called from the same location because it gets passed the same (addresses of) variables. For arrays/hashes, it seems the code actually gets the same addresses of variables even though these are lexicals.
Re^3: Reference of constants and literals by moritz (Cardinal) on Nov 24, 2008 at 10:29 UTC
That might be a nice idea for an obfuscation contest, but certainly not for production code. It relies on undocumented behaviour, here an optimization that might very well be changed in the future. I think it's related to the optimization described in No garbage collection for my-variables (not sure though).
Re^4: Reference of constants and literals by ikegami (Patriarch) on Nov 24, 2008 at 11:00 UTC
Re^5: Reference of constants and literals by JavaFan (Canon) on Nov 24, 2008 at 11:37 UTC
Re^2: Reference of constants and literals by LanX (Saint) on Nov 24, 2008 at 10:46 UTC
Well, it's a question of weird optimization. A constant scalar is passed in a certain snippet of code, so there is no need to switch the reference at runtime. This works with explicit refs like `[1,2]` and `{1,2}` but not with constants: they needlessly get a new reference at runtime each time the loop gets there. Just compare:

    check(qw/ Scalarref \1 /);

OUTPUT

    --- Scalarref
        for (1..3) {
        pr \1;
            print "\n\t"; #UPDATE
        for (1..3) {
            pr \1;
        }
        print "\n";
        }
    REF(0x8190768)
        REF(0x8190744) REF(0x8190750) REF(0x8190744)
    REF(0x8190750)
        REF(0x8195394) REF(0x8190744) REF(0x8195394)
    REF(0x8190744)
        REF(0x8190768) REF(0x8195394) REF(0x8190768)

Each time a new ref instead of one ref.

Cheers Rolf
Re: Reference of constants and literals by ikegami (Patriarch) on Nov 24, 2008 at 10:37 UTC
> Is this a defined behaviour

At the very least, you're relying on the memory allocation system allocating the same block twice in a row. That sounds very fragile to me. I can easily see this failing for a non-trivial `pr` or in a multi-threaded application.

> why does the compiler not just send one stable ref to a scalar?

The memory allocation needs of "creating an array, creating two scalars, assigning the scalars to the array, creating a reference to the array, returning the reference and passing it to a function" (`pr [1,2]`) are very different from the memory allocation needs of "passing a constant to a function" (`pr 1`).
Re: Reference of constants and literals by ikegami (Patriarch) on Nov 24, 2008 at 10:18 UTC
> As you can see the ref of the arrays and hashes depend on the codeposition while the refs of the scalars are not ...

The only things you've passed to `pr` are scalars. You never pass an array or hash, just references to them. It's not even possible to pass arrays or hashes to subroutines. That means you're wrong about having printed refs to arrays (`ARRAY(0x...)`) and refs to hashes (`HASH(0x...)`). The only things you've printed are references to scalars. It should read:

> As you can see, the refs of the refs depend on the code position, while the refs of the constants do not ...

Now, what's your question?
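
A short sketch of what the callee actually receives, using two hypothetical helpers (`count_args` and `first_arg_type`, neither from the original post): an array in the argument list is flattened into its elements, while an anonymous array or hash constructor arrives as a single scalar holding a reference.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many scalars arrived in @_.
sub count_args { return scalar @_ }

# Hypothetical helper: reports the type of the first argument
# (an empty string if it is not a reference).
sub first_arg_type { return ref $_[0] }

my @list = (10, 20, 30);

print count_args(@list),      "\n";   # 3: the array was flattened into three scalars
print count_args([10, 20]),   "\n";   # 1: a single scalar holding an array reference
print first_arg_type([1, 2]), "\n";   # ARRAY
print first_arg_type({1, 2}), "\n";   # HASH
```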
Re^2: Reference of constants and literals by LanX (Saint) on Nov 24, 2008 at 10:30 UTC
Hi, I just made the code and output clearer and added a question.

> You never pass an array or hash, just references to them. It's not even possible to pass arrays or hashes to subroutines.

That's a matter of interpretation; the behaviour of `push @arr, "elem"` can be reproduced with prototypes: `sub name (\@@)`

Cheers Rolf
Re^3: Reference of constants and literals by ikegami (Patriarch) on Nov 24, 2008 at 10:44 UTC
First of all, `push` is not a subroutine. I'll concentrate on your second example, `sub name (\@@)`. While it could be a matter of interpretation in general, it's unequivocal in this case because we're talking about the value of `$_[0]`. The `\@` prototype causes a reference to the array to be passed to the sub, not an array. `$_[0]` contains a reference, not an array. Printing `\$_[0]` would print a reference to a reference to an array, not a reference to an array.
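
The point about `$_[0]` can be checked with a small sketch. The sub name `my_push` is hypothetical; it is declared before its first use so that the `(\@@)` prototype is known when the call is compiled.

```perl
use strict;
use warnings;

# Hypothetical sub with a (\@@) prototype.
sub my_push (\@@) {
    my ($aref, @new) = @_;
    # $_[0] is a reference to the caller's array, not the array itself.
    push @$aref, @new;
    return (ref($_[0]), ref(\$_[0]));
}

my @arr = (1, 2);
my ($arg_type, $ref_to_arg_type) = my_push @arr, 3, 4;

print "$arg_type\n";          # ARRAY: what the callee finds in $_[0]
print "$ref_to_arg_type\n";   # REF: what \$_[0] refers to
print "@arr\n";               # 1 2 3 4
```

The call site looks like `push`, but the callee still only ever sees a reference in `$_[0]`.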
Re^4: Reference of constants and literals by LanX (Saint) on Nov 24, 2008 at 11:11 UTC
Re^5: Reference of constants and literals by ikegami (Patriarch) on Nov 24, 2008 at 11:30 UTC
Re^3: Reference of constants and literals by moritz (Cardinal) on Nov 24, 2008 at 10:34 UTC
> That's a matter of interpretation; the behaviour of `push @arr, "elem"` can be reproduced with prototypes: `sub name (\@@)`

The prototype is just syntactic sugar for taking a reference (plus extra behaviour, for example enforcing list context), so regardless of what it looks like on the caller's side, the callee always sees a reference, never the array itself.
Re^4: Reference of constants and literals by LanX (Saint) on Nov 24, 2008 at 11:28 UTC
Re^5: Reference of constants and literals by ikegami (Patriarch) on Nov 24, 2008 at 11:49 UTC
Re^2: Reference of constants and literals by LanX (Saint) on Nov 24, 2008 at 11:35 UTC
> Now, what's your question?

Well, why the heck do constants get a newly allocated place? It doesn't seem to me as if the compiler does optimisation right!
Re^3: Reference of constants and literals by Anonymous Monk on Nov 25, 2008 at 08:33 UTC
?? You are not supposed to care one way or the other.