in reply to list assignment and undef

You called $x a lexical, but that's only semi-true. $x is aliased to undef, so $x is whatever undef returns.

When a LHS element of a list assignment is undef*, Perl apparently creates an SV to contain the assigned value. These are returned by the list assignment, and these are the values you are dumping for the first and third pass of the inner loop.

There doesn't seem to be anything special about the loop wrt undefined values. It seems to be the list assignment creating the new SVs.

* — Probably not any undefined value, just those that look like they were returned by undef.

Re^2: list assignment and undef (actual)
by ikegami (Patriarch) on Aug 25, 2009 at 18:09 UTC

    So I looked at Perl's source code a bit, and I've found out what really happens. I was right that for had nothing to do with it, but I wasn't entirely correct with respect to the assignment operator.


    You called $x a lexical, but that's only semi-true. $x is aliased to undef, so $x is whatever undef returns.

    The important bit you're missing follows: When a LHS element of a list assignment is immortal*, Perl pretends the RHS element is on the LHS for the purpose of building the return value.**

    $ perl -wle'$x="xx";$y="yy"; $_=uc for ($y)=$x; print "$x$y"'
    xxXX
    $ perl -wle'$x="xx";$y="yy"; $_=uc for (undef)=$x; print "$x$y"'
    XXyy

    This means the first and third elements of the list returned by the range operator (..) are being returned by the assignment operator, and these are the values you are dumping in the first and third passes of the inner loop.

    There's nothing special about the loop wrt undefined values. The new SVs are created by the range operator (..).

    * — One of PL_sv_undef, PL_sv_yes, PL_sv_no and PL_sv_placeholder.

    ** — This isn't documented. This was gleaned from the Perl source code.
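For anyone who wants to poke at this, here is a minimal sketch of the two one-liners above as a script. The variable names are made up for illustration. The first half relies on documented behavior (a list assignment in list context returns its LHS lvalues). The second half exercises the undocumented immortal-LHS case, so treat its result as something to observe on your own perl rather than something guaranteed.

```perl
use strict;
use warnings;

# Documented case: the list assignment returns its LHS lvalue, so the
# loop aliases $_ to $dst and uppercases the copy, not the source.
my ($src, $dst) = ('xx', 'yy');
$_ = uc for ($dst) = $src;
print "named LHS: src=$src dst=$dst\n";

# Undocumented case: with undef on the LHS, the post above reports that
# the RHS element itself is returned, so $_ would alias $src2.
# This only prints what your perl does; no result is assumed here.
my ($src2, $dst2) = ('xx', 'yy');
$_ = uc for (undef) = $src2;
print "undef LHS: src2=$src2 dst2=$dst2\n";
```

If the second line shows src2 uppercased, your perl behaves as described above; dst2 is never assigned in either case.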

      Very interesting. Where in the source were you looking? I haven't got much past the tokeniser yet, but was initially motivated to get a better understanding of exactly these sorts of behaviors. I may get on to the parser soon, and then to the op tree and execution... But a jump ahead might be an interesting / helpful diversion.

      It seems that different cases are handled quite differently, making it hard to provide any simple explanation of what is happening. Perhaps for about the same reason that "only perl can parse Perl", only the full source can explain what perl does.

      use strict;
      use warnings;
      use Devel::Peek;

      print "First\n";
      Dump($_) for (1..2);

      print "Second\n";
      Dump($_++) for (1..2);

      print "Third\n";
      Dump($_++) for (1..2);

      print "Fourth\n";
      Dump($_) for (1..2);

      print "Fifth\n";
      Dump($_) for ( (undef, undef) = (1..2) );

      produces

      First
      SV = IV(0x86b698c) at 0x86b6990
        REFCNT = 1
        FLAGS = (IOK,pIOK)
        IV = 1
      SV = IV(0x86b698c) at 0x86b6990
        REFCNT = 1
        FLAGS = (IOK,pIOK)
        IV = 2
      Second
      SV = IV(0x86cf3bc) at 0x86cf3c0
        REFCNT = 1
        FLAGS = (PADTMP,IOK,pIOK)
        IV = 1
      SV = IV(0x86cf3bc) at 0x86cf3c0
        REFCNT = 1
        FLAGS = (PADTMP,IOK,pIOK)
        IV = 2
      Third
      SV = IV(0x86dc45c) at 0x86dc460
        REFCNT = 1
        FLAGS = (PADTMP,IOK,pIOK)
        IV = 1
      SV = IV(0x86dc45c) at 0x86dc460
        REFCNT = 1
        FLAGS = (PADTMP,IOK,pIOK)
        IV = 2
      Fourth
      SV = IV(0x86b698c) at 0x86b6990
        REFCNT = 1
        FLAGS = (IOK,pIOK)
        IV = 1
      SV = IV(0x86b698c) at 0x86b6990
        REFCNT = 1
        FLAGS = (IOK,pIOK)
        IV = 2
      Fifth
      SV = IV(0x86b687c) at 0x86b6880
        REFCNT = 2
        FLAGS = (IOK,pIOK)
        IV = 1
      SV = IV(0x86b698c) at 0x86b6990
        REFCNT = 2
        FLAGS = (IOK,pIOK)
        IV = 2

      The first case seems inconsistent with the description in Statement Modifiers:

      The "foreach" modifier is an iterator: it executes the statement once for each item in the LIST (with $_ aliased to each item in turn).

      In the first case, a single variable (SV at the same address and with an IV at the same address) has different values assigned to the IV in each iteration. This appears to be a single variable modified at run time rather than aliases to the distinct elements of a list created at compile time.
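For comparison, the aliasing that the quoted Statement Modifiers text describes can be observed directly when the LIST is an array. This is only a sketch with made-up names (@items, $all_aliased); it compares the address of $_ with the address of the matching array element and then writes through $_.

```perl
use strict;
use warnings;

my @items       = (1, 2);
my $i           = 0;
my $all_aliased = 1;

for (@items) {
    # $_ should be the array element itself, not a copy of it.
    $all_aliased &&= ( \$_ == \$items[ $i++ ] );
    $_ *= 10;    # writes through the alias into @items
}

print "items after loop: @items\n";
print "every \$_ was the element itself: ", ( $all_aliased ? "yes" : "no" ), "\n";
```

A range is a different situation from an array: perlop notes that, in the current implementation, no temporary array is created when the range operator is used as the expression in a foreach loop.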

      In the second case, even though the only difference from the first case is the post-increment on $_, the variables are quite different: this time each SV is at a different address (different from each other and different from the single address in the first case), but the associated IVs are, in each iteration, at the same address, though this is a different address from the first case. Furthermore, they now have PADTMP set. The fact that the IV is at the same address in each iteration suggests that these are still not simple aliases to distinct elements of a list - the IV is being modified at run time.

      The third case confirms that $_ isn't a simple alias to the list elements: again, each iteration has an SV at a different address - different from each other and different from those in the second case. There is, again, one IV for all iterations, but its address is not the same as in the second case. Furthermore, the same values appear in the IVs as in the second case. Thus, either $_ isn't an alias to the elements of the list, or a separate list is produced for each instance of (1..2), as the post-increment in the second case doesn't affect the values seen in the third case. This is quite different from the results in your previous example.

      This defies any simple explanation of cached lists generated at compile time and $_ being aliased to elements of such lists.

      The fourth case produces the exact same result as the first case. Thus there are not simply new SVs and IVs generated each time. There is some reuse.

      The fifth case is back to list assignment. This case is different again: in each iteration $_ is an SV with a different address (different in each iteration and different from all previous cases) and each SV has an IV at a different address (again, different in each iteration and different from all previous cases). Thus, it seems that when the LHS of a list assignment in list context is immortal the SV (is it an lvalue?) produced is not simply that from the RHS.

      By the way, I am looking at this in an effort to improve the documentation of the assignment operator (http://rt.perl.org/rt3/Public/Bug/Display.html?id=68312). In the beginning I simply wanted to add definitions of "list assignment" and "scalar assignment", as the terms were already being used without being defined. It seems I have jumped into a barrel of worms - as with the cat in the box, it is indeterminate whether this is more or less fun than a barrel of monkeys, or whether we are all dead or alive.

        Very interesting. Where in the source were you looking?

        pp_aassign in pp_hot.c. The ops (as seen in -MO=Concise) are found in pp_*.c. Prefix pp_ to the name seen in -MO=Concise and that's the name of the function that implements them.
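If you'd rather stay inside a script than run -MO=Concise from the shell, something like the following sketch should list the ops for one sub. swap_demo is a hypothetical sub invented for the example; walk_output and compile are the B::Concise functions described in that module's documentation.

```perl
use strict;
use warnings;
use B::Concise ();

# Hypothetical sub whose body contains list assignments.
sub swap_demo {
    my ( $p, $q ) = ( 1, 2 );
    ( $p, $q ) = ( $q, $p );
    return $p;
}

# Capture the op listing in a string instead of printing to STDOUT.
B::Concise::walk_output( \my $listing );
B::Concise::compile( '-exec', \&swap_demo )->();

# Op names are shown without the pp_ prefix; "aassign" is the op
# implemented by pp_aassign.
my @assign_lines = grep { /\baassign\b/ } split /\n/, $listing;
print "$_\n" for @assign_lines;
```

The exact flags and targets printed next to each op vary between perl versions, so only the op names are worth matching on.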

        It seems that different cases are handled quite differently

        I don't know why you say that. The only difference I see is accounted for by the fact that post-increment (and post-decrement) return a copy of the variable's original value. (They can't return the variable they are incrementing or decrementing because it no longer has the right value.)
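A small sketch of that point, outside any loop: take a reference to whatever $n++ returns and compare it with a reference to $n itself. The names $n and $returned are just for illustration.

```perl
use strict;
use warnings;

my $n = 1;

# Reference to the value returned by the post-increment.
my $returned = \( $n++ );

print "returned value: $$returned\n";    # the old value, 1
print "variable now:   $n\n";            # the incremented value, 2
print "same scalar?    ", ( $returned == \$n ? "yes" : "no" ), "\n";
```

The returned scalar holds the old value and lives at a different address from $n, which is consistent with Dump($_++) showing a separate SV from Dump($_).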

        This defies any simple explanation of cached lists generated at compile time

        I don't know how you can say that. You haven't tested cached lists at all. You never use the same list twice.