Intro

Ok, I know the reaction you'll have when you read the title of this meditation: don't. Well, before saying anything else, let me tell you that I agree: don't. But please read on...

Indeed this stems directly from a clpmisc post entitled Modifying $_ in "map", with an array containing a gap... (link @ GG - I won't put others, but if you don't have a newsreader installed, you can give the msg id as an argument to http://groups.google.com/groups?threadm= and you will be brought to the right post in the context of its thread) and my own first reaction was: don't! Actually for and map have different typical applications, and the fact that the latter may be abused for its side effects in place of the former is a whole other story. Incidentally, just a few days ago Tanktalus used them as topical examples of tools that have similarities but are "merely tangentially related". Anyway, reading on in the thread I realized that the question was subtler and less trivial than I had expected. To quote the first post:

I'm sorry I'm not good at English. :-)

foreach and map functions show the same result when an array has no gap.

    @array = (1, 2, 3, 4);
    foreach (@array) { $_ *= 10 }   # now, $array = (10, 20, 30, 40)

    @array = (1, 2, 3, 4);
    map { $_ *= 10 } @array;        # now, $array = (10, 20, 30, 40)

However, if an array contains a gap...

     1  $array1[0] = 0;
     2  $array1[9] = 9;   # now $array1 = (0, undef, undef, ... , 9);
     3  print "@array1", "\n";  # 0 "" "" ... "" 9
     4  foreach (@array1) {
     5      $_ *= 10   # $array1 = (0, 0, 0, ... , 90)
     6  }
     7  print "@array1", "\n";  # 0 0 0 ... 0 90
     8
     9
    10  $array2[0] = 0;
    11  $array2[9] = 9;
    12  print "@array2", "\n";  # 0 "" "" ... "" 9
    13  map { $_ *= 10 } @array2;   # ERROR!!!!!!
    14  print "@array2", "\n";

line 1-7 work well, but using map, line 13 reports an error:

    Modification of a read-only value attempted at t2.pl line 13.

Before line 13, line 12 prints the intervening elements, treating undef as null string. Then why does line 13 make such error? Is it a bug? or...?

In what follows I'll largely quote and paraphrase from that thread: read it if you want to pinpoint exact attributions, I'm not claiming anything for myself. It was just instructive for me and I'm reporting it here, although I'm sure many monks already knew.

Explanation

The explanation was given, amongst others, by Brian McCauley and "Xho": yes, it's a bug, or at least a mal-feature. The subtlety lies in the difference between elements of an array that are undef and ones that are non-existent, as explained in perldoc -f exists. The latter are represented by a special kind of fly-weight undef used in sparse arrays (specifically the read-only "one true undef", aka PL_sv_undef), which should get promoted to an ordinary, modifiable undef automatically when needed. Apparently being aliased through map or grep inhibits this automatic promotion, while being aliased through foreach doesn't.
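To make the undef versus non-existent distinction concrete, here is a minimal sketch (the variable names are mine, not from the thread). It only relies on defined and exists as documented in perlfunc:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @sparse;
$sparse[0] = 0;
$sparse[3] = 3;                      # slots 1 and 2 were never assigned: gaps

my @filled = (0, undef, undef, 3);   # slots 1 and 2 hold real undef values

# Both arrays look the same through defined() ...
my $sparse_defined = defined $sparse[1] ? 1 : 0;
my $filled_defined = defined $filled[1] ? 1 : 0;

# ... but exists() tells the two kinds of "nothing" apart.
my $sparse_exists = exists $sparse[1] ? 1 : 0;
my $filled_exists = exists $filled[1] ? 1 : 0;

print "sparse: defined=$sparse_defined exists=$sparse_exists\n";
print "filled: defined=$filled_defined exists=$filled_exists\n";
```

The gap in @sparse is neither defined nor existent, while the slot in @filled exists but is undef.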

Example:

    picard:~ [13:47:28]$ perl -le '$x[5]=5; $_*=10 for @x; print "@x"'
    0 0 0 0 0 50
    picard:~ [13:47:32]$ perl -le '$x[5]=5; $_*=10 for grep 1, @x; print "@x"'
    Modification of a read-only value attempted at -e line 1.
    picard:~ [13:47:36]$ perl -le '@x=0..5; $_*=10 for grep 1, @x; print "@x"'
    0 10 20 30 40 50
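If you want to check what your own perl does without having the script die, a small probe along these lines should do. This is only a sketch: the variable names are mine, and I make no claim about which branch a given perl version will take for the map case.

```perl
#!/usr/bin/perl
use strict;
use warnings;
no warnings 'uninitialized';   # the gaps are undef and we multiply them on purpose

# Aliasing through foreach: the gaps get filled in place.
my @via_for;
$via_for[5] = 5;
$_ *= 10 for @via_for;
print "for:  @via_for\n";

# Aliasing through map: trap the possible failure instead of dying.
my @via_map;
$via_map[5] = 5;
my $map_ok    = eval { map { $_ *= 10 } @via_map; 1 };
my $map_error = $map_ok ? '' : $@;

print $map_ok ? "map:  no error on this perl\n" : "map:  died: $map_error";
```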

Mirco Wahab experimented a little with Devel::Peek:

Starting with:

    my @array;
    $array[0] = 0;
    $array[9] = 9;

it can be seen that the elements 1..8 will be NULL SV's with a refcount of 1:

    SV = NULL(0x0) at 0x182d528
      REFCNT = 1
      FLAGS = ()

When aliased through for, these NULL SV's are replaced dynamically by something strange:

    SV = PVLV(0x18b3164) at 0x182d510
      REFCNT = 2
      FLAGS = (GMG,SMG)
      IV = 0
      NV = 0
      PV = 0
      MAGIC = 0x18e3474
        MG_VIRTUAL = &PL_vtbl_defelem
        MG_TYPE = PERL_MAGIC_defelem(y)
      TYPE = y
      TARGOFF = 1
      TARGLEN = -1
      TARG = 0x1b8d8e4
      SV = PVAV(0x22b4c4) at 0x1b8d8e4
        REFCNT = 3
        FLAGS = (PADBUSY,PADMY)
        IV = 0
        NV = 0
        ARRAY = 0x20093bc
        FILL = 9
        MAX = 11
        ARYLEN = 0x0
        FLAGS = (REAL)
        Elt No. 0
        SV = IV(0x1823db8) at 0x1ffca94
          REFCNT = 1
          FLAGS = (IOK,pIOK)
          IV = 0
        Elt No. 1
        Elt No. 2
        Elt No. 3

When aliased through map, the same NULL SV's stay NULL and get a refcount of about 2^31 plus the READONLY flag:

    SV = NULL(0x0) at 0x224b30
      REFCNT = 2147479514
      FLAGS = (READONLY)
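The exact script behind these dumps isn't quoted here, so the following is only my guess at a way to reproduce them: dump whatever $_ is aliased to for one of the gaps, once under for and once under map. The counter $i is just a device of mine to pick the element at index 1; Dump writes to STDERR, and the output will differ in addresses and possibly in shape from one perl to another.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Devel::Peek 'Dump';

my @array;
$array[0] = 0;
$array[9] = 9;

# Dump what $_ is aliased to for the gap at index 1, first under foreach ...
my $i = 0;
for (@array) {
    Dump($_) if $i++ == 1;
}

# ... and then under map (the empty list keeps map from collecting anything).
$i = 0;
map { Dump($_) if $i++ == 1; () } @array;
```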

Further subtleties

Further oddities happen when you pass an array with gaps to a subroutine: a gap in the argument will leave one in @_ and modifying the missing element in it gives no error but does not modify the missing element in the original array:

    picard:~ [13:53:28]$ cat inc.pl
    #!/usr/bin/perl -l

    use strict;
    use warnings;

    sub inc { $_++ for @_ }

    my @array;
    $array[1]=666;
    inc @array;
    print "@array";

    __END__
    picard:~ [13:53:31]$ ./inc.pl
    Use of uninitialized value in join or string at ./inc.pl line 11.
     667

Again, Mirco Wahab verified that in this case the NULL SV is autovivified inside the sub, but after exiting the sub the vivified SV* isn't propagated back to the original array:

    picard:~ [14:15:22]$ cat inc.pl
    #!/usr/bin/perl

    use strict;
    use warnings;
    use Devel::Peek 'Dump';

    sub inc { $_++ for @_; Dump $_[0] }

    my @array;
    $array[1]=666;
    inc @array;
    Dump $array[0];

    __END__
    picard:~ [14:15:35]$ ./inc.pl
    SV = PVNV(0x8152908) at 0x819ab20
      REFCNT = 1
      FLAGS = (IOK,pIOK)
      IV = 1
      NV = 0
      PV = 0
    SV = NULL(0x0) at 0x819ab20
      REFCNT = 1
      FLAGS = ()

Re: map vs for (not the usual breed)
by shmem (Chancellor) on Jun 27, 2007 at 13:53 UTC
    Indeed. I stumbled over the Further subtleties in another recent thread, trying to explain what was happening if an array with holes is passed into a sub. ysth corrected me there.

    There's a distinction between nonexistent and nonexistent; if you pass array elements into a sub, even nonexistent slots will be created and assigned, and the underlying array expanded as appropriate, whereas if you pass in an array, it is flattened into a list, and nonexistent array slots aren't made into aliases.
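A minimal sketch of the two cases side by side (the array names are mine; inc is the sub from the root post). The first case should behave the same everywhere, since perlsub documents that a nonexistent element passed as an argument gets created when it is modified. For the second case the script just reports what it finds, as I wouldn't want to promise the same answer on every perl:

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub inc { $_++ for @_ }

# Case 1: pass the nonexistent element itself.
my @by_element;
$by_element[1] = 666;
inc($by_element[0]);
my $element_created = exists $by_element[0] ? 1 : 0;
print "element passed: exists=$element_created value=$by_element[0]\n";

# Case 2: pass the whole array, which is flattened into a list.
my @by_array;
$by_array[1] = 666;
inc(@by_array);
my $gap_survived = exists $by_array[0] ? 0 : 1;
print "array passed:   gap ", ($gap_survived ? "still there" : "filled in"), "\n";
```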

    The behaviour might be called a bug or a malfunction, but thinking about it - it could be taken as a feature also, e.g. for chaining subs where you pass @_ along, as in

        #!/usr/bin/perl
        $\ = "\n";
        @y[3,5] = (13,15);
        @x = x(@y);
        print '@y = (',join (',', map {"'$_'"} @y),')';
        print '@x = (',join (',', map {"'$_'"} @x),')';
        sub x {
            $_++ for @_;
            @_[1,4] = (2,5);
            &z;
        }
        sub z {
            $_ *= 3 for @_;
            @_;
        }
        __END__
        @y = ('','','','42','','48')
        @x = ('3','6','3','42','15','48')

    and expect non-existing slots of the original array not to be modified, i.e. you want the holes preserved. I guess that could make perfect sense for e.g. matrix operations, where you don't want undefined elements of sparse arrays to spring into existence merely because you $_ **= 2 for @_.
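If one wants the holes preserved without relying on how @_ aliasing treats them, an explicit alternative is to work through a reference and test each slot with exists. A minimal sketch, where square_existing is a hypothetical helper of my own and not anything from the thread:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: square only the slots that actually exist,
# working through a reference so nothing depends on @_ aliasing.
sub square_existing {
    my ($aref) = @_;
    for my $i (0 .. $#$aref) {
        next unless exists $aref->[$i];   # skip the holes, do not vivify them
        $aref->[$i] **= 2;
    }
    return;
}

my @y;
@y[3, 5] = (13, 15);        # slots 0, 1, 2 and 4 stay nonexistent
square_existing(\@y);

my @holes = grep { !exists $y[$_] } 0 .. $#y;
print "values: $y[3] $y[5]\n";   # values: 169 225
print "holes:  @holes\n";        # holes:  0 1 2 4
```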

    Of course that behaviour should be documented somewhere (not only in PM threads :-)

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
      sub x { $_++ for @_; @_[1,4] = (2,5); &z; }

      You're probably right (in your doubt that this may be a feature rather than a mal-feature - I see arguments for both possibilities). But leaving aside that the ability to modify @_ and the &-form of sub call still exist, I suppose that if one really wanted to do this sort of manipulation, then the way to go would be magic goto, which would avoid the call altogether, and thus using the stack for... err... well, nothing.

        Well no... it would not avoid the call, but fake the current call as that one.

        The magic goto really introduces a bit more (or another) overhead, in that it looks back and forth, saves @_, carefully disassembles the current stack frame and replaces it with the one of the called sub, and then restores @_. The magic goto is most useful to replace the current subroutine, mainly to eliminate AUTOLOAD from what caller() might report. So in replacing &z with goto &z we are trading localizing @_ against disassembly of a whole sub frame. I don't know what's "cheaper".

        &z is really shorthand for z(@_), with the addition that it bypasses the argument prototypes of sub z (if any).
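A small sketch of the differences discussed above, using hypothetical subs of my own (depth, via_call, via_goto, drop_first and friends). The first half counts the frames that caller() can see, to show that goto &name replaces the current frame instead of adding one. The second half shows one way in which &name; is not quite name(@_): per perlsub, the callee sees the caller's very own @_.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Number of sub frames visible from inside this sub, itself included.
sub depth { my $d = 0; $d++ while caller($d); return $d }

sub via_call { return depth() }   # ordinary call: adds a frame on top of ours
sub via_goto { goto &depth }      # magic goto: depth() takes over our frame

my $extra_frames = via_call() - via_goto();
print "extra frames with a plain call: $extra_frames\n";

# &name; (no parentheses) hands the callee our very own @_ ...
sub drop_first  { shift @_; return }
sub with_amp    { &drop_first;    return scalar @_ }
# ... whereas name(@_) builds a fresh @_ for the callee.
sub with_parens { drop_first(@_); return scalar @_ }

my $left_amp    = with_amp(1, 2, 3);
my $left_parens = with_parens(1, 2, 3);
print "args left after &drop_first;     $left_amp\n";
print "args left after drop_first(\@_): $left_parens\n";
```

This says nothing about which of the two is cheaper; it only shows what each one does to the call stack and to @_.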

        --shmem
