First, many thanks for giving the internal perspective (I was hoping you would take the time to do this!). I am not terribly familiar with the Perl source code, but the opcode names in opcode.h seem consistent with your point about map being shorthand for a looping op-tree. In EXTCONST char* const PL_op_name[], "map" has its own opcode name, and I also see two others: "mapstart" and "mapwhile". Are these the opcodes for the loop you are talking about? (Header files can be deceiving if you don't know the code base well.)

grep acts like map with regard to returns. It also seems to treat its block like a loop and, not surprisingly, it too has three op code names: "grep", "grepstart", and "grepwhile".

On the other hand, I'm thinking that sort may be implemented as something closer to a function. Unlike "map" or "grep", it has only the one opcode, "sort". Also, as mentioned earlier in this thread, it treats returns as if the block were an eval {} or an anonymous subroutine. What is your take, given your greater experience with internals?

Also, is there any guideline or rule of thumb that can be used to determine how routines listed in index-functions are going to treat a block? It seems like there ought to be something other than testing code samples, knowing internals, or word-of-mouth from other Perl programmers.

Best, beth

Re^3: map and return
by ikegami (Patriarch) on Sep 03, 2009 at 17:09 UTC

    My knowledge of internals is mostly limited to what B::Concise and Devel::Peek output. Fortunately, this falls within that realm.

    map's block is inlined:

    $ perl -MO=Concise,-exec -e'@b = map { foo() } @a'
    1  <0> enter
    2  <;> nextstate(main 2 -e:1) v
    3  <0> pushmark s
    4  <0> pushmark s
    5  <#> gv[*a] s
    6  <1> rv2av[t6] lKM/1
    7  <@> mapstart lK*/2
    8  <|> mapwhile(other->9)[t7] lK/1
    9      <0> pushmark s
    a      <#> gv[*foo] s/EARLYCV
    b      <1> entersub[t4] lKS/TARG,1
    -      <@> scope lK
               goto 8
    c  <0> pushmark s
    d  <#> gv[*b] s
    e  <1> rv2av[t2] lKRM*/1
    f  <2> aassign[t8] vKS/COMMON
    g  <@> leave[1 ref] vKP/REFC
    -e syntax OK

    Same for grep:

    $ perl -MO=Concise,-exec -e'@b = grep { foo() } @a'
    1  <0> enter
    2  <;> nextstate(main 2 -e:1) v
    3  <0> pushmark s
    4  <0> pushmark s
    5  <#> gv[*a] s
    6  <1> rv2av[t6] lKM/1
    7  <@> grepstart lK*/2
    8  <|> grepwhile(other->9)[t7] lK/1
    9      <0> pushmark s
    a      <#> gv[*foo] s/EARLYCV
    b      <1> entersub[t4] sKS/TARG,1
    -      <@> scope sK
               goto 8
    c  <0> pushmark s
    d  <#> gv[*b] s
    e  <1> rv2av[t2] lKRM*/1
    f  <2> aassign[t8] vKS/COMMON
    g  <@> leave[1 ref] vKP/REFC
    -e syntax OK

    For comparison, here's what a foreach loop looks like:

    $ perl -MO=Concise,-exec -e'for (@a) { foo() }'
    1  <0> enter
    2  <;> nextstate(main 2 -e:1) v
    3  <0> pushmark sM
    4  <#> gv[*a] s
    5  <1> rv2av[t2] sKRM/1
    6  <#> gv[*_] s
    7  <{> enteriter(next->c last->f redo->8) lKS
    d  <0> iter s
    e  <|> and(other->8) vK/1
    8      <;> nextstate(main 1 -e:1) v
    9      <0> pushmark s
    a      <#> gv[*foo] s/EARLYCV
    b      <1> entersub[t4] vKS/TARG,1
    c      <0> unstack v
               goto d
    f  <2> leaveloop vK/2
    g  <@> leave[1 ref] vKP/REFC
    -e syntax OK

    Sort can call a sub, so it makes a sub from the block:

    $ perl -MO=Concise,-exec -e'@b = sort foo @a'
    1  <0> enter
    2  <;> nextstate(main 1 -e:1) v
    3  <0> pushmark s
    4  <0> pushmark s
    5  <$> const[PV "foo"] s/BARE
    6  <#> gv[*a] s
    7  <1> rv2av[t4] lK/1
    8  <@> sort lKS
    9  <0> pushmark s
    a  <#> gv[*b] s
    b  <1> rv2av[t2] lKRM*/1
    c  <2> aassign[t5] vKS
    d  <@> leave[1 ref] vKP/REFC
    -e syntax OK

    $ perl -MO=Concise,-exec -e'@b = sort { foo() } @a'
    1  <0> enter
    2  <;> nextstate(main 2 -e:1) v
    3  <0> pushmark s
    4  <0> pushmark s
    5  <#> gv[*a] s
    6  <1> rv2av[t6] lK/1
    7  <@> sort lKS*               --> I guess the * means the sub is
    8  <0> pushmark s                  attached to the op rather than
    9  <#> gv[*b] s                    found on the stack.
    a  <1> rv2av[t2] lKRM*/1
    b  <2> aassign[t7] vKS
    c  <@> leave[1 ref] vKP/REFC
    -e syntax OK

    Finally, "&" prototype in action:

    $ perl -MO=Concise,-exec -e'sub faker(&); faker { foo() }'
    1  <0> enter
    2  <;> nextstate(main 2 -e:1) v
    3  <0> pushmark s
    4  <0> pushmark sRM
    5  <$> anoncode[CV ] lRM
    6  <1> refgen KM/1
    7  <#> gv[*faker] s
    8  <1> entersub[t3] vKS/TARG,1
    9  <@> leave[1 ref] vKP/REFC
    -e syntax OK

    $ perl -MO=Concise,-exec -e'sub faker(&); faker sub { foo() }'
    1  <0> enter
    2  <;> nextstate(main 2 -e:1) v
    3  <0> pushmark s
    4  <0> pushmark sRM
    5  <$> anoncode[CV ] lRM
    6  <1> refgen KM/1
    7  <#> gv[*faker] s
    8  <1> entersub[t3] vKS/TARG,1
    9  <@> leave[1 ref] vKP/REFC
    -e syntax OK

    "Are these the opcodes for the loop you are talking about?"

    No, I meant the body of the curlies. In my examples, that would be the call to foo(). How can you pass "a call to foo()" to a sub? You can't. Perl puts it in an anon sub and passes a reference to that sub.
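
To see that from the Perl side, here is a minimal sketch. It assumes a sub foo that just returns 42 and a hypothetical body for faker; neither is defined in the post, only their names are. It shows that faker receives a CODE reference either way, whether the caller writes a bare block or an explicit sub { }.

```perl
use strict;
use warnings;

# Assumed stand-in for the foo() called in the op dumps above.
sub foo { return 42 }

# Hypothetical body for faker: report what kind of thing arrived,
# then call it. The "&" prototype is what allows the bare-block syntax.
sub faker(&) {
    my ($code) = @_;
    return (ref($code), $code->());
}

# Both spellings hand faker a reference to an anonymous sub.
my @from_block = faker { foo() };
my @from_sub   = faker sub { foo() };

print "@from_block\n";
print "@from_sub\n";
```

Both lines should print the same thing, which matches the two identical op dumps above.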

    "is there any guideline or rule of thumb that can be used to determine how routines listed in index-functions are going to treat a block?"

    Whenever possible, subs are avoided. They are expensive, especially when the alternative is just executing the next instruction.

    Think of it this way: If the body of the block is constant, it will be inlined. If it's not, it will become a sub.

    • sort takes a sub as its argument, so it can't be inlined. I guess it could inline sort BLOCK LIST and not sort SUBNAME LIST, but perl uses the same behaviour for both (documented).
    • eval's block acts like a sub (documented). I guess it's easier to implement exceptions that way.
    • sub creates a sub from the block. duh.
    • Every other block is inlined.
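
The practical consequence for return can be checked with a small script. This is only a sketch: the sub names via_map, via_sort and via_eval are made up for illustration. A return inside the inlined map block leaves the enclosing sub, while a return inside a sort or eval block only leaves that block.

```perl
use strict;
use warnings;

# map's block is inlined, so return exits via_map itself.
sub via_map {
    my @seen = map { return "left via_map at $_" if $_ > 1; $_ } @_;
    return "map finished: @seen";
}

# sort's block is called like a sub, so return only ends one comparison.
sub via_sort {
    my @sorted = sort { return $a <=> $b } @_;
    return "sort finished: @sorted";
}

# eval's block acts like a sub too, so return only leaves the eval.
sub via_eval {
    my $got = eval { return 'inner' };
    return "eval finished: $got";
}

my $map_result  = via_map(1, 2, 3);
my $sort_result = via_sort(3, 1, 2);
my $eval_result = via_eval();

print "$map_result\n$sort_result\n$eval_result\n";
```

Only via_map is cut short; the other two run to their final return statement.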

    The body of a prototyped function isn't constant, so the variable part is placed in a sub and passed as a code ref.
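
A sketch of what that means for return, assuming a hypothetical map-like helper apply_block with a (&@) prototype (not a real core or CPAN function): because the block is wrapped in an anonymous sub, a return inside it only leaves that anonymous sub, and the calling sub carries on.

```perl
use strict;
use warnings;

# Hypothetical helper: the (&@) prototype lets callers write
# apply_block { ... } LIST, the same shape as map.
sub apply_block(&@) {
    my ($code, @list) = @_;
    my @out;
    # for aliases $_ to each element, so the block can read $_.
    push @out, $code->() for @list;
    return @out;
}

sub caller_sub {
    # return here leaves only the anonymous sub built from the block.
    my @out = apply_block { return "hit $_" if $_ > 1; "skip $_" } @_;
    return "caller_sub finished: @out";
}

my $result = caller_sub(1, 2, 3);
print "$result\n";
```

So although the call looks like map, a return in the block behaves as it would in sort or eval, not as it would in map or grep.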