in reply to Split does not behave like a subroutine

Sometimes there are contradictions. What applies to a list (list value), perhaps does not apply to a list (list value constructor)

Can you show some examples?

In the documentation, "list" is often used for both list value and list value constructor. This makes it difficult to understand the documentation. ... The list concept and especially flattening should be described more clearly in the Perl documentation. ... The risk of losing a positional argument should be explicitly clarified.

I don't disagree that Perl's concept of lists can take some getting used to, and clarifying documentation is always useful. I should note though that, while Perl's documentation is often used as a reference, it's not always perfect, and I would suggest also looking into The Camel and similar books (e.g. Learning Perl, Modern Perl) to see if they help explain it better. In any case, documentation patches are usually a good thing; in my own experience, one might have to revise them a couple of times after feedback from P5P, but that should only help improve them.

Here I use: In a subroutine call the arguments are passed/bound to the parameters (formal arguments) in the subroutine definition. The arguments for a call are evaluated, and the resulting values are passed to the corresponding parameters. I have always thought that the commas in a call to a subroutine separate the argument list into a sequence of values, each corresponding to a parameter: call('P1', 'P2', 'P3'). There can be positional parameters in a subroutine definition.

To me this seems to be the core of your question, and I have to say that what you write here is actually not Perl's concept. You quoted it yourself:

The Perl model for function call and return values is simple: all functions are passed as parameters one single flat list of scalars, and all functions likewise return to their caller one single flat list of scalars. Any arrays or hashes in these call and return lists will collapse, losing their identities ...

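You're saying that in test( 'P1', nop1, 'P3' );, nop1 is a positional parameter, but Perl only sees the argument list after evaluation: whatever list nop1 returns is flattened into the single list that test receives, so there is no fixed second position. You may also note that perlsub makes no mention of "positional" (except in the section on the still-experimental signatures).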

In conservative programming, there should be no subroutine calls in the argument list!?

It's something to be careful with, definitely; there have even been serious security issues related to this. There are workarounds though: Perl's scalar can be used to force a single scalar value, and there are extensions like PerlX::Maybe for pairs of arguments (which are like named parameters, though), and of course there's parameter validation using e.g. Type::Params. (I hesitate to mention it, because they should be used very sparingly and only when one knows what one is doing, but Prototypes can also be used to force scalar context on arguments.)
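
As a minimal sketch of the lost argument and of the scalar workaround: the subs test and nop1 below are hypothetical stand-ins named after the ones in your example, and test merely counts what it receives.

```perl
use strict;
use warnings;

# Hypothetical helpers, named after test() and nop1() from the thread.
sub nop1 { return }              # empty list in list context, undef in scalar context
sub test { return scalar @_ }    # reports how many arguments arrived

# nop1() is evaluated in list context and contributes nothing to the list.
my $flattened = test( 'P1', nop1(), 'P3' );

# scalar() forces exactly one value (here undef), so the slot is kept.
my $forced = test( 'P1', scalar( nop1() ), 'P3' );

print "flattened: $flattened, forced: $forced\n";
```

In the first call 'P3' ends up as the second argument; in the second call it stays third.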

Split looks like a subroutine but does not behave like one.

Correct, there are a few Perl functions that are exceptions and parsed differently from the rest. As LanX mentioned, they can generally be identified by prototype returning undef. Examples are the EXPR forms of map and grep and similar (the BLOCK forms can be emulated using the & prototype). Note that split does behave as if there were an implicit qr// on the pattern.
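
A small sketch of that check, using the builtin prototype function with CORE:: names. The selection of builtins is only an illustration; the hash %proto is a name introduced here.

```perl
use strict;
use warnings;

# Ask Perl for the prototype of a few builtins.
# undef means the builtin gets special parsing.
my %proto;
for my $name (qw( split map grep die unlink chown )) {
    $proto{$name} = prototype("CORE::$name");
    printf "%-6s %s\n", $name,
        defined $proto{$name} ? "($proto{$name})" : 'undef';
}
```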

The meaning of the slashes in /PATTERN/ should be explained.

I'm not sure about this one, since it's just a regular expression like any other (perlop, perlretut, perlre). (Okay, one exception that I can think of off the top of my head: the empty pattern // is treated differently.)
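
To illustrate both points (the first argument is always treated as a pattern, and the empty pattern is special in split), here is a short sketch. The variable names are made up for the example.

```perl
use strict;
use warnings;

# Empty pattern: split between every character.
my @chars = split //, 'abc';

# A plain string is still compiled as a regex. /|/ is an empty
# alternation that matches the empty string, so this also splits
# between every character instead of on the pipe.
my @pipe = split '|', 'a|b';

# Escaping the metacharacter splits on a literal pipe.
my @fixed = split /\|/, 'a|b';

print join( ',', @chars ), "\n";    # a,b,c
print join( ',', @pipe ),  "\n";    # a,|,b
print join( ',', @fixed ), "\n";    # a,b
```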

Re^2: Split does not behave like a subroutine (prototype / updated)
by LanX (Saint) on Jul 18, 2020 at 14:21 UTC
    > Correct, there are a few Perl functions that are exceptions and parsed differently from the rest. As LanX mentioned, they can generally be identified by prototype returning undef.

    I think it's unfortunate that undef means ...

    • special parsing for CORE::builtins.
    • default LIST for other subs

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

    UPDATE

    Am I wrong or is there no difference between

    • no prototype
    • prototype (@)

    hence both equally and indistinguishably allow calling func(LIST)?

      I think it's unfortunate that undef means special parsing for CORE::builtins. / default LIST for other subs

      Yes, I agree, it's unfortunate. However, from a quick check it seems that all Perl builtins that accept a plain list explicitly have a @ prototype (like die, unlink, chown), and only those with special parsing return undef for their prototype.

      Am I wrong or is there no difference between no prototype / prototype (@)

      I think that's true, yes.

        > that accept a plain list explicitly have a @ prototype

        Oh, that's great!

        Cheers Rolf

Re^2: Split does not behave like a subroutine
by bojinlund (Monsignor) on Jul 22, 2020 at 14:35 UTC

    Thanks haukex for the answer!

    Sometimes there are contradictions. ... Can you show some examples?

    My original problem was to understand the way from the subroutine call in the script source code (the list of expressions) to the list of parameters (formal argument) in the definition of the subroutine.

    LISTs do automatic interpolation of sublists. That is, when a LIST is evaluated, each element of the list is evaluated in list context, and the resulting list value is interpolated into LIST just as if each individual element were a member of LIST (from List value constructors)
    Like the flattened incoming parameter list, the return list is also flattened on return. (from perlsub)
    The Perl model for function call and return values is simple: all functions are passed as parameters one single flat list of scalars, and all functions likewise return to their caller one single flat list of scalars. Any arrays or hashes in these call and return lists will collapse, losing their identities --but you may always use pass-by-reference instead to avoid this. Both call and return lists may contain as many or as few scalar elements as you'd like. (from perlsub)

    Based on those quotes, learning a little about the Perl interpreter and using B::Deparse and B::Concise, I have learnt a more perlish way of thinking.

    • LIST is a list of expressions in the source code.
    • When LIST is evaluated, the result is stored on the argument stack in the interpreter. During this evaluation the result is flattened.
    • This is the argument list, the input to the subroutine. This list contains references to the argument values.
    • The argument values can be accessed from the subroutine definition by using @_ and $_[n]. @_ is the list of argument values. $_[n] is the value of argument n.

    I have reread the parts of the documentation that cover the subroutine call several times. I still have problems understanding everything. I do not understand it well enough to say that there are contradictions.

    Probably I am still thinking that the text describes what you see in the source and not the result after evaluation.

    The word "list" is often used. Often it is not clear to which list it refers. The word list is also part of the term "list value". Is "list values" many "list value" or is it the values in a list?

    Interface to split

    How or where can you find the restrictions on the arguments to split?

    This split $_[0], $_[1], $_[2]; and this split $_[0], @_[1,2]; should have given the same result ("a", "b", "c")!

    use strict;
    use warnings;
    use 5.010;
    use Path::Tiny qw( path );
    use Data::Dump qw(dump dd ddx);
    use B::Concise qw(set_style add_callback walk_output);

    sub concise {
        my $case = shift;
        my $fh;
        say '';
        my $walker = B::Concise::compile( '-src', '-basic', $case );
        $fh = path( 'split_call_' . $case )->openw_utf8;
        #walk_output($fh);
        $walker->();
        say '';
        $walker = B::Concise::compile( '-src', '-exec', $case );
        $walker->();
    }

    my $pat   = ':';
    my $str   = 'a:b:c';
    my $limit = -1;
    my @par = ( $pat, $str, $limit );

    if (1) {
        sub splitC {
            my @rv = split $_[0], $_[1], $_[2];
            return \@rv;
        }
        ddx splitC(@par);
        concise('splitC');
    }

    if (1) {
        sub splitC1 {
            my @rv = split $_[0], @_[1,2];
            return \@rv;
        }
        ddx splitC1(@par);
        concise('splitC1');
    }
    __DATA__
    Part of output:

    # split_call.pl:34: ["a", "b", "c"]
    main::splitC:
    8 <1> leavesub[1 ref] K/REFC,1 ->(end)
    - <@> lineseq KP ->8
    # 30: my @rv = split $_[0], $_[1], $_[2];
    1 <;> nextstate(main 3 split_call.pl:30) v:*,&,{,x*,x&,x$,$,fea=1 ->2
    4 </> split(/":"/ => @rv:3,4)[t5] vK/LVINTRO,ASSIGN,LEX ->5
    3 <|> regcomp(other->4) sK ->9
    - <1> ex-aelem sK/2 ->3
    - <1> ex-rv2av sKR/STRICT,1 ->-
    2 <#> aelemfast[*_] s ->3
    - <0> ex-const s ->-
    - <1> ex-aelem sK/2 ->a
    - <1> ex-rv2av sKR/STRICT,1 ->-
    9 <#> aelemfast[*_] s/key=1 ->a
    - <0> ex-const s ->-
    - <1> ex-aelem sK/2 ->4
    - <1> ex-rv2av sKR/STRICT,1 ->-
    a <#> aelemfast[*_] s/key=2 ->4
    - <0> ex-const s ->-
    # 31: return \@rv;
    5 <;> nextstate(main 4 split_call.pl:31) v:*,&,{,x*,x&,x$,$,fea=1 ->6
    - <@> return K ->-
    - <0> pushmark s ->6
    7 <1> srefgen sK/1 ->8
    - <1> ex-list lKRM ->7
    6 <0> padav[@rv:3,4] lRM ->7

    # split_call.pl:45: [-1]
    main::splitC1:
    i <1> leavesub[1 ref] K/REFC,1 ->(end)
    - <@> lineseq KP ->i
    # 41: my @rv = split $_[0], @_[1,2];
    b <;> nextstate(main 10 split_call.pl:41) v:*,&,{,x*,x&,x$,$,fea=1 ->c
    e </> split(/":"/ => @rv:10,11)[t5] vK/LVINTRO,ASSIGN,LEX,IMPLIM ->f
    d <|> regcomp(other->e) sK ->j
    - <1> ex-aelem sK/2 ->d
    - <1> ex-rv2av sKR/STRICT,1 ->-
    c <#> aelemfast[*_] s ->d
    - <0> ex-const s ->-
    o <@> aslice sK ->p
    j <0> pushmark s ->k
    - <1> ex-list lK ->m
    - <0> ex-pushmark s ->k
    k <$> const[IV 1] s ->l
    l <$> const[IV 2] s ->m
    n <1> rv2av[t4] sKR/STRICT,1 ->o
    m <#> gv[*_] s ->n
    p <$> const[IV 0] s ->e
    # 42: return \@rv;
    f <;> nextstate(main 11 split_call.pl:42) v:*,&,{,x*,x&,x$,$,fea=1 ->g
    - <@> return K ->-
    - <0> pushmark s ->g
    h <1> srefgen sK/1 ->i
    - <1> ex-list lKRM ->h
    g <0> padav[@rv:10,11] lRM ->h

      The word "list" is often used. Often it is not clear to which list it refers.

      I don't normally think about lists in a way as complicated as you are here, but for initial learning that's probably fine (Update: to clarify: it's fine as a learning method). Personally, I think about "lists" in Perl as there being only one kind of list, and it's a somewhat loose term that can refer to argument lists (which the subroutine ends up seeing as an array), list value constructors, and return values. AFAICT, your description in your node seems to fit "lists" in general pretty well, so unfortunately I'm not quite sure what your specific question is here?

      sub foo {
          my @x = ("a", @_, "b");    # interpolation
          return "x", @x, "y";       # interpolation
      }
      my @y = ("i", "j");
      my @z = foo("r", @y, "s");     # interpolation
      # @z is ("x", "a", "r", "i", "j", "s", "b", "y")
      # also: comma operator in scalar context via "return"
      my $x = foo("u", "v");
      # $x is "y" !!

      This list contains references to the argument values.

      Just to nitpick this, note that these are not "references" in the sense of hard references described in perlref. They are more commonly referred to as aliases.
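
      A minimal sketch of what "alias" means in practice. The subs bump and copy_bump are hypothetical names made up for this example.

      ```perl
      use strict;
      use warnings;

      # $_[0] is an alias to the caller's variable, so this changes it in place.
      sub bump { $_[0]++; return }

      # Copying @_ into lexicals breaks the alias; only the copy is changed.
      sub copy_bump { my ($n) = @_; $n++; return $n }

      my $count = 1;
      bump($count);                       # $count is now 2
      my $other = copy_bump($count);      # $count is still 2, $other is 3

      print "count: $count, other: $other\n";
      ```

      No \ operator and no dereferencing is involved, which is the difference from a hard reference.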

      How or where can you find the restrictions on the arguments to split?

      There is a huge caveat to the "arguments to subs are flattened": Prototypes, which I suggest you read up on. These change the way the function call is parsed, and this can include forcing arguments that would normally be flattened / interpolated, like in your case @_[1,2], to in fact be taken as if they have an implicit scalar on them.
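
      Applied to your split example, a stripped-down sketch (using a plain array @par instead of @_, which is an assumption made only to keep it short) looks like this:

      ```perl
      use strict;
      use warnings;

      my @par = ( ':', 'a:b:c', -1 );

      # Three separate scalar expressions: pattern, string, limit.
      my @separate = split $par[0], $par[1], $par[2];

      # The slice is NOT flattened into string and limit. It is evaluated
      # in scalar context, which yields its last element (-1), and that
      # value is used as the string to split.
      my @sliced = split $par[0], @par[1,2];

      print scalar(@separate), " fields: @separate\n";
      print scalar(@sliced),   " field:  @sliced\n";
      ```

      This matches the [-1] in your output and the "aslice sK" (scalar context) line in the op tree.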

      Builtin functions can indeed sometimes be a little confusing in this respect, because they are parsed as with prototypes, even if the prototypes are never stated explicitly for many functions. In fact, the use of prototypes is otherwise generally discouraged because of their often confusing effects on how function calls are parsed, but in the Perl core I believe they have historic significance. The $ prototype is probably the closest equivalent to what's going on with split:

      sub mysplit1 { ... }
      sub mysplit2 ($$;$) { ... }
      my @x = ("a:b:c", -1);
      mysplit1(":", @x);     # arguments to mysplit1 are flattened
      mysplit2(":", @x);     # parsed like mysplit2(":", scalar(@x)) !!!
      &mysplit2(":", @x);    # prototype is ignored, arguments are flattened!
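
      To make the effect observable, here is a runnable sketch with hypothetical subs count_plain and count_proto that only report how many arguments they received. The names and bodies are made up for illustration.

      ```perl
      use strict;
      use warnings;

      # Both subs are declared before the calls so the prototype is known
      # when the calls are compiled.
      sub count_plain        { return scalar @_ }
      sub count_proto ($$;$) { return scalar @_ }

      my @x = ( 'a:b:c', -1 );

      my $plain  = count_plain( ':', @x );     # flattened: (':', 'a:b:c', -1)
      my $proto  = count_proto( ':', @x );     # @x in scalar context: (':', 2)
      my $bypass = &count_proto( ':', @x );    # & skips the prototype: flattened

      print "plain: $plain, proto: $proto, bypass: $bypass\n";
      ```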

      Minor typo fixes.

      Sorry, IMHO you are over-complicating.

      Your definition of LIST is too narrow: it's not only used for function(LIST), and "comma" is not the only list constructor.

      You can find LIST in docs for other cases too.

      For me LIST means a piece of code which ...

      • is compiled in list context
      • returns a list value
      nothing more.

      for instance map {BLOCK} LIST ...

      • map { uc } qw/a b c/
      • map { uc } "a" .. "c"
      • map { uc } grep {...} ...
      • map { uc } "a", "b", "c"
      • map { uc } @a
      Only the last two list constructors allow "list interpolation/flattening", since it's a feature of the "comma operator", which has two variants: ',' and '=>'.

      correction: since it's a feature of the naked "list context" w/o operator, what comma does is just propagating the list context down the tree, hence @a=@b,@c is just @a = (@b),(@c)

      FWIW: I use "interpolation" primarily for vars in strings like in print "$a $b";

      But glossary lists both variants

      • interpolation

        The insertion of a scalar or list value somewhere in the middle of another value, such that it appears to have been there all along. In Perl, variable interpolation happens in double-quoted strings and patterns, and list interpolation occurs when constructing the list of values to pass to a list operator or other such construct that takes a LIST.

      Cheers Rolf