LanX has asked for the wisdom of the Perl Monks concerning the following question:

We seem to have a bug when parsing the // operator ...

This works

sub block(&) { return undef; }
print block {'whatever'} || "is undef";

This doesn't

sub block(&) { return undef; }
print block {'whatever'} // "is undef";
__END__
String found where operator expected at bug.pl line 6, near "// "is undef""
        (Missing operator before "is undef"?)
Too many arguments for main::block at bug.pl line 6, near "// "is undef""
syntax error at bug.pl line 6, near "// "is undef""
Execution of bug.pl aborted due to compilation errors.

Tested with 5.14 and 5.22 ... Insights?

Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)
Je suis Charlie!

Re: BUG when parsing defined-or ?
by Anonymous Monk on Jan 25, 2016 at 02:32 UTC
    The anonymonk above is correct. Perl's lexer doesn't expect an operator in this position (although it can handle "unexpected" operators). And this is done in the lexer, not the parser. Specifically, in Perl_yylex (toke.c):
    case '/':   /* may be division, defined-or, or pattern */
        if ((PL_expect == XOPERATOR || PL_expect == XTERMORDORDOR) && s[1] == '/') {
            ...
            AOPERATOR(DORDOR);
        }
        else if (PL_expect == XOPERATOR) {
            ...
            Mop(OP_DIVIDE);
        }
        else {
            ...
            s = scan_pat(s,OP_MATCH);
        }
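To make that branch easier to follow, here is a toy model of the decision in Perl. It is only a sketch of the three-way choice quoted above: classify_slash() is a hypothetical helper, the state names are passed as plain strings, and I am assuming that after the block the lexer is in a term-expecting state (XTERM here), which is what the Deparse output elsewhere in this thread suggests.

```perl
use strict;
use warnings;

# Toy model only: mirrors the shape of the quoted toke.c branch.
# classify_slash() is hypothetical and is not part of perl itself.
sub classify_slash {
    my ($expect, $next_char) = @_;

    # Defined-or is only recognised when an operator is acceptable here.
    my $operator_ok = $expect eq 'XOPERATOR' || $expect eq 'XTERMORDORDOR';
    return 'defined-or' if $operator_ok && $next_char eq '/';

    # A single slash in operator position is division.
    return 'divide' if $expect eq 'XOPERATOR';

    # Everything else starts a pattern, so "//" becomes an empty match.
    return 'pattern';
}

print classify_slash('XOPERATOR', '/'), "\n";   # after a complete term
print classify_slash('XOPERATOR', '2'), "\n";   # plain division
print classify_slash('XTERM',     '/'), "\n";   # where a term is expected
```

Under that assumption the third call is the case from the original post: the lexer never gets as far as considering defined-or.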
    > I'm no expert on how the lexer works but implementing parsing on such heuristics is a fragile thing.
    Oh yes, Perl's lexer is fascinating :) I'm not an expert either, life is too short...

    For example, you said that the concat op works...

    case '.':
        ...
        if (PL_expect == XOPERATOR || !isDIGIT(s[1])) {
            ...
            Aop(OP_CONCAT)
    So, if the lexer doesn't expect an operator, it decides based on whether the next character is a digit... Let's see:
    $ perl -MO=Deparse -e 'sub foo(&) {}; foo { "foo" } . 5'
    sub foo (&) { }
    foo(sub { 'foo'; } ) . 5;
    -e syntax OK
    $ perl -MO=Deparse -e 'sub foo(&) {}; foo { "foo" } .5'
    Too many arguments for main::foo at -e line 1, at EOF
    -e had compilation errors.
    sub foo (&) { }
    &foo(sub { 'foo'; } , 0.5);
    but:
    $ perl -E 'say 1 .5'
    15
    because here it does expect an operator.
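The three cases above can be summarised with another toy model. Again this is only a sketch: classify_dot() is a hypothetical helper, the lexer state is passed as a plain string, the elided parts of the real code are ignored, and the 'number' result is inferred from the 0.5 in the Deparse output rather than from the snippet itself.

```perl
use strict;
use warnings;

# Toy model only: classify_dot() is hypothetical, not part of perl.
sub classify_dot {
    my ($expect, $next_char) = @_;

    # Concatenation wins if an operator is expected OR no digit follows.
    return 'concat' if $expect eq 'XOPERATOR' || $next_char !~ /\A[0-9]\z/;

    # Otherwise the dot is taken as the start of a numeric literal.
    return 'number';
}

print classify_dot('XTERM',     ' '), "\n";   # foo { "foo" } . 5
print classify_dot('XTERM',     '5'), "\n";   # foo { "foo" } .5
print classify_dot('XOPERATOR', '5'), "\n";   # 1 .5
```

So a single space after the dot is enough to flip the result whenever the lexer is not expecting an operator.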

    So, yeah... Just don't think too much about it :)

    > Hopefully this can be fixed in the future.
    That would require fixing the grammar of the Perl 5 programming language; the major version surely should be incremented then and the backwards-incompatible result should be called Perl 6... oh wait...
      > That would require fixing the grammar of the Perl 5 programming language; the major version surely should be incremented then and the backwards-incompatible result should be called Perl 6... oh wait...

      I mainly upvoted because of this pun! xD

      But any backwards incompatibility resulting from fixing this should only affect code that previously did not compile.

      And the grammar doesn't need to be changed, just the analysing technique improved.

      Cheers Rolf

Re: BUG when parsing defined-or ?
by BrowserUk (Patriarch) on Jan 24, 2016 at 22:32 UTC
    > Insights?

    Precedence.

    The sub (&) form attempts to parse what follows the block into a list of values to pass to the subroutine. It is therefore attempting to parse // "is undef" before it calls the sub, and as there is no left-hand operand for the binary operator, it crunches.

    You need parens, either around the arguments for the sub or around the entire sub call, to ensure that the value returned from the sub becomes the left-hand operand of the operator.

    sub block(&) { return undef; };;

    print block {'whatever'} // "is undef";;
    String found where operator expected at (eval 10) line 1, near "// "is undef""
            (Missing operator before "is undef"?)
    [Too many arguments for main::block at (eval 10) line 1, near "// "is undef""
    syntax error at (eval 10) line 1, near "// "is undef""

    print +( block {'whatever'} ) // "is undef";;
    is undef

    print block( sub{'whatever'} ) // "is undef";;
    is undef

    "is undef";;

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.
      > The sub (&) form attempts to parse what follows the block into a list of values to pass to the subroutine,

      This shouldn't be!

      The prototype is explicitly declaring that only one value (i.e. the block) is passed to the sub.

      > as there is no left-hand operand for the binary operator, it crunches.

      || is a binary operator too, and according to perldoc it has the same precedence.

      And as demonstrated, || works as expected.

      Simply put: please explain why || works and // doesn't.

      Cheers Rolf

Re: BUG when parsing defined-or ?
by Anonymous Monk on Jan 25, 2016 at 00:03 UTC

    This is really just a guess, but maybe the logic that disambiguates between defined-or and an empty m// regex (which I remember reading somewhere is fuzzy) is choosing to interpret it as a regex.

    $ perl -MO=Deparse -e 'sub foo (&) {}; foo {"foo"} //'
    Too many arguments for main::foo at -e line 1, at EOF
    -e had compilation errors.
    sub foo (&) { }
    &foo(sub { 'foo'; } , //);
      That was my first idea too ...

      ... but I think something is fundamentally broken with the parsing of prototyped sub calls.

      See how + fails ...

      lanx@lanx-1005HA:/tmp$ perl -MO=Deparse -e 'sub foo (&) {}; foo {"foo"} + 3'
      Too many arguments for main::foo at -e line 1, at EOF
      -e had compilation errors.
      sub foo (&) { }
      &foo(sub { 'foo'; } , 3);
      ... while . works
      lanx@lanx-1005HA:/tmp$ perl -MO=Deparse -e 'sub foo (&) {}; foo {"foo"} . "3"'
      sub foo (&) { }
      foo(sub { 'foo'; } ) . '3';
      -e syntax OK

      Seems like the parser tries to guess if a second parameter follows even if no further parameters are allowed.

      The + fails because it's also a unary operator, so the foo( BLOCK, +3 ) interpretation is preferred, but later rejected, without backtracking into another, binary interpretation.
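For completeness, a minimal sketch of the two parenthesised forms suggested elsewhere in this thread, applied to the + case. run_block() is a hypothetical helper made up for this illustration; the point is only that closing the argument list before the operator leaves no room for the unary reading.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: calls the block, returns its value.
sub run_block(&) {
    my ($code) = @_;
    return scalar $code->();
}

# Parentheses around the whole call end the argument list before the "+".
my $sum_a = (run_block { 40 }) + 2;

# An explicit anonymous sub inside call parentheses has the same effect.
my $sum_b = run_block(sub { 40 }) + 2;

print "$sum_a $sum_b\n";
```

Both forms take the guessing away from the lexer, at the price of losing the clean block syntax.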

      I'm no expert on how the lexer works but implementing parsing on such heuristics is a fragile thing.

      Hopefully this can be fixed in the future.

      Cheers Rolf

      update

      That's a really depressing bug; I had high hopes of using more block prototypes.