hurricup has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, need your help again.

Today I discovered that I can write $$_ to dereference $_, but can't write $$; to dereference $; or $$) to dereference $).

Is there a list, or a Perl source file, where I could find the rules for which variables can be dereferenced this way and which cannot?

I know that I can just type them all and check, but it would be better to find the source. Thanks.

NB: I do know that I can do it with braces. The question is about how perl parses this.

Re: Dereferencing of built-ins with crappy names (precedence override)
by tye (Sage) on Jul 10, 2015 at 21:11 UTC

    See p5git://perly.y which contains the following lines (not consecutively):

    scalar  : '$' indirob
            ;

    indirob : WORD
            | scalar %prec PREC_LOW
            | block
            | PRIVATEREF
            ;

    It may be that the only reason that $$; is not parsed as a '$' followed by an indirob (that is a scalar, namely $;) is because the "%prec PREC_LOW" causes another interpretation to be preferred (the shift/reduce conflict is resolved in the other direction).

    On the other hand, the tokenizer has a lot of power for this type of stuff. I believe the part of p5git://toke.c that is relevant is scan_ident() (see "S_scan_ident") which will often be called in a case '$': section, for example, in Perl_yylex().

    In particular, see after the line starting "/* Is the byte 'd' a legal single character identifier name?". Which eventually, for this case, leads to almost the last line of that function:

    else if (PL_lex_state == LEX_INTERPNORMAL && !PL_lex_brackets && !intuit_more(s))

    Which leads one to read intuit_more() (see "S_intuit_more") which is one of my favorite parts of the tokenizer. Be sure to read the comments above that routine.

    But since $$[ doesn't get parsed as ${$[}, I don't believe that intuit_more() is to blame which makes me think that none of the toke.c code is to blame. Which makes me believe that scan_ident() would return ';' if the lexer called it after finding '$$'.

    Which leads me back to "%prec PREC_LOW". But (as you saw if you followed along), this code is not simple so I'm not actually confident that this is the right answer. You could try changing perly.y and build a new perl and see.

    - tye        

Re: Dereferencing of special variables
by LanX (Saint) on Jul 10, 2015 at 18:46 UTC
    Dereferencing only makes sense with variables holding references.

    Why should stuff like $SUBSCRIPT_SEPARATOR or $EFFECTIVE_GROUP_ID ever be dereferenced?

    If you still need reliable information, write a test script that checks all the special variables.

    I wouldn't be surprised if older perl versions differ here.
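A minimal sketch of the kind of test script suggested here, limited to the names mentioned in this thread. The helper name `compiles_as_term` and the trick of wrapping `$$NAME` in parentheses are assumptions made for illustration: the code is only compiled via string eval, never run, and a parse that succeeds for other reasons (for example a trailing comma) would fool the check, so results for other names need to be read with care.

```perl
use strict;
use warnings;

# Hand-picked names from this thread; extend the list with care.
my @names = ('_', ';', ')', '/');

# Returns true if "$$NAME" compiles as a single term inside parentheses.
# The code is wrapped in an anonymous sub so it is compiled but never run.
sub compiles_as_term {
    my ($name) = @_;
    my $code = 'sub { ($$' . $name . ') }';
    my $sub  = eval $code;    # undef on a syntax error
    return defined $sub ? 1 : 0;
}

my %result = map { $_ => compiles_as_term($_) } @names;

for my $name (@names) {
    printf "%-4s %s\n", '$$' . $name,
        $result{$name} ? 'compiles as one term' : 'does not compile as one term';
}
```

Running the same script under several perl versions would show whether older releases differ.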

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Je suis Charlie!

    PS: try a more neutral wording of thread titles please.

      Of course there is a (kind of) logic in the parsing rules. And of course the rules could change between versions. That is why the best solution here is to point to the specific place in the Perl sources.

Re: Dereferencing of special variables
by Eily (Monsignor) on Jul 10, 2015 at 18:49 UTC

    I guess the syntax $$name doesn't "confuse" the parser when name is of the form \w+, while any punctuation variable will look like the variable $$ followed by an operator or other token. $$; is parsed as $$ ;, $$) as $$ ) etc, but _ is not a perl built-in specific character (and is a valid character to start a variable name in every language I can think of) so $$_ is understood like any other $$name variable.
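The behaviour described above can be observed with a short script. This is only a sketch of the cases from this thread ($$_, $$; and $$)), with variable names chosen for illustration; string eval is used so that the form that fails to parse does not stop the whole script from compiling.

```perl
use strict;
use warnings;

my $target = 'hello';
my $ref    = \$target;

# $$_ : "_" is an ordinary identifier character, so this dereferences $_.
my $via_underscore;
for ($ref) {
    $via_underscore = $$_;
}

# $$; : compiles, but as the process ID ($$) followed by a statement terminator.
my $pid_stmt = eval q{ $$; };

# $$) : does not compile; $$ is followed by a stray ")".
my $compiled = eval q{ sub { $$) } };
my $paren_ok = defined $compiled ? 1 : 0;

# With braces, the punctuation variable can be dereferenced.
my $via_braces;
{
    local $; = $ref;
    $via_braces = ${ $; };
}

print 'via $$_:        ', $via_underscore, "\n";
print '$$; is the PID: ', ($pid_stmt == $$ ? 'yes' : 'no'), "\n";
print '$$) compiles:   ', ($paren_ok ? 'yes' : 'no'), "\n";
print 'via ${ $; }:    ', $via_braces, "\n";
```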

      It seems you are right. Even the $/ variable, which can hold a reference, can't be plainly dereferenced.

      Want to plus you, but no votes left :( Thanks!
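For context, a small sketch of $/ holding a reference: a reference to a positive integer switches the readline operator to fixed-size records, and the braced form is what gets the integer back out. The in-memory filehandle and the variable names are just for illustration.

```perl
use strict;
use warnings;

# An in-memory filehandle keeps the example self-contained.
my $data = 'abcdefgh';
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my ($record, $size);
{
    # A reference to a positive integer makes <$fh> read fixed-size records.
    local $/ = \4;
    $record = <$fh>;

    # Braces are needed to read the integer back through $/.
    $size = ${ $/ };
}
close $fh;

print "record: $record\n";
print "size:   $size\n";
```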

Re: Dereferencing of built-ins with crappy names
by 1nickt (Canon) on Jul 10, 2015 at 18:10 UTC

    You can always refer to any variable with all its braces:

    $$foo
    ${ $foo }    # same thing

    my $bar = 'baz';
    print "$barquux";      # error
    print "${bar}quux";

    update: added second example

    Remember: Ne dederis in spiritu molere illegitimi!

      Yes, I know that. The question is more about perl parsing. Will add this, thanks.

Re: Dereferencing of built-ins with crappy names
by 1nickt (Canon) on Jul 10, 2015 at 18:31 UTC

      Gosh, I'm sorry that was worth a negative vote from you. Was it really a coincidence that all the examples you gave of variables that you couldn't dereference are included on the list of Perl special variables?

      Since you dislike them so much, I'm sure there's still time for you to get on over to perl6 and volunteer some time rewriting the core so the internals don't have such "crappy" names ...

      Remember: Ne dederis in spiritu molere illegitimi!

        Too much for such question :)

        I'm working on a parser specifically for Perl 5, not 6 (yet), and need to know this :)

        And no, it's not a coincidence; maybe I just used a bad phrase with "crappy names". I'm talking about, let's say, unusual variables, not <sigil><basic_identifier>. Let's say ambiguous ones.

      I need to know which of them can't be plainly dereferenced because the perl parser says so. Not a list of all built-ins.

Re: Dereferencing of built-ins with crappy names
by Anonymous Monk on Jul 10, 2015 at 19:40 UTC
    Well it appears to be documented actually.
    Anywhere you'd put an identifier (or chain of identifiers) as part of a variable or subroutine name, you can replace the identifier with a simple scalar variable containing a reference of the correct type - perlref
    About identifiers:
    If working under the effect of the "use utf8;" pragma, the following rules apply:
    / (?[ ( \p{Word} & \p{XID_Start} ) + [_] ]) (?[ ( \p{Word} & \p{XID_Continue} ) ]) * /x
    That is, a "start" character followed by any number of "continue" characters. Perl requires every character in an identifier to also match "\w" (this prevents some problematic cases); and Perl additionally accepts identifier names beginning with an underscore. If not under "use utf8", the source is treated as ASCII + 128 extra controls, and identifiers should match
    / (?aa) (?!\d) \w+ /x
    - perldata, "Identifier parsing"
      Come to think of it, that's not about it at all. Anyway, hopefully the rules of identifier parsing will be useful for your project...
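As a rough illustration of the ASCII rule quoted from perldata, the pattern can be anchored and used to tell basic identifiers apart from punctuation names. The helper name `is_basic_identifier` is made up for this sketch, and the check covers only the non-utf8 rule.

```perl
use strict;
use warnings;

# ASCII identifier rule quoted from perldata, anchored to match the whole string.
sub is_basic_identifier {
    my ($name) = @_;
    return $name =~ / \A (?aa) (?!\d) \w+ \z /x ? 1 : 0;
}

for my $name ('foo', '_', '_private', 'x9', '9lives', ';', ')') {
    printf "%-10s %s\n", $name,
        is_basic_identifier($name) ? 'identifier' : 'not an identifier';
}
```

Under this rule "_" counts as an identifier while ";" and ")" do not, which matches the difference between $$_ and $$; seen in the question.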