Dereferencing of built-ins with crappy names

hurricup has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Dereferencing of built-ins with crappy names (precedence override) by tye (Sage) on Jul 10, 2015 at 21:11 UTC
See p5git://perly.y which contains the following lines (not consecutively):

`scalar : '$' indirob ;`
`indirob : WORD | scalar %prec PREC_LOW | block | PRIVATEREF ;`

It may be that the only reason `$$;` is not parsed as a `'$'` followed by an `indirob` (that is, a `scalar`, namely `$;`) is that the "`%prec PREC_LOW`" causes another interpretation to be preferred (the shift/reduce conflict is resolved in the other direction).

On the other hand, the tokenizer has a lot of power for this type of stuff. I believe the relevant part of p5git://toke.c is `scan_ident()` (see "S_scan_ident"), which will often be called in a `case '$':` section, for example, in `Perl_yylex()`. In particular, see after the line starting "`/* Is the byte 'd' a legal single character identifier name?`". For this case, that eventually leads to almost the last line of that function:

`else if (PL_lex_state == LEX_INTERPNORMAL && !PL_lex_brackets && !intuit_more(s))`

That leads one to read `intuit_more()` (see "S_intuit_more"), which is one of my favorite parts of the tokenizer. Be sure to read the comments above that routine.

But since `$$[` doesn't get parsed as `${$[}`, I don't believe that `intuit_more()` is to blame, which makes me think that none of the toke.c code is to blame. That makes me believe that `scan_ident()` would return `';'` if the lexer called it after finding `'$$'`, which leads me back to "`%prec PREC_LOW`".

But (as you saw if you followed along), this code is not simple, so I'm not actually confident that this is the right answer. You could try changing perly.y, building a new `perl`, and seeing what happens.

- tye
Re: Dereferencing of special variables by LanX (Saint) on Jul 10, 2015 at 18:46 UTC
Dereferencing only makes sense with variables holding references. Why should stuff like $SUBSCRIPT_SEPARATOR or $EFFECTIVE_GROUP_ID ever be dereferenced? If you still need valid info, write a test script checking all special variables. I wouldn't be surprised if older perl versions differ here.

Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :) Je suis Charlie!)

PS: try a more neutral wording of thread titles please.
Re^2: Dereferencing of special variables by hurricup (Pilgrim) on Jul 10, 2015 at 18:54 UTC
Of course there is a (kind of) logic in the parsing rules. And of course the rules could change between versions. That is why the best solution here is to point to the specific place in the Perl sources.
Re: Dereferencing of special variables by Eily (Monsignor) on Jul 10, 2015 at 18:49 UTC
I guess the syntax `$$name` doesn't "confuse" the parser when name is of the form `\w+`, while any punctuation variable will look like the variable `$$` followed by an operator or other token. `$$;` is parsed as `$$ ;`, `$$)` as `$$ )`, etc. But `_` is not a character specific to Perl built-ins (and it is a valid character to start a variable name in every language I can think of), so `$$_` is understood like any other `$$name` variable.
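To make the point above concrete, here is a minimal sketch (the variable names and string values are made up for illustration). It assumes nothing beyond core Perl: `$$ref` and `$$_` dereference as usual, `$$;` is read as the process ID followed by a semicolon, and `$;` can still be dereferenced once the braces are spelled out.

```perl
use strict;
use warnings;

# An ordinary scalar holding a reference to a string.
my $ref = \"hello";

# $$name form: works because "ref" is a plain \w+ identifier.
my $via_name = $$ref;

# "_" is a word character, so $$_ dereferences $_ like any other $$name.
my $via_underscore;
for ($ref) {
    $via_underscore = $$_;
}

# Here "$$;" is the process ID variable $$ followed by the statement terminator.
my $pid = $$;

# Dereferencing the punctuation variable $; needs explicit braces.
my $via_braces;
{
    local $; = \"separator payload";    # illustrative value only
    $via_braces = ${ $; };
}

print "$via_name / $via_underscore / $via_braces\n";
print "pid matches\n" if $pid == $$;
```

The braces remove the ambiguity because the tokenizer then sees a block containing `$;` rather than `$$` followed by another token.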
Re^2: Dereferencing of built-ins with crappy names by hurricup (Pilgrim) on Jul 10, 2015 at 19:02 UTC
Seems you are right. Even the `$/` variable, which can hold a reference, can't be plainly dereferenced. I want to upvote you, but I have no votes left :( Thanks!
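A small sketch of the `$/` case mentioned above, assuming an in-memory filehandle and a made-up eight-byte string. Setting `$/` to a reference to an integer makes reads return fixed-size records, and the braced form `${ $/ }` recovers that integer.

```perl
use strict;
use warnings;

# Illustrative data; any string would do.
my $data = "abcdefgh";
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";

my ($record_size, @records);
{
    # A reference to an integer in $/ requests fixed-size records.
    local $/ = \4;

    # Explicit braces are needed to dereference the punctuation variable.
    $record_size = ${ $/ };

    @records = <$fh>;
}
close $fh;

print "record size: $record_size\n";
print "records: @records\n";
```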
Re: Dereferencing of built-ins with crappy names by 1nickt (Canon) on Jul 10, 2015 at 18:10 UTC
You can always refer to any variable with all its braces:

`$$foo`
`${ $foo }    # same thing`

`my $bar = 'baz';`
`print "$barquux";    # error`
`print "${bar}quux";`

update: added second example

Remember: Ne dederis in spiritu molere illegitimi!
Re^2: Dereferencing of built-ins with crappy names by hurricup (Pilgrim) on Jul 10, 2015 at 18:26 UTC
Yes, I know that. The question is more about perl parsing. Will add this, thanks.
Re: Dereferencing of built-ins with crappy names by 1nickt (Canon) on Jul 10, 2015 at 18:31 UTC
SPECIAL VARIABLES

Remember: Ne dederis in spiritu molere illegitimi!
Re^2: Dereferencing of built-ins with crappy names by 1nickt (Canon) on Jul 10, 2015 at 18:44 UTC
Gosh, I'm sorry that was worth a negative vote from you. Was it really a coincidence that all the examples you gave of variables that you couldn't dereference are included on the list of Perl special variables? Since you dislike them so much, I'm sure there's still time for you to get on over to perl6 and volunteer some time rewriting the core so the internals don't have such "crappy" names ...

Remember: Ne dederis in spiritu molere illegitimi!
Re^3: Dereferencing of built-ins with crappy names by hurricup (Pilgrim) on Jul 10, 2015 at 18:51 UTC
Too much for such a question :) I'm working on a parser specifically for Perl 5, not 6 (yet), and need to know this :) And no, it's not a coincidence; maybe I just used a bad phrase with "crappy names". I'm talking about, let's say, unusual variables, not <sigil><basic_identifier>. Let's say ambiguous ones.
Re^2: Dereferencing of built-ins with crappy names by hurricup (Pilgrim) on Jul 10, 2015 at 18:38 UTC
I need to know which of them can't be plainly dereferenced because the perl parser says so, not a list of all built-ins.
Re: Dereferencing of built-ins with crappy names by Anonymous Monk on Jul 10, 2015 at 19:40 UTC
Well, it appears to be documented, actually.

"Anywhere you'd put an identifier (or chain of identifiers) as part of a variable or subroutine name, you can replace the identifier with a simple scalar variable containing a reference of the correct type" - perlref

About identifiers:

If working under the effect of the "use utf8;" pragma, the following rules apply:

`/ (?[ ( \p{Word} & \p{XID_Start} ) + [_] ]) (?[ ( \p{Word} & \p{XID_Continue} ) ]) * /x`

That is, a "start" character followed by any number of "continue" characters. Perl requires every character in an identifier to also match "\w" (this prevents some problematic cases); and Perl additionally accepts identifier names beginning with an underscore.

If not under "use utf8", the source is treated as ASCII + 128 extra controls, and identifiers should match

`/ (?aa) (?!\d) \w+ /x`

- perldata, "Identifier parsing"
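As a rough illustration of the non-utf8 rule quoted above, the sketch below checks a few candidate names against that pattern. The helper `looks_like_basic_identifier` is hypothetical (it is not part of perldata or of perl itself), the pattern is anchored to the whole string, and the `/aa` flag is used in place of the inline `(?aa)`. It only tests the shape of a name and says nothing about how the parser treats `$$` followed by punctuation.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $name has the shape of a basic (non-utf8)
# identifier: one or more ASCII word characters, not starting with a digit.
sub looks_like_basic_identifier {
    my ($name) = @_;
    return $name =~ / \A (?!\d) \w+ \z /xaa ? 1 : 0;
}

# A few sample names: ordinary ones, underscore forms, and punctuation.
for my $name ('foo', '_', '_bar', '1abc', ';', '/') {
    printf "%-6s %s\n", "\$$name",
        looks_like_basic_identifier($name)
            ? 'basic identifier'
            : 'not a basic identifier';
}
```

Under this check `_` counts as a basic identifier while `;` and `/` do not, which lines up with the observation that `$$_` dereferences plainly but `$$;` does not.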
Re^2: Dereferencing of built-ins with crappy names by Anonymous Monk on Jul 10, 2015 at 19:45 UTC
Come to think of it, that's not about it at all. Anyway, hopefully the rules of identifier parsing will be useful for your project...