Re: Clarifying the Comma Operator (bareword)
by tye (Sage) on Jun 07, 2009 at 04:32 UTC
Excellent suggestion. I wouldn't use "identifier", as that suggests other connotations to me and doesn't fit how I'm used to seeing Perl documented.
I would change "any word" to "a word". I find "any word" potentially misleading, as it could be interpreted as applying to words that are part of some larger expression to the left of the fat comma.
My first choice was to change "any word" to "a bareword", but the definition given for "bareword" in perldata is actually too narrow for this case. I define "bareword" as a "bare word", an unadorned word that Perl first tries to interpret as an operator or function call and then resorts to quoting if strict.pm doesn't prevent it. perldata defines "bareword" as "a bare word that doesn't mean something else", which I find unfortunate but acceptable.
Note that elsewhere in perldata it says:
The => operator is mostly just a more visually distinctive synonym for a comma, but it also arranges for its left-hand operand to be interpreted as a string -- if it's a bareword that would be a legal simple identifier (=> doesn't quote compound identifiers, that contain double colons).
which isn't using the "bareword" definition found in that same document (otherwise it would mean that x, for example, won't be turned into a string by => since x isn't a "bareword" according to the perldata documentation).
Update: Actually, even though x is an operator, it can also be a "bareword" that just gets quoted if strict.pm isn't in effect. Perhaps "if", "s", or "q" would've been better examples. I'm not absolutely certain that an unadorned ("bare") instance of "if", "s", or "q" can never turn into a bareword that gets stringified, but I can't think of any counterexamples. For example, try replacing "x" with other things in perl -MO=Deparse -e "0+x" vs in perl -MO=Deparse -e "print x=>0".
perldata also says:
In fact, an identifier within such curlies is forced to be a string, as is any simple identifier within a hash subscript. Neither need quoting.
just to note how the same situation is described for another case.
Finally, my example of x reminds me that the clarification should be expanded to "considered an operator, constant, or function call".
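To make the "operator, constant, or function call" wording concrete, here is a minimal sketch (my own illustration, not taken from perlop) in which every key is a word that Perl would normally treat as an operator or a built-in. It assumes each word sits on the same line as its "=>". I left out "s" and "q" on purpose, since their behaviour is the uncertain part discussed above.

```perl
use strict;
use warnings;

# Each key below is a word that Perl would otherwise parse as an
# operator or a built-in function; "=>" turns each into a plain string.
my %h = (
    x      => 'repetition operator',
    if     => 'statement keyword',
    print  => 'list operator',
    length => 'named unary operator',
);

print join(',', sort keys %h), "\n";   # prints "if,length,print,x"
```

This runs cleanly under strict, which suggests the quoting is decided before the word is ever looked up as a keyword.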
|
|
it also arranges for its left-hand operand to be interpreted as a string
Actually, that's where the problem lies: 08 is not seen as a string but (wrongly) as an octal number. Perl tries to numify this "word" whereas it should be stringified.
CountZero "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
|
|
Hmm. You'll have to try much harder to explain your point. I'm fully aware of the facts you repeat above, but I'm completely lost as to why you felt it an appropriate response to my node. I also disagree with your use of "wrongly" and "should". You also appear to disagree with your own uses of "wrongly" and "should" in other replies in these threads.
The rest of that (complex) sentence that you partially quoted includes:
if it's a bareword that would be a legal simple identifier
And 08 isn't a legal simple identifier (and thus there is no "problem"). Nor is 08 a bareword (by either definition). And I know you know this because elsewhere you noted this (though your inclusion of "hyphen" in your explanation was erroneous).
|
|
|
|
foo::bar => baz
perl is expecting a TERM. Barewords can start terms (infix operators like the x tye mentions cannot - that's why infix operators can consist of letters, but prefix operators cannot), and that's why all barewords look like identifiers: only after tokenizing the bareword and looking at the next token does perl decide whether the bareword is an identifier (subroutine, filehandle) or a string.
|
|
Re: Clarifying the Comma Operator
by graff (Chancellor) on Jun 07, 2009 at 02:22 UTC
In response to your suggestion:
The "=>" operator is a synonym for the comma, but forces any word (beginning with an underscore or alphabetic character and consisting entirely of word characters) to its left to be interpreted as a string (as of 5.001). This includes words that might otherwise be considered a constant or function call but does not include numeric literals.
I think saying it like this would make it easier to understand:
The "=>" operator is a synonym for the comma, but forces any word to its left to be interpreted as a string (as of 5.001); this includes words that might otherwise be considered a constant or function call. Here, a "word" is anything beginning with an underscore, hyphen or alphabetic letter and consisting entirely of word characters (see description of "\w" in perlre). Any "word" that can be interpreted as a number will be converted to a string after evaluating its numeric value (e.g. "1.20" and "-034" to the left of "=>" will yield the strings "1.2" and "-28", respectively -- the latter involves an octal-to-decimal conversion).
UPDATE: Thanks to Porculus for pointing out the error with -034. As for what CountZero said, I think a negative number is consistent with the definition, so the example is relevant and useful. While "1.20" in fact fails to match the definition, I thought it was better to include it as an example anyway, to demonstrate the effect -- as well as the fact that stepping outside the definition in this way does not cause a compile-time error.
(And I just noticed what happens when you do: %h = ( 7-6 => "one", 1+1 => "two" ), which some might consider to be a useful feature, but would need to be used with care...)
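A minimal sketch of that last observation, using the same list as above: "=>" only quotes a lone identifier-like word, so an arithmetic expression to its left is evaluated first and the result becomes the hash key.

```perl
use strict;
use warnings;

# The expressions 7-6 and 1+1 are evaluated; their results (1 and 2)
# are then stringified when they are used as hash keys.
my %h = ( 7-6 => "one", 1+1 => "two" );

print "$_ => $h{$_}\n" for sort keys %h;
# prints:
# 1 => one
# 2 => two
```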
|
|
Your number examples do not fall within the definition of "word" as they do not begin with an underscore, hyphen or alphabetic letter.
CountZero
|
|
Any "word" that can be interpreted as a number will be converted to a string after evaluating its numeric value (e.g. "1.20" and "-034" to the left of "=>" will yield the strings "1.2" and "-34", respectively).
No, it doesn't.
perl -MDevel::Peek -we '@f = (1.20 => "1.2"); Dump $f[0]; Dump $f[1]'
SV = NV(0x88238e8) at 0x87fc230
REFCNT = 1
FLAGS = (NOK,pNOK)
NV = 1.2
SV = PV(0x87f9044) at 0x87fc424
REFCNT = 1
FLAGS = (POK,pPOK)
PV = 0x8806fc8 "1.2"\0
CUR = 3
LEN = 4
As you can see, the 1.20 is not a string (there's no PV value, nor is POK or pPOK set), while "1.2" is a string (and not a number).
|
|
In ( 1.20 => 'foo' ) I'm not saying that the "1.20" is a string. The point is that in order to build a string from a "word" like 1.20 that sits to the left of "=>", perl determines its numeric value (NV), which it then "stringifies" to the simplest possible form ("1.2") -- and this is what the OP was struggling with.
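The two positions can be reconciled with a small sketch (an illustration of mine, assuming the list is assigned to a hash so that the left-hand values actually become keys). The numeric literals are evaluated as numbers, and the stringification only shows up once they are used as keys. A quoted key is left exactly as written.

```perl
use strict;
use warnings;

my %h = (
    1.20   => 'numeric literal',   # key becomes "1.2"
    -034   => 'negated octal',     # 034 is octal for 28, so the key is "-28"
    '1.20' => 'quoted string',     # key stays "1.20"
);

print join(' ', sort keys %h), "\n";   # prints "-28 1.2 1.20"
```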
Re: Clarifying the Comma Operator
by JavaFan (Canon) on Jun 07, 2009 at 00:23 UTC
Actually, I didn't suggest "identifier" should replace "word". I said any that looks like an identifier. Note that perldata in its second paragraph already defines what identifiers look like.
but I worry that identifier might have limitations that would not be appropriate.
Don't. Think about it. When parsing, on encountering a possible 'bare word', the parser doesn't look ahead and say "hmmm, let me see, if the token following this bare word thingy isn't a fat comma, I'm going to parse it as an identifier, because it's got to be a subroutine or a file/dir handle, but if it is followed by a fat comma, I'm going to accept something slightly different". No, the parser is going to parse something that may be a valid identifier. Then it looks whether it's in autoquoting context (next token is a fat comma, or we're indexing in a hash and the next token is a close brace), in which case the thing just parsed is a string; otherwise, it is indeed an identifier (subroutine, file/dir handle).
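Both autoquoting contexts can be seen in a short sketch. The constant FOO is a made-up name for illustration. The unary plus and the empty parentheses are the usual ways to stop the word from being autoquoted, so that it is treated as the constant instead.

```perl
use strict;
use warnings;
use constant FOO => 'something';

my %h = ( FOO => 1 );          # before "=>": key is the string "FOO"
$h{FOO}++;                     # inside a hash subscript: still the string "FOO"
$h{+FOO}   = 'via unary plus'; # not a lone word any more: key is "something"
$h{ FOO() } = 'via call';      # explicit call: same key, "something"

print join(',', sort keys %h), "\n";   # prints "FOO,something"
print "$h{FOO}\n";                     # prints "2"
```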
|
|
My apologies for expressing my idea as yours, and I didn't even fully appreciate your earlier comment. With your clarification, rethinking and rereading perldata, I think I get it.
I am now inclined to replace "word" with "word that looks like an identifier". I think this is very consistent with what perl actually does and provides a reasonably clear and concise description.
As for the bit in parentheses: it would then be redundant with the definition of identifier in perldata. So the question is: what's worse: a redundant statement or making users jump between documents?
Re: Clarifying the Comma Operator
by Bloodnok (Vicar) on Jun 07, 2009 at 00:10 UTC
I'm definitely in favour of your suggested re-wording since the confusion in the post you identify strongly suggests that further clarification is necessary.
Altho', having said that and in a similar vein to you, I'm struggling to find suitable wordage that's both fully descriptive and comprehensible to non-native English speakers.
A user level that continues to overstate my experience :-))
Re: Clarifying the Comma Operator
by duelafn (Parson) on Jun 07, 2009 at 11:48 UTC
Seems like an explicit pattern is easy enough:
The "=>" operator is a synonym for the comma, but forces any word (anything matching /^\w+$/ that is not a numeric literal) to its left to be interpreted as a string (as of 5.001). This includes words that might otherwise be considered a constant or function call.
|
|
$ perl -E'say 08z => 1'
Illegal octal digit '8' at -e line 1, at end of line
syntax error at -e line 1, near "08z"
Execution of -e aborted due to compilation errors.
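The failure above suggests that a bare /^\w+$/ is too loose, because \w also matches a leading digit. A tighter pattern might look like the sketch below. The helper name looks_like_simple_identifier is hypothetical (it is not part of Perl), and the pattern assumes ASCII-only source. It only describes the shape of the word, not everything the tokenizer does.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $word has the shape of a simple identifier,
# i.e. a letter or underscore followed by zero or more word characters.
sub looks_like_simple_identifier {
    my ($word) = @_;
    return $word =~ /\A[A-Za-z_]\w*\z/ ? 1 : 0;
}

for my $word (qw( w0rd _w0rd 08z 0word 08 foo::bar )) {
    printf "%-8s %s\n", $word,
        looks_like_simple_identifier($word) ? 'yes' : 'no';
}
```

Only w0rd and _w0rd pass. The digit-led words and the compound name foo::bar are rejected, which matches the cases "=>" does not quote.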
Re: Clarifying the Comma Operator
by Marshall (Canon) on Jun 06, 2009 at 23:43 UTC
I tried to update my post in previous thread, but there were some problems. I'm not sure presentation is correct and for some reason the server won't let me edit my post yet again. I don't know why. I probably generated something un-parseable for the markup language by pure accident.
Anyway, this wording, "The "=>" operator is a synonym for the comma, but forces any word...", would appear to me to mean that somehow => and "," are different? I don't think so. Maybe "and" is the right conjunction?
|
|
This is an interesting point.
I looked at toke.c in the perl source and found that OPERATOR(',') is generated for both "," and "=>". I jumped to the conclusion that they must be identical. But then I tried the following:
use strict;
use warnings;
use Data::Dumper;
use constant FOO => "something";
my %h = ( FOO => 23, FOO , 24 );
print Dumper(\%h);
__END__
$VAR1 = {
'FOO' => 23,
'something' => 24
};
So they are in fact different, despite the same token being returned from toke.c. Something higher up must look again at what the input text was to decide what to do, but I don't know where this happens.
update: The distinction is in toke.c, not when scanning "," or "=>" but when scanning a word. After scanning a word, toke.c looks ahead and, if it sees "=>", it returns TERM(WORD).
|
|
Yes, this is interesting as "=>" and "," are not interchangeable. The net of this is that this question is one heck of a lot more complex than it sounded at first! If thing X is described as a "delta" from thing Y ("but"), is Y well defined? And is it possible to clearly understand what X will do even given that you understand Y (i.e. what the "but" delta means)? As another thought, if defining the complete behavior is too complex, is there some subset that can be described that would be generally useful?
|
|
|
|
Well, => and , are different. The difference is exactly the but part.
Re: Clarifying the Comma Operator
by Anonymous Monk on Jun 07, 2009 at 00:12 UTC
I think examples would be good
C:\>perl -le " print for w0rd => _w0rd => -w0rd => 07 => 1 "
w0rd
_w0rd
-w0rd
7
1
C:\>perl -le " print for 0word => 1 "
syntax error at -e line 1, near "0word"
Execution of -e aborted due to compilation errors.
C:\>perl -le " print for 08 => 1 "
Illegal octal digit '8' at -e line 1, at end of line
Execution of -e aborted due to compilation errors.
Re: Clarifying the Comma Operator
by shmem (Chancellor) on Jun 07, 2009 at 21:47 UTC
I'd just say: When in doubt, quote.
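As a small sketch of that advice, here are the troublesome keys from elsewhere in this thread written as quoted strings. With quotes, none of them is parsed as a number or an identifier.

```perl
use strict;
use warnings;

my %h = (
    '08'       => 'not parsed as an octal literal',
    '0word'    => 'no syntax error',
    'foo::bar' => 'compound name kept as a plain string',
);

print join(' ', sort keys %h), "\n";   # prints "08 0word foo::bar"
```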