Clarifying the Comma Operator

ig has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Clarifying the Comma Operator (bareword) by tye (Sage) on Jun 07, 2009 at 04:32 UTC
Excellent suggestion. I wouldn't use "identifier", as that suggests other connotations to me and doesn't fit how I'm used to seeing Perl documented. I would change "any word" to "a word". I find "any word" potentially misleading, as it could be interpreted as applying to words that are part of some larger expression to the left of the fat comma.

My first choice was to change "any word" to "a bareword", but the definition given for "bareword" in perldata is actually too narrow for this case. I define "bareword" as a "bare word", an unadorned word that Perl first tries to interpret as an operator or function call and then resorts to quoting if strict.pm doesn't prevent it. perldata defines "bareword" as "a bare word that doesn't mean something else", which I find unfortunate but acceptable.

Note that elsewhere in perldata it says:

"The `=>` operator is mostly just a more visually distinctive synonym for a comma, but it also arranges for its left-hand operand to be interpreted as a string -- if it's a bareword that would be a legal simple identifier (`=>` doesn't quote compound identifiers, that contain double colons)."

which isn't using the "bareword" definition found in that same document (otherwise it would mean that `x`, for example, won't be turned into a string by `=>` since `x` isn't a "bareword" according to the perldata documentation).

Update: Actually, even though `x` is an operator, it can also be a "bareword" that just gets quoted if strict.pm isn't in effect. Perhaps "if", "s", or "q" would've been better examples. I'm not absolutely certain that an unadorned ("bare") instance of "if", "s", or "q" can never turn into a bareword that gets stringified, but I can't think of any counterexamples. For example, try replacing "x" with other things in `perl -MO=Deparse -e "0+x"` vs in `perl -MO=Deparse -e "print x=>0"`.

perldata also says:

"In fact, an identifier within such curlies is forced to be a string, as is any simple identifier within a hash subscript. Neither need quoting."

just to note how the same situation is described for another case.

Finally, my example of `x` reminds me that the clarification should be expanded to "considered an operator, constant, or function call". - tye
Re^2: Clarifying the Comma Operator (bareword) by CountZero (Bishop) on Jun 07, 2009 at 07:10 UTC
"it also arranges for its left-hand operand to be interpreted as a string"

Actually, that's where the problem lies: `08` is not seen as a string but (wrongly) as an octal number. Perl tries to numify this "word" whereas it should be stringified.

CountZero

"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
Re^3: Clarifying the Comma Operator (should) by tye (Sage) on Jun 07, 2009 at 07:47 UTC
Hmm. You'll have to try much harder to explain your point. I'm fully aware of the facts you repeat above, but I'm completely lost as to why you felt it an appropriate response to my node. I also disagree with your use of "wrongly" and "should". You also appear to disagree with your own uses of "wrongly" and "should" in other replies in these threads. The rest of that (complex) sentence that you partially quoted includes: "if it's a bareword that would be a legal simple identifier". And 08 isn't a legal simple identifier (and thus there is no "problem"). Nor is 08 a bareword (by either definition). And I know you know this because elsewhere you noted this (though your inclusion of "hyphen" in your explanation was erroneous). - tye
Re^4: Clarifying the Comma Operator (should) by CountZero (Bishop) on Jun 07, 2009 at 14:42 UTC
Re^3: Clarifying the Comma Operator (bareword) by JavaFan (Canon) on Jun 07, 2009 at 12:28 UTC
"Actually, that's where the problem lies: 08 is not seen as a string but (wrongly) as an octal number. Perl tries to numify this "word" whereas it should be stringified."

But how should perl know? perl is tokenizing when it encounters the leading 0. Considering that it knows it is now expecting a TERM, the leading 0 must mean it's going to encounter an octal number. It doesn't know about the following arrow yet, but it has to decide how to tokenize the next thing. And that's why all "bare words" look like identifiers. When encountering `foo::bar => baz`, perl is expecting a TERM. Barewords can start terms (infix operators like the `x` tye mentions cannot - that's why infix operators can consist of letters, but prefix operators cannot), and that's why all barewords look like identifiers. Because only after tokenizing the bareword and looking at the next token, perl decides whether the bareword is an identifier (subroutine, filehandle), or a string.
Re^4: Clarifying the Comma Operator (bareword) by CountZero (Bishop) on Jun 07, 2009 at 17:22 UTC
Re: Clarifying the Comma Operator by graff (Chancellor) on Jun 07, 2009 at 02:22 UTC
In response to your suggestion:

The "=>" operator is a synonym for the comma, but forces any word (beginning with an underscore or alphabetic character and consisting entirely of word characters) to its left to be interpreted as a string (as of 5.001). This includes words that might otherwise be considered a constant or function call but does not include numeric literals.

I think saying it like this would make it easier to understand:

The "=>" operator is a synonym for the comma, but forces any word to its left to be interpreted as a string (as of 5.001); this includes words that might otherwise be considered a constant or function call. Here, a "word" is anything beginning with an underscore, hyphen or alphabetic letter and consisting entirely of word characters (see description of "\w" in perlre). Any "word" that can be interpreted as a number will be converted to a string after evaluating its numeric value (e.g. "1.20" and "-034" to the left of "=>" will yield the strings "1.2" and ~~"-34"~~ "-28", respectively -- the latter involves an octal-to-decimal conversion).

UPDATE: Thanks to Porculus for pointing out the error with -034. As for what CountZero said, I think a negative number is consistent with the definition, so the example is relevant and useful. While "1.20" in fact fails to match the definition, I thought it was better to include it as an example anyway, to demonstrate the effect -- as well as the fact that stepping outside the definition in this way does not cause a compile-time error. (And I just noticed what happens when you do: `%h = ( 7-6 => "one", 1+1 => "two" )`, which some might consider to be a useful feature, but would need to be used with care...)
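A minimal sketch of the numeric examples in this post, assuming a reasonably recent perl 5. The hash name `%h` and its values are purely illustrative. It suggests that a numeric literal or expression to the left of `=>` is evaluated as a number first and only turns into a string when it is used as a hash key.

```perl
use strict;
use warnings;

# Numeric literals and expressions on the left of "=>" are not autoquoted;
# they are evaluated first, and the result is stringified as a hash key.
my %h = (
    1.20 => 'decimal',   # key becomes "1.2"
    -034 => 'octal',     # 034 is octal 28, so the key becomes "-28"
    7-6  => 'one',       # key becomes "1"
    1+1  => 'two',       # key becomes "2"
);

my @keys = sort keys %h;
print join(', ', map { qq("$_") } @keys), "\n";
```

Quoting the keys (`'1.20'`, `'-034'`) would keep them exactly as written.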
Re^2: Clarifying the Comma Operator by CountZero (Bishop) on Jun 07, 2009 at 07:06 UTC
Your number examples do not fall within the definition of "word" as they do not begin with an underscore, hyphen or alphabetic letter.

CountZero

"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
Re^2: Clarifying the Comma Operator by Porculus (Hermit) on Jun 07, 2009 at 11:14 UTC
"(e.g. "1.20" and "-034" to the left of "=>" will yield the strings "1.2" and "-34", respectively)."

Are you certain about this? Have you tested your expectation? (Looking pointedly at "-034"...)
Re^2: Clarifying the Comma Operator by JavaFan (Canon) on Jun 07, 2009 at 12:13 UTC
"Any "word" that can be interpreted as a number will be converted to a string after evaluating its numeric value (e.g. "1.20" and "-034" to the left of "=>" will yield the strings "1.2" and "-34", respectively)."

No, it doesn't. `perl -MDevel::Peek -we '@f = (1.20 => "1.2"); Dump $f[0]; Dump $f[1]' SV = NV(0x88238e8) at 0x87fc230 REFCNT = 1 FLAGS = (NOK,pNOK) NV = 1.2 SV = PV(0x87f9044) at 0x87fc424 REFCNT = 1 FLAGS = (POK,pPOK) PV = 0x8806fc8 "1.2"\0 CUR = 3 LEN = 4` As you can see, the 1.20 is not a string (there's no PV value, nor is POK or pPOK set), while "1.2" is a string (and not a number).
Re^3: Clarifying the Comma Operator by graff (Chancellor) on Jun 07, 2009 at 19:09 UTC
In `( 1.20 => 'foo' )` I'm not saying that the "1.20" is a string. The point is that in order to build a string from a "word" like 1.20 that sits to the left of "=>", perl determines its numeric value (NV), which it then "stringifies" to the simplest possible form ("1.2") -- and this is what the OP was struggling with.
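The distinction argued over in this sub-thread can also be checked without Devel::Peek. The following is only a sketch, assuming the core `B` module and its `SVf_NOK`/`SVf_POK` flag constants; the variable names are made up for illustration. It inspects the internal flags of both list elements without stringifying them first.

```perl
use strict;
use warnings;
use B ();

# In a plain list, "=>" acts as a comma: the number on its left stays a number.
my @f = (1.20 => "1.2");

# Read the internal flags of each element without stringifying it first.
my $left_flags  = B::svref_2object(\$f[0])->FLAGS;
my $right_flags = B::svref_2object(\$f[1])->FLAGS;

# Left element: numeric flag set, string flag not set.
my $left_is_number =
    ($left_flags & B::SVf_NOK()) && !($left_flags & B::SVf_POK()) ? 1 : 0;
# Right element: string flag set.
my $right_is_string = ($right_flags & B::SVf_POK()) ? 1 : 0;

print "left operand is a number, not a string: $left_is_number\n";
print "right operand is a string: $right_is_string\n";
```

The stringification to "1.2" only happens later, for instance when the value is used as a hash key or printed.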
Re: Clarifying the Comma Operator by JavaFan (Canon) on Jun 07, 2009 at 00:23 UTC
Actually, I didn't suggest "identifier" should replace "word". I said anything that looks like an identifier. Note that perldata in its second paragraph already defines what identifiers look like.

"but I worry that identifier might have limitations that would not be appropriate."

Don't. Think about it. When parsing, on encountering a possible 'bare word', the parser doesn't look ahead and say "hmmm, let me see, if the token following this bare word thingy isn't a fat comma, I'm going to parse it as an identifier, because it's got to be a subroutine or a file/dir handle, but if it is followed by a fat comma, I'm going to accept something slightly different". No, the parser is going to parse something that may be a valid identifier. Then it looks whether it's in autoquoting context (next token is a fat comma, or we're indexing in a hash and the next token is a close brace), in which case the thing just parsed is a string, otherwise, it is indeed an identifier (subroutine, file/dir handle).
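The two autoquoting contexts named in this post (before a fat comma, and alone inside a hash subscript) can be sketched as follows. The constant `FOO` and the variable names are assumptions chosen for illustration, not anything taken from the perl documentation.

```perl
use strict;
use warnings;
use constant FOO => 'something';

# Autoquoting contexts: FOO is taken as the string 'FOO'.
my %h = (FOO => 1);           # left of a fat comma
my $by_subscript = $h{FOO};   # simple identifier inside a hash subscript

# Ordinary contexts: FOO is the constant, i.e. 'something'.
my @pair = (FOO, 2);          # left of a plain comma
$h{+FOO} = 3;                 # unary plus defeats subscript autoquoting

print join(', ', sort keys %h), "\n";
```

In both autoquoting cases the same word is parsed first, and only the token that follows decides whether it ends up as a string or as a call to the constant.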
Re^2: Clarifying the Comma Operator by ig (Vicar) on Jun 07, 2009 at 01:41 UTC
My apologies for expressing my idea as yours, and I didn't even fully appreciate your earlier comment. With your clarification, rethinking and rereading perldata, I think I get it. I am now inclined to replace "word" with "word that looks like an identifier". I think this is very consistent with what perl actually does and provides a reasonably clear and concise description. As for the bit in parentheses: it would then be redundant with the definition of identifier in perldata. So the question is: what's worse, a redundant statement or making users jump between documents?
Re: Clarifying the Comma Operator by Bloodnok (Vicar) on Jun 07, 2009 at 00:10 UTC
I'm definitely in favour of your suggested re-wording since the confusion in the post you identify strongly suggests that further clarification is necessary. Altho', having said that and in a similar vein to you, I'm struggling to find suitable wordage that's both fully descriptive and comprehensible to non-native English speakers. A user level that continues to overstate my experience :-))
Re: Clarifying the Comma Operator by duelafn (Parson) on Jun 07, 2009 at 11:48 UTC
Seems like an explicit pattern is easy enough: The "=>" operator is a synonym for the comma, but forces any word (anything matching /^\w+$/ that is not a numeric literal) to its left to be interpreted as a string (as of 5.001). This includes words that might otherwise be considered a constant or function call. Good Day, Dean
Re^2: Clarifying the Comma Operator by zwon (Abbot) on Jun 07, 2009 at 22:32 UTC
`08z` isn't a numeric literal, but: `$ perl -E'say 08z => 1' Illegal octal digit '8' at -e line 1, at end of line syntax error at -e line 1, near "08z" Execution of -e aborted due to compilation errors.`
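To make the gap between the two descriptions concrete, here is a rough sketch comparing the pattern `/^\w+$/` proposed above with an identifier-shaped pattern. The helper `looks_like_simple_identifier` is hypothetical, and it assumes ASCII-only identifiers (no `use utf8`). It only tests the shape of a string and does not ask perl how it would actually tokenize it.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $word has the shape of a simple identifier,
# i.e. starts with a letter or underscore and continues with word characters.
sub looks_like_simple_identifier {
    my ($word) = @_;
    return $word =~ /\A[A-Za-z_]\w*\z/ ? 1 : 0;
}

for my $word (qw(w0rd _w0rd 08z 08 0word foo::bar)) {
    my $loose  = $word =~ /^\w+$/ ? 1 : 0;               # the looser pattern
    my $strict = looks_like_simple_identifier($word);    # identifier shape
    printf "%-8s  /^\\w+\$/: %d  identifier shape: %d\n", $word, $loose, $strict;
}
```

Words such as `08z` and `0word` match the looser pattern but not the identifier shape, which is consistent with the compile errors reported in this thread.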
Re: Clarifying the Comma Operator by Marshall (Canon) on Jun 06, 2009 at 23:43 UTC
I tried to update my post in the previous thread, but there were some problems. I'm not sure the presentation is correct and for some reason the server won't let me edit my post yet again. I don't know why. I probably generated something un-parseable for the markup language by pure accident. Anyway, this wording, "The "=>" operator is a synonym for the comma, *but* forces any word...", would appear to me to mean that somehow "=>" and "," are different? I don't think so. Maybe "and" is the right conjunction?
Re^2: Clarifying the Comma Operator by ig (Vicar) on Jun 07, 2009 at 00:12 UTC
This is an interesting point. I looked at toke.c in the perl source and found that OPERATOR(',') is generated for both "," and "=>". I jumped to the conclusion that they must be identical. But then I tried the following: `use strict; use warnings; use Data::Dumper; use constant FOO => "something"; my %h = ( FOO => 23, FOO , 24 ); print Dumper(\%h); __END__ $VAR1 = { 'FOO' => 23, 'something' => 24 };` So they are in fact different, despite the same token being returned from toke.c. Something higher up must look again at what the input text was to decide what to do, but I don't know where this happens. update: The distinction is in toke.c, not when scanning "," or "=>" but when scanning a word. After scanning a word, toke.c looks ahead and, if it sees "=>" it returns TERM(WORD).
Re^3: Clarifying the Comma Operator by Marshall (Canon) on Jun 07, 2009 at 00:45 UTC
Yes, this is interesting as "=>" and "," are not interchangeable. The net of this is that this question is one heck of a lot more complex than it sounded at first! If thing X is described as a "delta" from thing Y ("but"), is Y well defined? And is it possible to clearly understand what X will do even given that you understand Y (i.e. what the "but" delta means)? As another thought, if defining the complete behavior is too complex, is there some subset that can be described that would be generally useful?
Re^3: Clarifying the Comma Operator by lakshmananindia (Chaplain) on Jun 08, 2009 at 05:23 UTC
Sorry for interrupting. May I know what is meant by TERM? --Lakshmanan G. The great pleasure in my life is doing what people say you cannot do.
Re^4: Clarifying the Comma Operator by ig (Vicar) on Jun 08, 2009 at 05:41 UTC
Re^2: Clarifying the Comma Operator by JavaFan (Canon) on Jun 07, 2009 at 00:08 UTC
Well, `=>` and `,` are different. The difference is exactly the "but" part.
Re: Clarifying the Comma Operator by Anonymous Monk on Jun 07, 2009 at 00:12 UTC
I think examples would be good `C:\>perl -le " print for w0rd => _w0rd => -w0rd => 07 => 1 " w0rd _w0rd -w0rd 7 1 C:\>perl -le " print for 0word => 1 " syntax error at -e line 1, near "0word" Execution of -e aborted due to compilation errors. C:\>perl -le " print for 08 => 1 " Illegal octal digit '8' at -e line 1, at end of line Execution of -e aborted due to compilation errors.`
Re: Clarifying the Comma Operator by shmem (Chancellor) on Jun 07, 2009 at 21:47 UTC
I'd just say: When in doubt, quote.
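As a small illustration of that advice, the sketch below quotes the keys that caused trouble elsewhere in this thread. The hash contents are invented for the example, and the unquoted `08` is wrapped in a string `eval` only so that the compile-time failure can be captured instead of aborting the script.

```perl
use strict;
use warnings;

# Quoted keys are taken verbatim, whatever they look like.
my %h = (
    '08'       => 'leading zero kept',
    '1.20'     => 'trailing zero kept',
    'foo::bar' => 'compound name kept',
);

# By contrast, the unquoted form does not even compile.
my $ok    = eval q{ my @l = (08 => 1); 1 };
my $error = $ok ? '' : $@;

print join(', ', sort keys %h), "\n";
print "unquoted 08: $error";
```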