in reply to Unicode infinity

If a string contains only letters, digits, and underscores without starting with a digit, you can omit quotes.

The string 'Inf' complies with that, so you can write my $x= Inf; instead of my $x= "Inf";

A string containing only a character with value of 0x221e does not meet the requirement, so you need to quote it.
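A minimal sketch of the difference, assuming strict is not in effect (the variable names are only illustrative, and the file is assumed to be saved as UTF-8):

```perl
use warnings;
use utf8;                  # lets the source contain the literal ∞ character
no strict 'subs';          # unquoted strings are only accepted without strict subs

my $bare   = Inf;          # bareword: compiled as the string "Inf"
my $quoted = "Inf";        # ordinary quoted string

print "same string\n" if $bare eq $quoted;

# In numeric context the string "Inf" converts to floating-point infinity.
my $huge = 9**9**9;        # overflows to infinity
print "numeric infinity\n" if $bare == $huge;

# U+221E is not a letter, digit, or underscore, so it has to be quoted,
# and once quoted it is just a one-character string.
my $symbol = "∞";
printf "U+%04X, length %d\n", ord($symbol), length($symbol);
```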

There is a near-zero chance of this being 'fixed', because perl code is not UTF-8 and its base rules for unquoted strings are unlikely ever to include any character outside the printable ASCII range of 0x20 to 0x7E.

Re^2: Unicode infinity
by NERDVANA (Priest) on Jul 01, 2024 at 04:55 UTC
    If you declare 'use utf8' then yes the perl script is UTF-8.

    I don't want it to get parsed as a string, I want ∞ to be an official part of the numeric tokenizer that returns an NV float to perl internals. When printed, it would render as "Inf" because it's the actual floating-point infinity value.

    (but actually you pointed out a misconception I had. I was thinking the token Inf was parsed as a float by the parser. It was because I didn't bother to enable strict and warnings)

      'use utf8' just assumes utf8 encoding to convert input bytes to character values. It does not change perl's syntax, so it does not change the parsing of literals, nor does it change which values are accepted as strings without quoting.

      So basically what you are asking for is that perl treat a specific character value (0x221e) as a numerical infinity (in all cases, or only in literals, or only in non-quoted literals, or only in non-quoted literals of one character in length?).

      Note that the above has nothing to do with 'utf8', and would obviously break any code anytime a character has a value of 0x221e.

      I guess what I'm saying is: 'Inf' is a string that can be treated as a special case in numerical context. An alternative using other unicode characters would still need to be a string of more than one character; it can't be just one character, because every individual character is already mapped to a numerical value.

      Why does it make more sense for the inf unicode symbol to be treated as numerical infinity instead of... its unicode value? And if we do, where do we stop? How many other symbols should be treated as special values instead of their unicode values? Should we treat 0x03C0 as 3.14159... ? At what point is your request really just 'perl should accept unicode symbologies as syntax'?

      (these are genuine questions - I find this very interesting, hopefully this is not coming across wrong :-)

        π can't be used as a number because it already parses as an identifier (greek letter) and you can already use it as a sub name.

        ∞ does not, however, and simply emits a syntax error. You can use it as the delimiter for quote constructs, but that only applies after the beginning token of a quote construct. I'm very narrowly talking about changing perl's failure mode when it encounters this character to do something useful, since infinity is really a character that unambiguously indicates a mathematical value that also has an unambiguous floating-point encoding.
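        A rough sketch of that contrast, assuming the file is saved as UTF-8. The string eval is only there so the failed compile can be trapped instead of aborting the script:

        ```perl
        use strict;
        use warnings;
        use utf8;                        # source text is UTF-8

        # π (U+03C0) is a letter, so it is a legal identifier under "use utf8".
        sub π { 4 * atan2(1, 1) }
        printf "%.5f\n", π();            # prints 3.14159

        # ∞ (U+221E) is a math symbol, not a word character, so it cannot
        # begin a term; the eval fails to compile and returns undef.
        my $compiled = eval q{ my $x = ∞; 1 };
        print "U+221E did not compile\n" unless $compiled;
        ```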

        I'd be happy to see many more codepoints given language functionality, if they are unambiguous symbols that imply unambiguous scalar values. I can't name any others offhand. (Mathematicians really ought to stop stealing Greek letters for things. I mean, it's not like they were using mechanical tools with a limited range of symbols when they started this stuff... they could draw anything they wanted. And Greek is still an active language, not a dead one like Latin...)

Re^2: Unicode infinity
by haj (Vicar) on Jul 01, 2024 at 06:51 UTC
    If a string contains only letters, digits, and underscores without starting with a digit, you can omit quotes.
    Not if you use strict; (as one always should) or any version declaration use 5.012 or newer which implies strict.
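    A small sketch of what that looks like in practice. The string eval inherits the enclosing strict, so the compile error can be caught and shown; the variable names are only illustrative:

    ```perl
    use strict;
    use warnings;

    # Under strict subs the unquoted Inf is a compile-time error.
    my $ok = eval q{ my $x = Inf; 1 };
    my $error = $@;
    print "rejected: $error" unless $ok;

    # The quoted form is fine under strict and still numifies to infinity.
    my $inf = "Inf" + 0;
    print "is infinite\n" if $inf == 9**9**9;
    ```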