
'use utf8' only tells perl to assume the source is UTF-8 encoded when it converts input bytes to character values. It does not change perl's syntax, so it does not change how literals are parsed, nor does it change which values are accepted as strings without quoting.

So basically what you are asking for is that perl treat a specific character value (0x221e) as a numerical infinity (in all cases, or only in literals, or only in non-quoted literals, or only in non-quoted literals of one character in length?).

Note that the above has nothing to do with 'utf8', and would obviously break any code anytime a character has a value of 0x221e.

I guess what I'm saying is: 'Inf' is a string that can be treated as a special case in numerical context. An alternative using other Unicode characters would still need to be a string of more than one character; it can't be just a single character, because every individual character is already mapped to a numeric value (its code point).

Why does it make more sense for the infinity Unicode symbol to be treated as numerical infinity instead of... its Unicode value? And if we do, where do we stop? How many other symbols should be treated as special values instead of their Unicode values? Should we treat 0x03C0 as 3.14159... ? At what point is your request really just 'perl should accept Unicode symbols as syntax'?

(these are genuine questions - I find this very interesting, hopefully this is not coming across wrong :-)

Re^4: Unicode infinity
by NERDVANA (Priest) on Jul 01, 2024 at 23:53 UTC
    π can't be used as a number because it already parses as an identifier (it is a Greek letter), and you can already use it as a sub name.

    ∞ does not, however, and simply emits a syntax error. You can use it as the delimiter for quote constructs, but that only applies after the beginning token of a quote construct. I'm very narrowly talking about changing perl's failure mode when it encounters this character to do something useful, since infinity is really a character that unambiguously indicates a mathematical value that also has an unambiguous floating-point encoding.

    I'd be happy to see many more codepoints given language functionality, if they are unambiguous symbols that imply unambiguous scalar values. I can't name any others offhand. (Mathematicians really ought to stop stealing Greek letters for things. I mean, it's not like they were using mechanical tools with a limited range of symbols when they started this stuff... they could draw anything they wanted. And Greek is still an active language, not a dead one like Latin...)

      > π can't be used as a number because

      perl -Mutf8 -le 'use constant π => 3.141;print sqrt π'
      1.77228665852903
      
        Right, but that's using it as an identifier that returns a constant. In the parsing phase, it goes through "we have a word, what does it mean?" We can't define that as part of the language without breaking existing code, or going through a version guard like with the new keywords in the builtin:: namespace. The infinity character is not a word character and can't be used as the name of a sub or variable or anything (other than as the delimiter of a qq∞...∞, but my proposal doesn't break that). It also isn't stealing a useful character from someone's language.