in reply to Re^3: uparse - Parse Unicode strings
in thread uparse - Parse Unicode strings
The problem is that the final glyphs are hard-coded. Although it might look like you're providing instructions to dynamically generate the glyphs, you're really only indicating which hard-coded glyphs to use.
Each of the following two sequences can only be selected as a single entity; however, MAN-GIRL-BOY is hard-coded but MAN-BOY-GIRL is not.
👨👧👦
    $ uparse 👨👧👦
    ============================================================
    String: '👨👧👦'
    ============================================================
    👨  U+1F468  MAN
        U+200D   ZERO WIDTH JOINER
    👧  U+1F467  GIRL
        U+200D   ZERO WIDTH JOINER
    👦  U+1F466  BOY
    ------------------------------------------------------------
👨👦👧
    $ uparse 👨👦👧
    ============================================================
    String: '👨👦👧'
    ============================================================
    👨  U+1F468  MAN
        U+200D   ZERO WIDTH JOINER
    👦  U+1F466  BOY
        U+200D   ZERO WIDTH JOINER
    👧  U+1F467  GIRL
    ------------------------------------------------------------
In PowerShell:
    PS C:\Users\ken> $joiner = [char]::ConvertFromUtf32(0x200D)
    PS C:\Users\ken> $man = [char]::ConvertFromUtf32(0x1F468)
    PS C:\Users\ken> $girl = [char]::ConvertFromUtf32(0x1F467)
    PS C:\Users\ken> $boy = [char]::ConvertFromUtf32(0x1F466)
    PS C:\Users\ken> "$man$joiner$girl$joiner$boy"
    👨👧👦
    PS C:\Users\ken> "$man$joiner$boy$joiner$girl"
    👨👦👧
Notice how the BOY remains on the lower right whether or not the GIRL is included. MAN-BOY is a preset glyph; adding GIRL after it just adds another glyph (even though the ZWJ combines the two glyphs into a single selectable unit).
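For anyone who wants to reproduce this outside PowerShell, here is a minimal Python sketch of the same construction. It only shows what is in the strings: both orderings contain the same five code points, so any difference in how they display comes from the font and renderer, not from the data. The helper names `zwj_join` and `describe` are mine (hypothetical, not part of uparse), and the output format only loosely imitates uparse.

```python
import unicodedata

ZWJ = "\u200d"        # ZERO WIDTH JOINER
MAN = "\U0001F468"
GIRL = "\U0001F467"
BOY = "\U0001F466"


def zwj_join(*parts):
    """Hypothetical helper: join code points with ZERO WIDTH JOINER."""
    return ZWJ.join(parts)


def describe(text):
    """Hypothetical helper: one 'U+XXXX NAME' entry per code point."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


man_girl_boy = zwj_join(MAN, GIRL, BOY)
man_boy_girl = zwj_join(MAN, BOY, GIRL)

for label, seq in (("MAN-GIRL-BOY", man_girl_boy),
                   ("MAN-BOY-GIRL", man_boy_girl)):
    print(f"{label}: {len(seq)} code points")
    for line in describe(seq):
        print("    " + line)
```

Both strings report five code points and differ only in the order of GIRL and BOY. Whether either one shows up as a single family glyph depends on whether the font in use ships a glyph for that exact sequence, which is the hard-coding described above.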
Hopefully, at some future point, sequences of glyphs, joiners, modifiers, and so on, will act as an instruction to dynamically generate a new glyph.
— Ken