in reply to Re^3: uparse - Parse Unicode strings
in thread uparse - Parse Unicode strings

The problem is that the final glyphs are hard-coded. Although it might look like you're providing instructions to dynamically generate the glyphs, you're really only indicating which hard-coded glyphs to use.

Each of the following two glyphs can only be selected as a single entity; however, MAN-GIRL-BOY is hard-coded but MAN-BOY-GIRL is not.

👨‍👧‍👦

$ uparse 👨‍👧‍👦

============================================================
String: '👨‍👧‍👦'
============================================================
👨      U+1F468  MAN
        U+200D   ZERO WIDTH JOINER
👧      U+1F467  GIRL
        U+200D   ZERO WIDTH JOINER
👦      U+1F466  BOY
------------------------------------------------------------

👨‍👦‍👧

$ uparse 👨‍👦‍👧

============================================================
String: '👨‍👦‍👧'
============================================================
👨      U+1F468  MAN
        U+200D   ZERO WIDTH JOINER
👦      U+1F466  BOY
        U+200D   ZERO WIDTH JOINER
👧      U+1F467  GIRL
------------------------------------------------------------

In PowerShell:

PS C:\Users\ken> $joiner = [char]::ConvertFromUtf32(0x200D)
PS C:\Users\ken> $man = [char]::ConvertFromUtf32(0x1F468)
PS C:\Users\ken> $girl = [char]::ConvertFromUtf32(0x1F467)
PS C:\Users\ken> $boy = [char]::ConvertFromUtf32(0x1F466)
PS C:\Users\ken> "$man$joiner$girl$joiner$boy"
👨‍👧‍👦
PS C:\Users\ken> "$man$joiner$boy$joiner$girl"
👨‍👦‍👧

Notice how the BOY remains on the lower-right regardless of whether the GIRL is included or not. MAN-BOY is a preset glyph; adding GIRL after it just adds another glyph (even though the ZWJ combines the two glyphs into a single selectable unit).
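To check that the two strings above really are the same five code points in a different order, which would suggest that any difference in rendering comes from the font rather than from the text, here is a minimal sketch in Python (not the PowerShell used above). The helper name `describe` is hypothetical and is not part of uparse; the output only loosely imitates uparse's listing.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER


def describe(text):
    """Return a (code point, name) pair for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


man, girl, boy = "\U0001F468", "\U0001F467", "\U0001F466"

# The same three emoji joined by ZWJ, with the children in either order.
girl_first = ZWJ.join([man, girl, boy])
boy_first = ZWJ.join([man, boy, girl])

for label, seq in (("MAN-GIRL-BOY", girl_first), ("MAN-BOY-GIRL", boy_first)):
    print(label)
    for code_point, name in describe(seq):
        print(f"  {code_point:<8} {name}")

# Same set of code points, different order, so the strings are not equal.
print("same code points:", sorted(girl_first) == sorted(boy_first))
print("same string:     ", girl_first == boy_first)
```

Both listings contain MAN, GIRL, BOY and two ZERO WIDTH JOINERs; only the order of the children differs. How each sequence is drawn is then up to whatever font and renderer display it.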

Hopefully, at some future point, sequences of glyphs, joiners, modifiers, and so on, will act as an instruction to dynamically generate a new glyph.

— Ken

Re^5: uparse - Parse Unicode strings
by eyepopslikeamosquito (Archbishop) on Nov 19, 2023 at 12:11 UTC

    Notice how the BOY remains on the lower-right regardless of whether the GIRL is included or not

    That's not what I see!

    Whichever of girl or boy appears first is rendered on the lower-left as I look at the screen.

    PS C:\> "$man$joiner$girl$joiner$boy"
    👨‍👧‍👦
    PS C:\> "$man$joiner$boy$joiner$girl"
    👨‍👦‍👧
    

    In the first one above, the boy is on the lower-right; in the second one the boy is on the lower-left.

    If only one of boy or girl is included, it appears on the lower-right as I look at the screen.

    PS C:\> "$man$joiner$boy"
    👨‍👦
    PS C:\> "$man$joiner$girl"
    👨‍👧
    

    Maybe it depends on the font?

    👁️🍾👍🦟
      "Maybe it depends on the font?"

      Probably the same font issue we've been encountering for some days.

      I see the same as you describe with a single child:

      PS C:\Users\ken> "$man$joiner$girl"
      👨‍👧
      PS C:\Users\ken> "$man$joiner$boy"
      👨‍👦
      

      — Ken