"Maybe at some future point we can add the white hair to this family setting ... maybe you can already do this with your Win11 Segoe UI Emoji font. Can you?"

You read me like a book; that's exactly what I was trying to do! :) ... and I was bitterly disappointed when it didn't work.

For completeness, I ran a simple standalone test using Windows 11 PowerShell.

PS C:\> $joiner = [char]::ConvertFromUtf32(0x200D)
PS C:\> $man = [char]::ConvertFromUtf32(0x1F468)
PS C:\> $girl = [char]::ConvertFromUtf32(0x1F467)
PS C:\> $boy = [char]::ConvertFromUtf32(0x1F466)
PS C:\> $whitehair = [char]::ConvertFromUtf32(0x1F9B3)

PS C:\> "$man$joiner$girl$joiner$boy"
👨‍👧‍👦

PS C:\> "$man$joiner$whitehair$joiner$girl$joiner$boy"
👨‍🦳‍👧‍👦

Running an equivalent test in bash on Ubuntu with echo -e produced the same depressing result. It seems you can enjoy a family emoji with a default man, but not one with a white-haired man. Maybe a Unicode emoji expert knows how to do it, but I don't.
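
For anyone who wants to repeat the experiment outside PowerShell, here is a minimal Python sketch that builds the same two sequences from the code points used above. The helper names zwj_join and code_points are my own, purely for illustration. It only shows which code points are sent; what gets drawn is still up to the font.

```python
ZWJ = "\u200D"  # ZERO WIDTH JOINER

# Code points taken from the PowerShell session above.
MAN, GIRL, BOY, WHITE_HAIR = (chr(cp) for cp in (0x1F468, 0x1F467, 0x1F466, 0x1F9B3))


def zwj_join(*parts):
    """Join the given characters with a ZERO WIDTH JOINER between each pair."""
    return ZWJ.join(parts)


def code_points(text):
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


family = zwj_join(MAN, GIRL, BOY)
white_haired_family = zwj_join(MAN, WHITE_HAIR, GIRL, BOY)

# Print the code points only, so the output is plain ASCII.
for sequence in (family, white_haired_family):
    print(" ".join(code_points(sequence)))
```

Calling print(family) or print(white_haired_family) instead shows how your own terminal and font render each sequence.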

👁️🍾👍🦟

Re^4: uparse - Parse Unicode strings
by kcott (Archbishop) on Nov 19, 2023 at 11:05 UTC

    The problem is that the final glyphs are hard-coded. Although it might look like you're providing instructions to dynamically generate the glyphs, you're really only indicating which hard-coded glyphs to use.

    Each of the following two sequences can only be selected as a single entity; however, MAN-GIRL-BOY is a hard-coded glyph but MAN-BOY-GIRL is not.

    👨‍👧‍👦

    $ uparse 👨‍👧‍👦
    
    ============================================================
    String: '👨‍👧‍👦'
    ============================================================
    👨      U+1F468  MAN
            U+200D   ZERO WIDTH JOINER
    👧      U+1F467  GIRL
            U+200D   ZERO WIDTH JOINER
    👦      U+1F466  BOY
    ------------------------------------------------------------
    

    👨‍👦‍👧

    $ uparse 👨‍👦‍👧
    
    ============================================================
    String: '👨‍👦‍👧'
    ============================================================
    👨      U+1F468  MAN
            U+200D   ZERO WIDTH JOINER
    👦      U+1F466  BOY
            U+200D   ZERO WIDTH JOINER
    👧      U+1F467  GIRL
    ------------------------------------------------------------
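
A listing like the two above can be approximated with Python's standard unicodedata module. This is only a rough sketch of the idea, not uparse's actual implementation (its source is not shown here), and the function name describe is made up for the example.

```python
import unicodedata


def describe(text):
    """Return (code point, name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


# MAN, ZWJ, GIRL, ZWJ, BOY, written as escapes so the source stays ASCII.
family = "\U0001F468\u200D\U0001F467\u200D\U0001F466"

for code_point, name in describe(family):
    print(f"{code_point:<8} {name}")
```

Swapping the last two emoji in the string gives the MAN-BOY-GIRL listing; the code points are equally valid either way, which is why the difference only shows up at rendering time.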
    

    In PowerShell:

    PS C:\Users\ken> $joiner = [char]::ConvertFromUtf32(0x200D)
    PS C:\Users\ken> $man = [char]::ConvertFromUtf32(0x1F468)
    PS C:\Users\ken> $girl = [char]::ConvertFromUtf32(0x1F467)
    PS C:\Users\ken> $boy = [char]::ConvertFromUtf32(0x1F466)
    PS C:\Users\ken> "$man$joiner$girl$joiner$boy"
    👨‍👧‍👦
    PS C:\Users\ken> "$man$joiner$boy$joiner$girl"
    👨‍👦‍👧
    

    Notice how the BOY remains on the lower-right regardless of whether the GIRL is included or not. The MAN-BOY is a preset glyph; adding GIRL after that just adds another glyph (even if the ZWJ combines these two glyphs into a singly-selectable unit).
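
The "preset glyph" idea can be pictured with a toy model in Python. Everything here is an assumption made for illustration: PRESET_GLYPHS is a hypothetical three-entry table, and glyphs_for is an invented greedy longest-match lookup. Real fonts and text shapers have far larger tables, differ from one another, and need not work this way.

```python
ZWJ = "\u200D"
MAN, GIRL, BOY = "\U0001F468", "\U0001F467", "\U0001F466"

# Hypothetical stand-in for a font's table of pre-drawn ZWJ sequences.
# A real font's table is much larger and differs from font to font.
PRESET_GLYPHS = {
    (MAN, GIRL, BOY): "MAN-GIRL-BOY",
    (MAN, BOY): "MAN-BOY",
    (MAN, GIRL): "MAN-GIRL",
}


def glyphs_for(sequence):
    """Greedily map a ZWJ sequence to the longest preset glyphs available."""
    parts = sequence.split(ZWJ)
    glyphs = []
    i = 0
    while i < len(parts):
        # Try the longest run first, shrinking until a preset matches.
        for end in range(len(parts), i, -1):
            name = PRESET_GLYPHS.get(tuple(parts[i:end]))
            if name is not None:
                glyphs.append(name)
                i = end
                break
        else:
            # No preset covers this element: fall back to its own glyph,
            # identified here by its code point(s).
            glyphs.append("+".join(f"U+{ord(ch):04X}" for ch in parts[i]))
            i += 1
    return glyphs


print(glyphs_for(ZWJ.join([MAN, GIRL, BOY])))
print(glyphs_for(ZWJ.join([MAN, BOY, GIRL])))
```

In this model MAN-GIRL-BOY resolves to one preset, while MAN-BOY-GIRL resolves to the MAN-BOY preset followed by a separate GIRL glyph.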

    Hopefully, at some future point, sequences of glyphs, joiners, modifiers, and so on, will act as an instruction to dynamically generate a new glyph.

    — Ken

      "Notice how the BOY remains on the lower-right regardless of whether the GIRL is included or not"

      That's not what I see!

      Whichever of girl or boy appears first is rendered on the lower-left as I look at the screen.

      PS C:\> "$man$joiner$girl$joiner$boy"
      👨‍👧‍👦
      PS C:\> "$man$joiner$boy$joiner$girl"
      👨‍👦‍👧
      

      In the first one above, the boy is on the lower-right; in the second one the boy is on the lower-left.

      If only one of boy or girl is included, it appears on the lower-right as I look at the screen.

      PS C:\> "$man$joiner$boy"
      👨‍👦
      PS C:\> "$man$joiner$girl"
      👨‍👧
      

      Maybe it depends on the font?

      👁️🍾👍🦟
        "Maybe it depends on the font?"

        Probably the same font issue we've been encountering for some days.

        I see the same as you describe with a single child:

        PS C:\Users\ken> "$man$joiner$girl"
        👨‍👧
        PS C:\Users\ken> "$man$joiner$boy"
        👨‍👦
        

        — Ken