The problem is that the final glyphs are hard-coded. Although it might look like you're providing instructions to dynamically generate the glyphs, you're really only indicating which hard-coded glyphs to use.

Each of the following two sequences can only be selected as a single entity; however, MAN-GIRL-BOY is hard-coded, whereas MAN-BOY-GIRL is not.

👨‍👧‍👦

$ uparse 👨‍👧‍👦

============================================================
String: '👨‍👧‍👦'
============================================================
👨      U+1F468  MAN
        U+200D   ZERO WIDTH JOINER
👧      U+1F467  GIRL
        U+200D   ZERO WIDTH JOINER
👦      U+1F466  BOY
------------------------------------------------------------

👨‍👦‍👧

$ uparse 👨‍👦‍👧

============================================================
String: '👨‍👦‍👧'
============================================================
👨      U+1F468  MAN
        U+200D   ZERO WIDTH JOINER
👦      U+1F466  BOY
        U+200D   ZERO WIDTH JOINER
👧      U+1F467  GIRL
------------------------------------------------------------
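For anyone without uparse to hand, the same breakdown can be approximated with a few lines of Python. This is only a minimal sketch, not how uparse itself works; the helper name `describe` and the variable names are made up for illustration. It lists each code point with its Unicode name, which shows that the two strings hold the same five code points and differ only in the order of GIRL and BOY.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER

def describe(text):
    """Return a (code point label, Unicode name) pair for each code point."""
    rows = []
    for ch in text:
        # Fall back to a placeholder if this Python build has no name for ch.
        rows.append((f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")))
    return rows

# The two sequences from the post, built from their code points.
man_girl_boy = "\U0001F468" + ZWJ + "\U0001F467" + ZWJ + "\U0001F466"
man_boy_girl = "\U0001F468" + ZWJ + "\U0001F466" + ZWJ + "\U0001F467"

for sequence in (man_girl_boy, man_boy_girl):
    for label, name in describe(sequence):
        print(f"{label:<8} {name}")
    print("-" * 30)
```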

In PowerShell:

PS C:\Users\ken> $joiner = [char]::ConvertFromUtf32(0x200D)
PS C:\Users\ken> $man = [char]::ConvertFromUtf32(0x1F468)
PS C:\Users\ken> $girl = [char]::ConvertFromUtf32(0x1F467)
PS C:\Users\ken> $boy = [char]::ConvertFromUtf32(0x1F466)
PS C:\Users\ken> "$man$joiner$girl$joiner$boy"
👨‍👧‍👦
PS C:\Users\ken> "$man$joiner$boy$joiner$girl"
👨‍👦‍👧

Notice how the BOY stays at the lower right whether or not the GIRL is included. MAN-BOY is a preset glyph; adding GIRL after it just appends another glyph (even though the ZWJ combines the two glyphs into a single selectable unit).
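The fallback described above can be modelled roughly as a greedy longest-match lookup. The Python sketch below is an assumption-laden toy, not the logic of any real font or rendering engine: `PRESET` is a hypothetical stand-in for a font's table of prebuilt glyphs, holding only the two presets mentioned in this post, and `render_units` is an invented name.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER
MAN, GIRL, BOY = "\U0001F468", "\U0001F467", "\U0001F466"

# ASSUMPTION: a tiny, hypothetical table of hard-coded ZWJ glyphs.
# A real font ships many more; only the two from the post are listed.
PRESET = {
    (MAN, GIRL, BOY),
    (MAN, BOY),
}

def render_units(sequence, preset=PRESET):
    """Split a ZWJ sequence into the glyphs a simple renderer might draw.

    At each position, take the longest run of emoji found in `preset`;
    if no run matches, fall back to the single emoji at that position.
    """
    parts = tuple(sequence.split(ZWJ))
    glyphs = []
    i = 0
    while i < len(parts):
        # Try the longest candidate first, shrinking towards one emoji.
        for end in range(len(parts), i, -1):
            if end - i == 1 or parts[i:end] in preset:
                glyphs.append(parts[i:end])
                i = end
                break
    return glyphs

# One preset glyph covers the whole of MAN-GIRL-BOY.
print(len(render_units(MAN + ZWJ + GIRL + ZWJ + BOY)))
# MAN-BOY-GIRL falls back to the MAN-BOY preset plus a separate GIRL.
print(len(render_units(MAN + ZWJ + BOY + ZWJ + GIRL)))
```

Under these assumptions the first call yields one glyph and the second yields two, which matches what is displayed: nothing new is generated; the renderer only picks from what is already in the table.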

Hopefully, at some future point, sequences of glyphs, joiners, modifiers, and so on, will act as an instruction to dynamically generate a new glyph.

— Ken


In reply to Re^4: uparse - Parse Unicode strings by kcott
in thread uparse - Parse Unicode strings by kcott
