My module creates "User-Defined Character Properties" for Unicode characters in regular expressions (I think they're called character classes, but please correct me if I've misunderstood). I would like to add a function that shows the characters included in each defined group. One such group, in the package file, might look like this:
sub InThaiHCons { #Thai high-class consonants
    return join "\n",
        '0E02', #KHO KHAI
        '0E03', #KHO KHUAT
        '0E09', #CHO CHING
        '0E10', #THO THAN
        '0E16', #THO THUNG
        '0E1C', #PHO PHUNG
        '0E1D', #FO FA
        '0E28', #SO SALA
        '0E29', #SO RUSI
        '0E2A', #SO SUA
        '0E2B', #HO HIP
}
...or like this:
sub InThaiLCons { #Thai low-class consonants
    return <<'END';
0E04
0E07
0E0A
0E0D
0E11
0E13
0E17
0E19
0E1E
0E27
0E2C
0E2E
END
}
How could the calling (main) program be provided a list of each codepoint associated with that particular character class?

For example, I would like something like this...

my @characters = list('InThaiHCons');
#OR PERHAPS
my @characters = MyModule::list('InThaiHCons'); #IF FUNCTION IS NOT EXPORTED
print @characters;
#[The site would not print the UTF-8 characters here--see below the code box.]
#Codepoints, e.g. '0E01' instead of actual characters would also be acceptable, as it should be trivial to convert them.

#ขฃฉฐถผฝศษสห

Is it possible to create a function that would provide such a list as this without having to duplicate the list in the module? Alternatively, is there any function by which such character classes can be spelled out already, e.g. is there a way to query what /\p{IsThai}/ includes?
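A minimal sketch of one way such a function might work without duplicating the list: look the property sub up by name, call it, and parse the string it returns. The package name MyModule and the helper name list are taken from the wish above and are hypothetical. The sketch assumes the sub returns only plain hex code points or two-hex ranges, one per line; lines that pull in other properties (those starting with +, -, ! or &) are skipped.

```perl
use strict;
use warnings;

package MyModule;

sub InThaiHCons { #Thai high-class consonants
    return join "\n",
        '0E02', '0E03', '0E09', '0E10', '0E16', '0E1C',
        '0E1D', '0E28', '0E29', '0E2A', '0E2B';
}

# Hypothetical helper: expand a user-defined property sub into characters.
sub list {
    my ($name) = @_;
    my $sub = MyModule->can($name)
        or die "No such property sub: $name";
    my @codepoints;
    for my $line (split /\n/, $sub->()) {
        # Accept "HEX" or "HEX HEX" (a range); skip anything else.
        next unless $line =~ /^\s*([0-9A-Fa-f]+)(?:\s+([0-9A-Fa-f]+))?\s*$/;
        my $start = hex $1;
        my $end   = defined $2 ? hex $2 : $start;
        push @codepoints, $start .. $end;
    }
    return map { chr($_) } @codepoints;
}

package main;

binmode STDOUT, ':encoding(UTF-8)';
my @characters = MyModule::list('InThaiHCons');
print @characters, "\n";
```

Because the helper calls the same sub that the regex engine uses, the list lives in one place only. Returning code points instead of characters would just mean dropping the final map.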

Note that I have read documentation on the subject and found the following, but I do not understand it, and it does not seem to do quite what I need.

https://perldoc.perl.org/Unicode::UCD#prop_invlist%28%29
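For reference, a hedged sketch of how prop_invlist() is typically used for a built-in property. It returns an inversion list: the even-indexed elements start a run of matching code points, and the odd-indexed elements are the first code point after each run. The helper name expand_invlist is hypothetical, and 'Script=Thai' is an assumption about what \p{IsThai} corresponds to (newer Perls may treat it as Script_Extensions instead). A reasonably recent Perl that ships this function is also assumed.

```perl
use strict;
use warnings;
use Unicode::UCD qw(prop_invlist);

# Hypothetical helper: expand an inversion list into individual code points.
sub expand_invlist {
    my @invlist = @_;
    my @codepoints;
    for (my $i = 0; $i < @invlist; $i += 2) {
        # An odd number of elements means the last run extends to the
        # maximum Unicode code point.
        my $upper = $i + 1 < @invlist ? $invlist[$i + 1] - 1 : 0x10FFFF;
        push @codepoints, $invlist[$i] .. $upper;
    }
    return @codepoints;
}

my @thai = expand_invlist(prop_invlist('Script=Thai'));
printf "%04X\n", $_ for @thai;
```

This only covers properties Perl already knows about; a user-defined sub such as InThaiHCons would still need its own string parsed.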

Blessings,

~Polyglot~


In reply to Listing out the characters included in a character class by Polyglot
