It's usually cleaner and clearer just to have the one name for any given piece of functionality, however.

I agree with this; however, others before me have already given us such synonyms as "InThai" and "IsThai", so those who come along later may not know which form to use. Sigh. To my mind, "InThai" looks like it represents a range and "IsThai" a quality--though in this case the two cover largely the same codepoints. The same holds for all of my Thai character groupings--essentially any time more than one character is involved. Because of this overlap, and because the choice boils down to mere semantics and what people will remember or prefer, I think it best to create the secondary names across the board, for flexibility and compatibility, even for single-codepoint returns.
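A minimal sketch of what "secondary names across the board" could look like with Perl's user-defined properties (subs whose names begin with "In" or "Is" and which return hex code points, one entry per line). The grouping names InThaiDigit/IsThaiDigit and InThaiRepeat/IsThaiRepeat are hypothetical, chosen only for illustration; they are not the groupings from the thread.

```perl
use strict;
use warnings;

# Hypothetical grouping: the Thai digits, U+0E50..U+0E59.
# Each returned line is either "start<TAB>end" (a range) or one hex value.
sub InThaiDigit { return "0E50\t0E59\n" }

# Secondary name: the same set exposed under the "Is" prefix.
sub IsThaiDigit { return InThaiDigit() }

# Hypothetical single-codepoint grouping: U+0E46 THAI CHARACTER MAIYAMOK.
sub InThaiRepeat { return "0E46\n" }
sub IsThaiRepeat { return InThaiRepeat() }

# Check a few code points against both spellings of each property.
for my $cp (0x0E01, 0x0E46, 0x0E50, 0x0E59) {
    my $ch = chr $cp;
    printf "U+%04X  InThaiDigit=%d IsThaiDigit=%d InThaiRepeat=%d IsThaiRepeat=%d\n",
        $cp,
        ($ch =~ /\p{InThaiDigit}/  ? 1 : 0),
        ($ch =~ /\p{IsThaiDigit}/  ? 1 : 0),
        ($ch =~ /\p{InThaiRepeat}/ ? 1 : 0),
        ($ch =~ /\p{IsThaiRepeat}/ ? 1 : 0);
}
```

Having the "Is" sub simply call the "In" sub keeps one definition per grouping, so the two names cannot drift apart.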

The Perl documentation is poor in this respect and does not clarify the distinctions among \p{Thai}, \p{InThai}, and \p{IsThai}. An explanation is offered at https://www.regular-expressions.info/unicode.html, which says:

Not all Unicode regex engines use the same syntax to match Unicode blocks. Java, Ruby 2.0, and XRegExp use the \p{InBlock} syntax as listed above. .NET and XML use \p{IsBlock} instead. Perl and the JGsoft flavor support both notations. I recommend you use the “In” notation if your regex engine supports it. “In” can only be used for Unicode blocks, while “Is” can also be used for Unicode properties and scripts, depending on the regular expression flavor you’re using. By using “In”, it’s obvious you’re matching a block and not a similarly named property or script.
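One way to see how a given Perl actually treats the three spellings is to test every code point in U+0E00..U+0E7F against each of them. This is a sketch, not an authoritative statement of the rules: as far as I know, Perl reads \p{InThai} as the Thai block (unassigned code points included) and \p{IsThai} as a synonym for the script property \p{Thai}, and the exact script count depends on the Unicode version bundled with your Perl.

```perl
use strict;
use warnings;

my %count  = (Thai => 0, InThai => 0, IsThai => 0);
my $differ = 0;    # code points where InThai and IsThai disagree

# Walk the range U+0E00..U+0E7F and test each spelling separately.
for my $cp (0x0E00 .. 0x0E7F) {
    my $ch     = chr $cp;
    my $script = $ch =~ /\p{Thai}/   ? 1 : 0;
    my $in     = $ch =~ /\p{InThai}/ ? 1 : 0;
    my $is     = $ch =~ /\p{IsThai}/ ? 1 : 0;

    $count{Thai}   += $script;
    $count{InThai} += $in;
    $count{IsThai} += $is;
    $differ++ if $in != $is;
}

printf "%-8s matches %3d code points\n", "\\p{$_}", $count{$_}
    for qw(Thai InThai IsThai);
print "InThai and IsThai disagree on $differ code points\n";
```

If the last line reports a non-zero number, the "In" and "Is" forms are not interchangeable on that Perl, which is worth knowing before choosing one spelling over the other.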

Blessings,

~Polyglot~


In reply to Re^4: Listing out the characters included in a character class by Polyglot
in thread Listing out the characters included in a character class by Polyglot
