in reply to •Re: How are regex character classes implemented?
in thread How are regex character classes implemented?

The program I'm discussing was "prototyped" in Perl to transform a native file into XML. Now that feature is being integrated into the main program, which is written in C++. I don't need general parsing/matching features, just a way to tell whether a string is a legal XML identifier.

Too bad specifications like that list all the legal characters individually, rather than referring to Unicode Character Database properties!
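For what it's worth, here is a minimal sketch of how such a check might look in C++ without any regex engine: sorted tables of code point ranges plus a binary search. It makes several assumptions that are not from the spec I was working against. It uses the simplified `NameStartChar`/`NameChar` ranges from the later XML 1.0 Fifth Edition rather than the per-character lists, it expects the input to be already decoded to UTF-32 (no UTF-8 handling), and the names `Range`, `inRanges`, and `isXmlName` are made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>

// Inclusive range of Unicode code points.
struct Range { char32_t lo, hi; };

// Assumed: NameStartChar from XML 1.0 Fifth Edition, sorted ascending.
static const Range kNameStart[] = {
    {U':', U':'}, {U'A', U'Z'}, {U'_', U'_'}, {U'a', U'z'},
    {0xC0, 0xD6}, {0xD8, 0xF6}, {0xF8, 0x2FF}, {0x370, 0x37D},
    {0x37F, 0x1FFF}, {0x200C, 0x200D}, {0x2070, 0x218F},
    {0x2C00, 0x2FEF}, {0x3001, 0xD7FF}, {0xF900, 0xFDCF},
    {0xFDF0, 0xFFFD}, {0x10000, 0xEFFFF},
};

// Assumed: the extra characters NameChar allows after the first position.
static const Range kNameExtra[] = {
    {U'-', U'.'}, {U'0', U'9'}, {0xB7, 0xB7},
    {0x300, 0x36F}, {0x203F, 0x2040},
};

// Binary search: find the first range starting after c, then test
// whether the range just before it contains c.
template <std::size_t N>
bool inRanges(const Range (&table)[N], char32_t c) {
    const Range* it = std::upper_bound(
        std::begin(table), std::end(table), c,
        [](char32_t value, const Range& r) { return value < r.lo; });
    return it != std::begin(table) && c <= (it - 1)->hi;
}

// True if s matches: NameStartChar (NameChar)*
bool isXmlName(const std::u32string& s) {
    if (s.empty() || !inRanges(kNameStart, s[0])) return false;
    for (std::size_t i = 1; i < s.size(); ++i) {
        if (!inRanges(kNameStart, s[i]) && !inRanges(kNameExtra, s[i]))
            return false;
    }
    return true;
}
```

With these tables, `isXmlName(U"xs:element")` should be accepted and `isXmlName(U"1abc")` rejected. A table built from character properties would be shorter still, which is really the point of the complaint above.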

Doing a good job of that is low priority, but interesting to me.

I think the last time I looked at PCRE (if it's the same library I saw before), it didn't handle Unicode. The one you point to mentions screwed-up experimental UTF-8 features, so maybe it has evolved.

Thanks, as always.

—John
