in reply to How are regex character classes implemented?

Even if you give up Perl, you don't have to give up Perl-like regular expressions. It obviously can't be all of Perl's regular expressions (not without including a Perl interpreter), but it's a good approximation.

-- Randal L. Schwartz, Perl hacker


Re: •Re: How are regex character classes implemented?
by John M. Dlugosz (Monsignor) on Jul 19, 2002 at 18:02 UTC
    The program I'm discussing was "prototyped" in Perl to transform a native file into XML. Now that feature is being integrated into the main program, which is written in C++. I don't need general parsing/matching features, just a way to tell whether a string is a legal XML identifier.

    Too bad specifications like that list every legal character individually, rather than referring to Unicode character database properties!
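    For what it's worth, a check like this can be sketched in C++ as a sorted table of code point ranges plus a binary search, which is also roughly how a character class can be represented without a regex engine. This is only a sketch under stated assumptions: the ranges are the NameStartChar/NameChar productions as I understand them from XML 1.0 (Fifth Edition), which are much shorter than the per-character tables in earlier editions; the input is assumed to be already decoded into code points (no UTF-8 decoding here); it needs C++17; and the names Range, kNameStart, kNameExtra, in_ranges and is_xml_name are made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string_view>

// Inclusive range of Unicode code points.
struct Range { char32_t lo, hi; };

// Assumed: NameStartChar ranges from XML 1.0 (Fifth Edition), sorted ascending.
constexpr Range kNameStart[] = {
    {U':', U':'}, {U'A', U'Z'}, {U'_', U'_'}, {U'a', U'z'},
    {0xC0, 0xD6}, {0xD8, 0xF6}, {0xF8, 0x2FF}, {0x370, 0x37D},
    {0x37F, 0x1FFF}, {0x200C, 0x200D}, {0x2070, 0x218F},
    {0x2C00, 0x2FEF}, {0x3001, 0xD7FF}, {0xF900, 0xFDCF},
    {0xFDF0, 0xFFFD}, {0x10000, 0xEFFFF},
};

// Extra ranges allowed after the first character ("-" and "." are adjacent).
constexpr Range kNameExtra[] = {
    {U'-', U'.'}, {U'0', U'9'}, {0xB7, 0xB7}, {0x300, 0x36F}, {0x203F, 0x2040},
};

// Binary search: find the first range whose upper bound is >= c,
// then check that c is not below that range's lower bound.
template <std::size_t N>
bool in_ranges(const Range (&table)[N], char32_t c) {
    const Range* it = std::lower_bound(
        std::begin(table), std::end(table), c,
        [](const Range& r, char32_t v) { return r.hi < v; });
    return it != std::end(table) && it->lo <= c;
}

// True if s is non-empty, starts with a name-start character, and every
// following character is a name-start or extra name character.
bool is_xml_name(std::u32string_view s) {
    if (s.empty() || !in_ranges(kNameStart, s[0])) return false;
    for (char32_t c : s.substr(1)) {
        if (!in_ranges(kNameStart, c) && !in_ranges(kNameExtra, c)) return false;
    }
    return true;
}
```

    A call such as is_xml_name(U"_a-1.b") should accept the string, while U"1abc" should be rejected because a digit cannot start a name. Swapping in tables from a different edition of the spec would only mean changing the two arrays, as long as they stay sorted and non-overlapping.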

    Doing a good job of that is low priority, but interesting to me.

    I think last time I looked at PCRE (if it's the same library I saw before), it didn't handle Unicode. The one you point to mentions screwed-up experimental UTF-8 features, so maybe it's evolved.

    Thanks, as always.

    —John