in reply to regular expression-xerox

Character classes like [ABCD] can be converted to alternation as follows:
my $class = 'ABCD';
my $xerox = join '|', split //, $class;   # create alternation
$xerox = '(?:' . $xerox . ')';            # non-capturing grouping

-Mark

Re: regular expression-xerox
by Abigail-II (Bishop) on May 03, 2004 at 21:09 UTC
    Not quite. You haven't considered:
    1. Characters that have a special meaning, like -, ^, and ] (and that meaning is position dependent!)
    2. Characters that inside a character class don't have a special meaning, but have one outside the class, like +, ?, * and others.
    3. POSIX character class syntax.

    Abigail
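A minimal sketch of how point 2 could be handled, assuming the class body is just a list of literal characters. The helper name chars_to_alternation is hypothetical, not something from the thread. It escapes each character with Perl's built-in quotemeta so that +, ?, * and friends stay literal outside the class. It makes no attempt at ranges, negation, or POSIX classes (points 1 and 3); those would need a real parser.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): turn a plain list of literal
# characters into a non-capturing alternation. Each character is escaped
# with quotemeta so it keeps its literal meaning outside a class.
# Assumption: the input has no ranges (A-Z), no leading ^, no [:alpha:].
sub chars_to_alternation {
    my ($chars) = @_;
    my @alts = map { quotemeta($_) } split //, $chars;
    return '(?:' . join('|', @alts) . ')';
}

print chars_to_alternation('A+?*'), "\n";   # prints (?:A|\+|\?|\*)
```

For purely alphanumeric input such as 'ABCD' this gives the same result as the unescaped version, since quotemeta leaves word characters alone.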

      My solution correctly answers the particular requirement the OP stated: convert the character class 'ABCD' to a form that uses alternation. If one extrapolates that requirement to all alphanumerics, then my type of solution still works.

      If one extrapolates to metacharacters like those in 1. and 2., or to predefined POSIX classes or Unicode characters, as in 3., then obviously the parser and translator must be extended to handle these situations.

      But for the simple requirements stated by the OP, a simple solution is best.

      -Mark

Re: Re: regular expression-xerox
by oz (Novice) on May 04, 2004 at 10:14 UTC
    May I ask what ? means in the regular expression? I cannot use ? in the language I am translating to, since it already has a meaning there: any character. Another question: can't this be done with a substitution routine? I need to globally change each occurrence of [] to | in the regular expression. One note: my character classes include only capital letters, as in the simplest example I gave. Thanks to everyone offering help :)
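On the ? question: in Perl, (?: ... ) is a grouping that does not capture, so the ? there is part of the group syntax rather than a quantifier. A substitution-based version might look like the sketch below. The helper name classes_to_alternation is hypothetical, and it assumes that the target language accepts plain parentheses for grouping, which the thread does not confirm. It only rewrites classes made up entirely of capital letters, as described; anything else, such as [A-D] or [^AB], is left untouched.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): globally rewrite every
# [ABC]-style class of capital letters into (A|B|C).
# Plain parentheses are used instead of (?:...) so no ? appears in the
# output. Assumption: the target language groups with ( ) and uses | for
# alternation.
sub classes_to_alternation {
    my ($pattern) = @_;
    # /g handles every class in the string; /e evaluates the replacement
    # as Perl code so the captured letters can be split and rejoined.
    $pattern =~ s{\[([A-Z]+)\]}{ '(' . join('|', split //, $1) . ')' }ge;
    return $pattern;
}

print classes_to_alternation('X[ABCD]Y[EF]'), "\n";   # prints X(A|B|C|D)Y(E|F)
```

Note that plain parentheses capture in Perl, which matters only if the converted pattern is also run in Perl and relies on group numbers.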