
(I am using the terms "parser", "tokenizer / lexer", and "abstract syntax tree (AST)" in their compiler sense.)

I was pointed to the YAPE::Regex module ("Yet Another Parser/Extractor for Regular Expressions"). From the documentation and my experiments in the debugger, I believe it to be only a lexer, not a proper parser. As far as I can tell, it breaks the text of a RE down into strings of characters with particular meanings (tokens), but it doesn't assemble those tokens into a hierarchy (an abstract syntax tree).

For a quick look at the AST of a RE, see the indented text under "Debugging regular expressions" in man perldebguts, or run:

perl -Mre=debug -e '$re = qr/^a(b(cd?)?)?/;'

To do one of the transformations that many of us have thought up, I should be operating on the AST, not just on the individual tokens. Also, I would have to break each EXACT node into a sequence of single-character EXACT nodes.
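
To make the EXACT-splitting step concrete, here is a minimal sketch. It assumes the transformation in question is the "match any leading part of a literal" one, in the shape of the qr/^a(b(cd?)?)?/ example above. It works on a plain literal string, not on a real regex AST, and prefix_pattern is a hypothetical helper name of my own, not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: treat $literal as the text of one EXACT node,
# split it into single characters, and nest every character after the
# first in an optional group, so a match may stop after any character.
sub prefix_pattern {
    my ($literal) = @_;
    return '' if $literal eq '';

    # One escaped piece per character (the "single-character EXACT nodes").
    my @chars = map { quotemeta } split //, $literal;
    my $first = shift @chars;

    # Build the nested optional groups from the last character outward.
    my $tail = '';
    for my $ch (reverse @chars) {
        $tail = "(?:$ch$tail)?";
    }
    return $first . $tail;
}

my $pat = prefix_pattern('abcd');
print "$pat\n";    # a(?:b(?:c(?:d)?)?)?

for my $try (qw(a ab abc abcd abx b)) {
    printf "%-5s %s\n", $try, ($try =~ /^$pat$/ ? 'matches' : 'no match');
}
```

A real implementation would have to do this per EXACT node while walking the tree, so that groups, alternations and quantifiers around the literals keep their structure. That is exactly the part a token stream alone doesn't give me.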