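In that case, I think a regex like this would work as the definition for rawtext:

    m{(?:[^<]|<(?!(?:/[a-zA-Z]+|[a-zA-Z]+/?)>))+}

That is, rawtext is one or more characters, where a < counts as raw text unless it starts something that looks like a tag: </name>, <name>, or <name/>. I did a few quick tests of this regex with demo_simpleXML.pl, and it worked as intended.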
(The redundant [a-zA-Z]+ could be eliminated using the (?(condition)...) regex feature, added in perl 5.005:

    m{(?:[^<]|<(?!(/)?[a-zA-Z]+(?(1)|/?)>))+}

If the (/) matches, then (?(1)|/?) will match the null string; if the (/) does not match, then (?(1)|/?) will match /?. So, / can be at the beginning or the end, but not both.)
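For anyone who wants to try the two patterns outside of a grammar, here is a minimal standalone sketch. The helper name leading_rawtext and the sample strings are made up for illustration; they are not from demo_simpleXML.pl. The helper deliberately avoids wrapping the pattern in an extra capturing group, since (?(1)...) refers to group 1 by absolute number and an enclosing group would change what it tests.

```perl
use strict;
use warnings;

# The two rawtext patterns from the post, compiled once.
my $rawtext      = qr{(?:[^<]|<(?!(?:/[a-zA-Z]+|[a-zA-Z]+/?)>))+};
my $rawtext_cond = qr{(?:[^<]|<(?!(/)?[a-zA-Z]+(?(1)|/?)>))+};

# Hypothetical helper: return the run of raw text at the start of $input,
# or the empty string if $input starts with something tag-like.
# No capture group is added around $re, so group numbering inside it is unchanged.
sub leading_rawtext {
    my ($re, $input) = @_;
    return $input =~ /\A$re/ ? substr($input, 0, $+[0]) : '';
}

# Illustrative inputs: stray "<" characters, and each tag shape the lookahead rejects.
my @samples = ('a < b <tag>', '1 <2> 3</end>', 'x <br/> y', '</a/> more<p>', '<b>bold');

for my $input (@samples) {
    printf "%-16s plain: '%s'  conditional: '%s'\n",
        $input,
        leading_rawtext($rawtext,      $input),
        leading_rawtext($rawtext_cond, $input);
}
```

Both patterns should stop at the same place for every sample. Note that </a/> is treated as raw text by both, which is the "beginning or the end, but not both" behaviour described above.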
In reply to Re: Parse::RecDescent Woes
by chipmunk
in thread Parse::RecDescent Woes
by beppu