in reply to Parse::RecDescent Grammar Fun

Your problem is the greediness of punct. If it's OK for !@#$ to be parsed as four tokens, you should be able to get away with:
punct: /[^\w\s]/ {print "<punct: $item[1]>" }
although I have not tested it.
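
For a quick sanity check outside the grammar, here is a minimal sketch that applies the same character class with plain Perl regex matching. It does not use Parse::RecDescent at all; the variable names ($punct, $input, @tokens) are just illustrative, and it only shows what the one-character pattern does to the sample string.

```perl
use strict;
use warnings;

# Same character class as the punct rule above: exactly one character
# that is neither a word character nor whitespace.
my $punct = qr/[^\w\s]/;

# Single quotes keep Perl from trying to interpolate anything in the sample.
my $input = '!@#$';

# In list context, a /g match without capture groups returns every match.
my @tokens = $input =~ /$punct/g;

print "<punct: $_>" for @tokens;
print "\n";
```

Each character comes back as its own token, which is the trade-off mentioned above.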

Otherwise, you may want to try something like (also untested):

punct: /(?:[^\w\s[]+|\[[^\w\s[]*|\[\[(?!.+?]]))+/ {print "<punct: $item[1]>"}

Note that your link rule consumes "[[]]]" completely: a leading [[, a ] as the content, and a trailing ]]. Similarly, "[[]] foo [[]]" is consumed completely by the link rule, with "]] foo [[" as the part inside the 'link'.
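
To see this for yourself, here is a small sketch in plain Perl. The pattern in $link is an assumption: a stand-in of the form "[[", a non-greedy run of at least one character, then "]]", since the actual link rule is not repeated in this reply. The %seen hash is only there to record the results.

```perl
use strict;
use warnings;

# ASSUMPTION: stand-in for the link rule, not the original rule itself.
# "[[", then one or more characters (non-greedy), then "]]".
my $link = qr/\[\[(.+?)\]\]/;

my %seen;
for my $input ('[[]]]', '[[]] foo [[]]') {
    # The outer parens capture the whole match as $1; the group inside
    # $link then becomes $2 (the 'content' of the link).
    next unless $input =~ /($link)/;
    my ($whole, $content) = ($1, $2);
    $seen{$input} = [$whole, $content];
    printf "input '%s' matched '%s' with content '%s'\n",
        $input, $whole, $content;
}
```

Under that assumption, both inputs are matched in full, because .+? has to take at least one character before it can look for the closing ]].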

Abigail

Re: Re: Parse::RecDescent Grammar Fun
by ichimunki (Priest) on Jul 25, 2002 at 14:21 UTC
    Yes, I think letting punct be single character tokens is the way to go here. That regex is probably intuitive to some, but it looks like a maintenance nightmare.

    Thanks for the info on the link rule. I had completely forgotten to stop to think about what all that would match.