in reply to Parse::RecDescent: problem with grammar and error reporting

So, you basically want to allow a @ as long as it's not preceded by whitespace? \b will not do that -- that just forces a word character to be present.
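
To make the \b point concrete, here is a small sketch comparing `\b@` with a lookbehind for a non-whitespace character. The helper names and the sample strings are made up for illustration; they are not from the original grammar.

```perl
use strict;
use warnings;

# Hypothetical helper: true if some "@" has a word boundary right before it,
# i.e. the character before the "@" is a word character.
sub boundary_before_at {
    my ($s) = @_;
    return $s =~ /\b\@/ ? 1 : 0;
}

# Hypothetical helper: true if some "@" is preceded by any non-whitespace character.
sub nonspace_before_at {
    my ($s) = @_;
    return $s =~ /(?<=\S)\@/ ? 1 : 0;
}

for my $s ('address@test.com', 'test @20', 'smile :)@home') {
    printf "%-18s %s: %d   %s: %d\n",
        $s, '\b@', boundary_before_at($s), '(?<=\S)@', nonspace_before_at($s);
}
```

The first string satisfies both checks and the second satisfies neither. The third has a `)` before the @, so only the lookbehind accepts it, which is the case where \b and "not preceded by whitespace" disagree.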

Suggestion (untried):

text: /[^@\n]*(?:\S@[^@\n]*)*/
That still allows text like:
foo bar@
I cannot deduce whether you want to allow that or not.

It's not the most efficient regex, as it typically will backtrack one character on each (valid) @ character encountered. But since Parse::RecDescent is a massive backtracking engine written in Perl, this is likely to be acceptable.

Re^2: Parse::RecDescent: problem with grammar and error reporting
by kikuchiyo (Hermit) on Jan 20, 2012 at 11:15 UTC

    The suggestion doesn't work: it fails at "Even more test address@test.com @20". I guess it's because \S is not a zero-width assertion. It actually wants to consume a non-whitespace character, but those were already gobbled up by [^@\n]*.
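
For comparison, here is a sketch of what a zero-width version could look like, swapping the consuming \S for a lookbehind. This is a different technique from the suggestion above, not something proposed in the thread, and `$lookbehind_re` and `match_with_lookbehind` are names invented for the example. Note that it rejects an @ at the very start of the string, which may or may not be wanted.

```perl
use strict;
use warnings;

# Assumed alternative: consume one character at a time, and accept "@"
# only when the (zero-width) lookbehind sees a non-whitespace character.
my $lookbehind_re = qr/(?:[^\@\n]|(?<=\S)\@)*/;

# Hypothetical helper: return the prefix of $s matched by the pattern.
sub match_with_lookbehind {
    my ($s) = @_;
    return $s =~ /^($lookbehind_re)/ ? $1 : undef;
}

# Stops right before the " @20" part, because that "@" follows a space.
print '[', match_with_lookbehind('Even more test address@test.com @20'), "]\n";
# → [Even more test address@test.com ]

# A trailing "@" directly after a word is accepted.
print '[', match_with_lookbehind('foo bar@'), "]\n";
# → [foo bar@]
```

Since the lookbehind consumes nothing, there is no competition with the preceding character class over who gets the character before the @.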

    "foo bar@" should be allowed.

    Interestingly, this version does print the error messages, but I don't understand why.

      Ah, it's because if one has /PAT1*PAT2*/ Perl prefers to match as many PAT1s as possible, even to the extent of matching less in total. Witness the difference:
      $ perl -wE 'q{Even more test address@test.com @20} =~ /[^@\n]*(?:\S@[^@\n]*)*/ and say $&'
      Even more test address
      $ perl -wE 'q{Even more test address@test.com @20} =~ /[^@\n]*(?:\S@[^@\n]*)+/ and say $&'
      Even more test address@test.com
      Try this:
      /[^\S\n]*(?:[^\s@]+@?[^\S\n]*)*/
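
A quick way to see what that regex consumes, anchored at the start of the string roughly the way a Parse::RecDescent terminal would be tried. The helper name `match_tokens` and the third sample are assumptions added for the example.

```perl
use strict;
use warnings;

# The regex from the post, unchanged: optional horizontal whitespace, then
# any number of "word, optional @, horizontal whitespace" groups.
my $token_re = qr/[^\S\n]*(?:[^\s@]+@?[^\S\n]*)*/;

# Hypothetical helper: return the prefix of $s matched by the pattern.
sub match_tokens {
    my ($s) = @_;
    return $s =~ /^($token_re)/ ? $1 : undef;
}

for my $s ('Even more test address@test.com @20', 'foo bar@', '') {
    my $m = match_tokens($s);
    print "[$s] -> [", (defined $m ? $m : 'no match'), "]\n";
}
```

The first line is matched up to and including the space before "@20", and "foo bar@" is matched in full. The empty string also matches, with an empty result, because every part of the pattern is optional.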

        Nice.

        Only problem with it is that it matches the empty line, which it shouldn't do. This way every line ends up on one unterminated slide, so the parsing fails at the end.
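
One possible way around the empty match, as an untested-in-the-grammar sketch: change the final * to +, so at least one word has to be present. Whether that is right depends on the rest of the grammar, which is not shown here, and `$nonempty_re` and `match_nonempty` are names invented for the example.

```perl
use strict;
use warnings;

# Assumed variant of the regex above: the trailing "*" becomes "+",
# so the pattern needs at least one non-whitespace, non-@ word.
my $nonempty_re = qr/[^\S\n]*(?:[^\s@]+@?[^\S\n]*)+/;

# Hypothetical helper: return the matched prefix, or undef when nothing matches.
sub match_nonempty {
    my ($s) = @_;
    return $s =~ /^($nonempty_re)/ ? $1 : undef;
}

for my $s ('foo bar@', '', "\nnext line") {
    my $m = match_nonempty($s);
    print defined $m ? "matched [$m]\n" : "no match\n";
}
```

With this variant "foo bar@" still matches in full, while the empty string and a string that starts with a newline no longer match at all. A line containing only spaces is also rejected, which may or may not be the desired behaviour.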