in reply to Re: Regular Expressions
in thread Regular Expressions

Given the code:
#!/usr/bin/perl
use strict;
use warnings;
my $test_string = "This is not a waffle";
print "Match: $1\n" if $test_string =~ /(?:f)\1/;
I get the following error (Win2K, perl v5.8.6):
Reference to nonexistent group in regex; marked by <-- HERE in m/(?:f)\1 <-- HERE / at C:\Documents\foo.pl line 8.
(where line 8 is the last line in the snippet)

Note that this is actually referring to the lack of capturing parens. If I change the last line to
print "Match: $1\n" if $test_string =~ /((?:f)\1)/;
there are no errors but also no match. Interesting, if you ask me.
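
For anyone who wants to reproduce both cases in one run, here is a minimal sketch. It assumes the patterns are built from strings at run time, so that eval can trap the compile failure instead of the script aborting. The %result hash and the loop are scaffolding added for illustration and are not part of the original snippet.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $test_string = "This is not a waffle";

# Compile each pattern from a string inside eval, so that a regex
# compile failure is trapped instead of killing the script.
my %result;
for my $pattern ('(?:f)\1', '((?:f)\1)') {
    my $re = eval { qr/$pattern/ };
    if (!defined $re) {
        $result{$pattern} = 'compile error';
    }
    elsif ($test_string =~ $re) {
        $result{$pattern} = 'match';
    }
    else {
        $result{$pattern} = 'no match';
    }
    print "$pattern: $result{$pattern}\n";
}
```

If the behaviour described above holds on your perl, the first pattern is reported as a compile error and the second as a plain non-match.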

Re^3: Regular Expressions
by chas (Priest) on May 17, 2005 at 21:21 UTC
    In the first case, nothing is stored in \1, so there shouldn't be any match (and it is an error). In the latter case, \1 probably refers to the outer parentheses, but that group hasn't finished capturing yet, so there is no match and no error (perhaps it doesn't really make sense, but perl tries to interpret it that way...)
    If you instead do
    print "Match: $1\n" if $test_string =~ /((?:f))\1/;
    the result is: Match: f
    (Doing that doesn't seem especially useful, though...)
    chas
      Allow me to clarify a bit.

      What I find interesting is that for the first case perl doesn't warn about it at all*. If I have a regex that uses backreferences but has no capturing parens, I'd expect that perl (with use warnings enabled) would mention something about it.

      I'm not saying it's wrong behavior, and I understand why it's not matching. It just didn't provide the warning I'd expect.

      *update: Or, to put it another way: the error message "Reference to nonexistent group in regex;" goes away if the regex is changed slightly. Consider:
      print "matches\n" if $test_string =~ /(?:f)\1(.)/;
      There is a valid set of capturing parens there, but it comes after the point where the \1 backreference is used. So this won't match -- and Perl doesn't warn about it. Surprised me a bit.
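
      One way to check the "no warning" part rather than eyeball it is to collect warnings in a handler and count them. This is only a sketch: the pattern is compiled from a string at run time so that compilation happens after the handler is installed, and the names @warnings, $compiled and $matched are made up for the example.

```perl
use strict;
use warnings;

my $test_string = "This is not a waffle";

# Collect warnings instead of printing them, so they can be counted afterwards.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, @_ };

# Build the pattern at run time so that it is compiled after the
# warning handler is installed, and inside eval in case it dies.
my $pattern = '(?:f)\1(.)';
my $re      = eval { qr/$pattern/ };

my $compiled = defined $re ? 1 : 0;
my $matched  = ($compiled && $test_string =~ $re) ? 1 : 0;

print "compiled: $compiled\n";
print "matched:  $matched\n";
print "warnings: ", scalar(@warnings), "\n";
```

      The warning count printed at the end shows whether your perl says anything about the forward reference.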

        Yes, it is a bit surprising/confusing that there isn't a warning. I was just trying to guess why there isn't an error message ("nonexistent group") as in the previous case, and maybe it is for the reason I stated (i.e. \1 refers to the outer parens); I'm really not convinced, though.
        chas
Re^3: Regular Expressions
by mrborisguy (Hermit) on May 17, 2005 at 20:31 UTC
    hmm... interesting. thanks for tryin' that out for me!
Re^3: Regular Expressions
by maard (Pilgrim) on May 18, 2005 at 08:31 UTC
    perl -e '$_="aaaa"; print "Match: $1\n" if /(?:(a)\1)/;'

    prints 'Match: a'
    because you asked perl to grab letter 'a' and then another 'a' after this.

    perl -e '$_="aaaa"; print "Match: $1\n" if /((?:a)\1)/;'
    prints nothing because at the point where \1 is reached there must already be a captured group, which isn't the case here: \1 is preceded only by non-capturing parens, and the outer group has not been closed yet.

    As for the absence of a warning in ((?:a)\1): maybe the regexp engine only complains at compile time if the regex contains no capturing parens at all (perhaps it just doesn't do the more complicated compile-time check of whether the capturing parens really come before \1).
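
    That guess can be probed with a few patterns that differ only in how many capturing groups they contain and where those groups sit. The sketch below is an experiment, not a description of the engine's internals. The compiles() helper and the '(a)\2' pattern are additions for illustration and are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the pattern compiles, 0 if compilation dies.
sub compiles {
    my ($pattern) = @_;
    my $re = eval { qr/$pattern/ };
    return defined $re ? 1 : 0;
}

my @patterns = (
    '(?:a)\1',     # no capturing groups at all
    '(?:a)\1(.)',  # one capturing group, but after the backreference
    '((?:a)\1)',   # \1 used inside group 1 itself
    '(a)\2',       # one capturing group, backreference to a second one
);

my %compiles = map { $_ => compiles($_) } @patterns;

for my $pattern (@patterns) {
    printf "%-12s %s\n", $pattern,
        $compiles{$pattern} ? 'compiles' : 'compile error';
}
```

    If only the first and last patterns fail, that is consistent with the check comparing the backreference number against the total number of capturing groups in the pattern, wherever they appear.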