nr0mx has asked for the wisdom of the Perl Monks concerning the following question:

Does qr// not work when used with () as the delimiters ?

Specifically, I am talking about the code pasted below. The perlop entry talks about qr() in the same breath as q(),qq(),qx(),etc and Programming Perl has this to say about them -
"Some of these are simply forms of "syntactic sugar" to let you avoid putting too many backslashes into quoted strings. Any non-alphanumeric, non-whitespace delimiter can be used in place of /. If the delimiters are single quotes, no variable interpolation is done on the pattern. If the opening delimiter is a parenthesis, bracket, brace, or angle bracket, the closing delimiter will be the matching construct."

Am I missing something here ?

use strict;
use diagnostics;
use Getopt::Long;

my ( $timing ) = 0 ;
my $regexp = qr(\s+execution\s+time\s+\(\s+millis\s+\)\s+\:\s+);

while(<DATA>) {
    chomp;
    do {
        ( $timing ) = /$regexp(\d+)/;
        print "$timing ";
    } if /$regexp/;
}
print "Done!\n";

__DATA__
PPP execution time ( millis ) : 1062
PPP execution time ( millis ) : 6460
PPP execution time ( millis ) : 7570
PPP execution time ( millis ) : 38646
PPP execution time ( millis ) : 45658
PPP execution time ( millis ) : 577

Re: qr delimiter question
by Jenda (Abbot) on Nov 17, 2003 at 16:27 UTC

    The problem is that the regexp contains \( and \). While parsing the qr() construct, Perl treats those backslashes as escaping the delimiter parentheses and "removes" them from the regexp. Compare:

    perl -MO=Deparse -e "$regexp = qr(\s+execution\s+time\s+\(\s+millis\s+\)\s+\:\s+);"

    and

    perl -MO=Deparse -e "$regexp = qr/\s+execution\s+time\s+\(\s+millis\s+\)\s+\:\s+/;"

    Jenda
    Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.
       -- Rick Osborne

    Edit by castaway: Closed small tag in signature

Re: qr delimiter question
by Anonymous Monk on Nov 17, 2003 at 16:25 UTC

    One point of alternate delimiters is to choose something you aren't trying to match as well. In your case, escaping the parens inside the regex is escaping their delimiting meaning, not their regex meaning. In other words, your embedded parens become simple capturing parens, not literal parens. Choose a different delimiter such as qr< ...>.
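
    A minimal sketch of this suggestion, assuming the pattern and one sample line from the question. The names $timing_re and $line are made up for illustration, and the capture group is folded into the qr<> itself. Because the delimiters are angle brackets, \( and \) reach the regex engine untouched and match literal parens:

```perl
use strict;
use warnings;

# Angle brackets delimit the pattern, so \( and \) keep their regex meaning
# (literal parentheses). $timing_re and $line are illustrative names.
my $timing_re = qr<\s+execution\s+time\s+\(\s+millis\s+\)\s+:\s+(\d+)>;

my $line = 'PPP execution time ( millis ) : 1062';

# In list context the match returns the captured groups.
my ($timing) = $line =~ $timing_re;
print "$timing\n" if defined $timing;
```

    Any delimiter that does not occur in the pattern would do equally well here; the slash form from the Deparse comparison is the most common choice.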

Re: qr delimiter question
by broquaint (Abbot) on Nov 17, 2003 at 16:29 UTC
    The problem is you're escaping the delimiters, so they're being seen as capturing parens as opposed to literal parens e.g
    print qr(\s+execution\s+time\s+\(\s+millis\s+\)\s+\:\s+);
    __output__
    (?-xism:\s+execution\s+time\s+(\s+millis\s+)\s+:\s+)
    A simple solution would be to use an alternate quoting delimiter such as square brackets or, simpler still, not to use a regex at all, e.g.
    print join ' ', map { chomp; (split)[-1] } grep /\s+\d+$/, <DATA>;
    print " Done!\n"
    HTH

    _________
    broquaint

Re: qr delimiter question
by EvdB (Deacon) on Nov 17, 2003 at 16:26 UTC
    If the opening delimiter is a parenthesis, bracket, brace, or angle bracket, the closing delimiter will be the matching construct.

    As you have a closing bracket in your regex you can not use it as a delimiter.

    --tidiness is the memory loss of environmental mnemonics

Re: qr delimiter question
by ysth (Canon) on Nov 17, 2003 at 17:27 UTC
    Can anybody explain why it does work if you make it:
    qr(\s+execution\s+time\s+[(]\s+millis\s+[)]\s+\:\s+)
    It seems inconsistent, somehow. It seems to me that either the above should require backslashes on the () to compile at all or the backslashes in the original example should make them match literal () characters.

    Can somebody point out my point of confusion on this point?

      Paired delimiters can nest. Hence: my $exp = qr(x(capture)y); is valid. Perl recognizes the parens inside the expression as being inside the expression. In the example you gave, there is a ( inside the expression before the ) inside the expression, so Perl understood it as a nested pair.
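
      A small sketch of the nesting, using the qr(x(capture)y) expression from above. The test string 'xcapturey' and the variable $captured are made up for illustration:

```perl
use strict;
use warnings;

# The inner parentheses are balanced, so they nest inside qr( ... )
# and behave as an ordinary capture group.
my $exp = qr(x(capture)y);

# 'xcapturey' is an illustrative test string, not from the thread.
my ($captured) = 'xcapturey' =~ $exp;
print "$captured\n";
```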

      You can see how perl is parsing the regex by printing it:

      >perl -e "print qr(a(b)c)"
      (?-xism:a(b)c)
      >perl -e "print qr(a\(b\)c)"
      (?-xism:a(b)c)
      >perl -e "print qr(a\\(b\\)c)"
      (?-xism:a\\(b\\)c)
      As you see, a single backslash is interpreted as escaping the parens from the qr(), rather than as part of the regex. A double-backslash escapes the backslash in the regex, so there's no backslashy way to match a literal paren. You have to use a character class, plus a backslash, unless you're going to have a balancing paren in the expression, in which case, perl will notice the balanced parens and Do The Right Thing.
      >perl -e "print qr(a\(bc)"
      Unmatched ( in regex; marked by <-- HERE in m/a( <-- HERE bc/ at -e line 1.
      >perl -e "print qr(a[(]bc)"
      Search pattern not terminated at -e line 1.
      >perl -e "print qr(a[\(]bc)"
      (?-xism:a[(]bc)
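
      Applied to the original question, the balanced character-class form might look like the sketch below. It assumes two sample lines taken from the question's data. The @timings array and the capture group added inside the qr() are illustrative, not part of the posted code:

```perl
use strict;
use warnings;

# [(] and [)] are character classes that match literal parentheses. The
# pair is balanced, so Perl treats it as nested inside the qr( ... )
# delimiters. The trailing (\d+) is another balanced pair: a capture group.
my $regexp = qr(\s+execution\s+time\s+[(]\s+millis\s+[)]\s+:\s+(\d+));

my @timings;    # illustrative name
for my $line ('PPP execution time ( millis ) : 1062',
              'PPP execution time ( millis ) : 577') {
    push @timings, $1 if $line =~ $regexp;
}
print "@timings\n";
```

      This keeps the () delimiters without relying on how backslashed delimiters are handled.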