johngg has asked for the wisdom of the Perl Monks concerning the following question:

When I wrote this post last night I originally intended to add comments to the compiled regular expressions using extended syntax. However, when I did that the code suddenly started generating a "Use of uninitialized value in concatenation (.) or string" warning in the expression and I couldn't see why. I can now see what the problem is but I'm puzzled as to why it is happening. If I put a variable name in the comment the warning appears, and I have to escape the $ sigil to make it go away, which is a bit worrying for text that is supposed to be of no significance to the compiler. Here are examples of the regex and what happens in each case.

Original regular expression, no warning generated

$rxNumberGrps = qr
    {(?x)
        \D* (\d+) (?=,|\))
        (?{$deCommafied .= $^N})
        (?: (??{$rxNumberGrps}) | \)\z )
    };

With offending comment added, warning generated

$rxNumberGrps = qr
    {(?x)
        \D* (\d+) (?=,|\))
        # append capture to $deCommafied
        (?{$deCommafied .= $^N})
        (?: (??{$rxNumberGrps}) | \)\z )
    };

Escape sigil, no warning generated

$rxNumberGrps = qr
    {(?x)
        \D* (\d+) (?=,|\))
        # append capture to \$deCommafied
        (?{$deCommafied .= $^N})
        (?: (??{$rxNumberGrps}) | \)\z )
    };

Use (?# ... ) style comment, no warning generated

$rxNumberGrps = qr
    {(?x)
        \D* (\d+) (?=,|\))
        (?# append capture to $deCommafied)
        (?{$deCommafied .= $^N})
        (?: (??{$rxNumberGrps}) | \)\z )
    };

Other than the warning being generated, there was no effect on the running of the script and the same output was produced each run. It looks to me like the extended syntax comments aren't stopping the interpolation of variables in some way or another. Is this some sort of bug or am I missing something?
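
The three comment styles above can be compared with a much smaller pattern. This is only a sketch under stated assumptions: $note is a hypothetical stand-in for $deCommafied, it is deliberately left undefined, and the (?{...}) and (??{...}) blocks are left out because they play no part in the warning. Warnings are collected in an array so each case can be counted.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so each case can be counted.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

my $note;    # hypothetical variable, deliberately left undefined

# Each entry: a label and a sub that builds the regex when called.
my @cases = (
    [ 'plain # comment'      => sub { qr{(?x) \d+ # see $note
                                      } } ],
    [ 'escaped sigil'        => sub { qr{(?x) \d+ # see \$note
                                      } } ],
    [ '(?# ... ) comment'    => sub { qr{(?x) \d+ (?# see $note) } } ],
);

my %count;
for my $case (@cases) {
    my ($label, $build) = @$case;
    @warnings = ();
    my $re = $build->();            # interpolation, if any, happens here
    $count{$label} = @warnings;     # number of warnings this case produced
    printf "%-20s %d warning(s)\n", $label, $count{$label};
}
```

Only the first case should report a warning; the exact wording of the message depends on the perl version.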

Cheers,

JohnGG

Re: A variable name in a regex comment triggers a warning
by ysth (Canon) on Dec 24, 2006 at 22:39 UTC
    I actually would have expected it to interpolate as you describe :)

    If you have a newline in $deCommafied, there certainly will be an effect; anything after the newline isn't part of the comment:

    $ perl -wle'use re "debug"; $x = "foo\nbar"; qr/(?x)hullo #there $x/'
    Compiling REx `(?x)hullo #there foo
    bar'
    size 4 Got 36 bytes for offset annotations.
    first at 1
       1: EXACT <hullobar>(4)
       4: END(0)
    anchored `hullobar' at 0 (checking anchored isall) minlen 8
    Offsets: [4]
            5[20] 0[0] 0[0] 25[0]
    Freeing REx: `"(?x)hullo #there foo)\nbar"'

    FWIW, using qr{ ... }x; instead of qr{(?x) ... }; seems to do what you expect. But I'd try to avoid relying on that. Also watch out for having your end-quote character ('}' in your example) in the comment.
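
The same effect can be seen without the debug output by testing what each form matches. A minimal sketch, reusing the $x and "hullo" pattern from the one-liner above; the variable names $inline and $flagged are made up for this example.

```perl
use strict;
use warnings;

my $x = "foo\nbar";

# Inline (?x): $x is interpolated first, so the comment ends at the
# newline inside $x and "bar" becomes part of the pattern.
my $inline = qr/(?x)hullo #there $x/;

# Trailing x flag: the comment is left alone, so the pattern is just "hullo".
my $flagged = qr/hullo #there $x/x;

for my $text ("hullo", "hullobar") {
    printf "%-8s  inline (?x): %-3s  trailing x: %s\n",
        $text,
        ($text =~ $inline  ? "yes" : "no"),
        ($text =~ $flagged ? "yes" : "no");
}
```

The plain string "hullo" should match only the trailing-x form, because the inline form has effectively become /hullobar/.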

      Thank you for your reply. It raises a couple of questions in my mind. Firstly, why the difference between qr{(?x)...} and qr{...}x? I ran the following and, yes, it looks like there's no interpolation with qr{...}x:

      $ perl -wle'use re "debug"; $x = "foo\nbar"; qr/hullo #there $x/x'
      Compiling REx `hullo #there $x'
      size 4 Got 36 bytes for offset annotations.
      first at 1
         1: EXACT <hullo>(4)
         4: END(0)
      anchored `hullo' at 0 (checking anchored isall) minlen 5
      Offsets: [4]
              1[15] 0[0] 0[0] 16[0]
      Name "main::x" used only once: possible typo at -e line 1.
      Freeing REx: `"hullo #there $x"'

      (Although I can't say that I really understand most of the re 'debug' output :-)

      Secondly, why would you want interpolation inside comments anyway? I can see that interpolating a newline into the comment would stop the comment after the newline but to my mind doing the interpolation in the first place is not what you might reasonably expect.

      BTW, initialising $deCommafied before the qr{...} also silences the warning for reasons I now understand.
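
A small sketch of that last point, assuming a hypothetical $total in place of $deCommafied: once the variable holds a defined value, the interpolation into the comment still takes place but there is nothing left to warn about.

```perl
use strict;
use warnings;

# Collect warnings so they can be counted.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

my $total = '';    # stand-in for $deCommafied, initialised before the qr

# $total is still interpolated into the comment, but it is now an
# empty string rather than undef.
my $re = qr{(?x) (\d+) # append capture to $total
};

printf "%d warning(s)\n", scalar @warnings;
print "still matches digits\n" if "42" =~ $re;
```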

      Cheers,

      JohnGG

        Firstly, why the difference between qr{(?x)...} and qr{...}x?

        I guess that has to do with the stages of regex compilation. After the closing token for qr{ is found, the outer flags are checked. If there's an x modifier for the whole regex, comments will be stripped. Then variable interpolation is done. After that, the pattern is compiled, and it's only at this stage that the internal starting (?x) is seen. Alas, by then the variables have already been interpolated, which in your case generated a warning. Comments are then stripped up to the end of the pattern (or the next (?-x) modifier).

        Does that make sense? If what I describe is the case, it would answer your second question too.
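
One way to check the ordering described above is to print the compiled regex objects and see what ended up inside the comment. This is a sketch only; $v and its value 'VALUE' are hypothetical and exist purely for illustration.

```perl
use strict;
use warnings;

my $v = 'VALUE';    # hypothetical variable used only for illustration

# Inline (?x): interpolation runs before the regex compiler sees (?x).
my $inline = qr{(?x) abc # note: $v };

# Trailing x flag: the comment is known about before interpolation.
my $flagged = qr{ abc # note: $v }x;

# A qr// object stringifies to its pattern text.
print "inline (?x): $inline\n";
print "trailing x:  $flagged\n";
```

The first line printed should contain VALUE where the variable name used to be, while the second should still contain the literal text $v. The flag prefix that perl wraps around the pattern differs between versions.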

        -shmem

        _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                      /\_¯/(q    /
        ----------------------------  \__(m.====·.(_("always off the crowd"))."·
        ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: A variable name in a regex comment triggers a warning
by mrborisguy (Hermit) on Dec 24, 2006 at 22:39 UTC

    Inside the regex, '#' doesn't start a comment.

    print ("#7" =~ /#/);

        -Bryan

      Yes, it does, sometimes:
      $ perl -wle'print("%7" =~ /(?x)#/)'
      1
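
Both replies can be put side by side in one short script. A minimal sketch; the test string "a7" is chosen for this example because it contains no '#' at all.

```perl
use strict;
use warnings;

my $text = "a7";    # note: contains no '#'

# Without x, '#' is an ordinary character that has to be present.
my $plain = $text =~ /7#/ ? 1 : 0;

# With (?x), '#' starts a comment, so the pattern is just "7".
my $inline = $text =~ /(?x)7#/ ? 1 : 0;

# Under x, an escaped or bracketed '#' is a literal character again.
my $escaped = $text =~ /7\#/x  ? 1 : 0;
my $bracket = "a7#" =~ /7[#]/x ? 1 : 0;

print "plain=$plain inline=$inline escaped=$escaped bracket=$bracket\n";
```

This should print plain=0 inline=1 escaped=0 bracket=1.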