John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

After an earlier discussion on composing regular expressions, I've learned that piecing them together causes the whole thing to be recompiled anyway, so it's simpler to just build up a string than to make sure every part is a complete regex on its own.

So I'm looking at rewriting all the regex parts as plain strings. In my experiment, I changed a qr to a q and added a (?: ) around everything. I know the modifiers that will be used on the ultimate one that ties them all together, so I don't have to make them self-contained. I know how to stick it in the opening grouping syntax if I have to.
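A minimal sketch of that experiment, assuming two made-up fragments ($int and $frac are hypothetical names, not taken from the real code): each part is a plain q string wrapped in (?: ), and the only compilation happens in the final qr, which supplies the modifiers.

```perl
use strict;
use warnings;

# Hypothetical fragments, written as plain single-quoted strings.
# Each is wrapped in (?: ) so that a quantifier or alternation applied
# by the caller binds to the whole fragment.
my $int  = q{(?:[-+]?\d+)};
my $frac = q{(?:\.\d+)};

# The fragments are compiled exactly once, here, with the final modifiers.
my $number = qr/\A $int $frac? \z/x;

my @results = map { /$number/ ? "match" : "no match" } ('42', '-3.14', '3.', 'abc');
print join(", ", @results), "\n";
```

Without the (?: ) wrapper, the trailing ? would apply only to the last atom of $frac rather than to the whole fragment.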

Now clearly using the single-q form of a string won't take $var things within it. But other than that, what exactly do I need to watch out for? What's different between the parsed single-quote-string literal and the qr literal?

I've also verified that simply leaving out the "self-containment" modifiers (xs-im) doesn't make it generate any different code. I suppose these only affect compile-time interpretation of the contents, and don't make the generated code any worse than the bare grouping construct (?: ) would. Furthermore, superfluous non-capturing parens shouldn't hurt anyway, since the optimizer blows past them, right? (I'm sure there are cases where the grouping is not redundant: I can tell that the pattern matches the same thing, or at least solves the same problem, without it, but the two forms are technically different, so the optimizer can't help there.)
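For reference, the modifiers in question are the ones a qr object embeds when it is stringified. The sketch below (with an invented fragment) only shows what that string looks like and that the embedded flags travel with the fragment; it makes no claim about the code the optimizer generates. The exact spelling of the flag group depends on the Perl version.

```perl
use strict;
use warnings;

my $part = qr/ a | b /x;   # compiled fragment carrying its own /x
my $str  = "$part";        # stringifies to a (?...: ) group with the flags embedded

# Prints something like (?^x: a | b ) on Perl 5.14 and later,
# or (?x-ism: a | b ) on older versions.
print "$str\n";

# Interpolating the object into a larger pattern uses that string, so the
# fragment keeps its /x even though the outer pattern has no modifiers.
my $outer = qr/^$part$/;
my $one   = ("a"  =~ $outer) ? "matched" : "no match";
my $two   = ("ab" =~ $outer) ? "matched" : "no match";
print "a: $one, ab: $two\n";
```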

I will note, though, that writing component qr's helps with error checking and debugging when writing the thing, since it will show the syntax error in a small section. But I can change to q's once it works, so it doesn't compile everything twice.

Replies are listed 'Best First'.
Re: differences between parsing qr, qq, and q string contents
by JavaFan (Canon) on May 01, 2011 at 14:18 UTC
    Did you read the section "Gory details of parsing quoted constructs" in perlop?
Re: differences between parsing qr, qq, and q string contents
by wind (Priest) on May 01, 2011 at 17:06 UTC

    In my opinion, this is mostly a non-issue, and premature optimization is actually costing you here.

    If you have a group of regexes that you're putting together, it is better to use qr{} than q{} or qq{}. This is because you want to validate each subexpression before it goes into your final regex. Getting an error message early on is worth the cost of having to compile each subexpression once.

    # Validate these subexpressions before putting in main regex.
    my $pre_re  = qr{blah};
    my $body_re = qr{foobar};
    my $post_re = qr{bizbaz};

    while ($data =~ /$pre_re($body_re)$post_re/) {

    The one time it is helpful to know that you don't need qr is when you're programmatically building the regex, such as an alternation generated from a list.

    my @array = qw(foo bar biz);

    # Want to match any in the list
    my $list = join '|', map {quotemeta} @array;

    while ($data =~ /($list)/) {

      > If you have a group of regexes that you're putting together, it is better to use qr{} instead of using q{} or qq{}. This is because you want to validate each subexpression before it's put into your final regex.

      But that limits you to using parts that are full patterns themselves.

      > Getting an error message early on is worth the cost of having to compile the subexpressions once each.

      So, it's ok to pay a price each and every time code is compiled, just so *debugging* during development may become easier? It also means you're paying the price even if you do know your patterns and write them correctly the first time.

      It's not a trade-off I will always make in the same direction. In fact, I'd usually make it in the opposite direction from the one you suggest should always be taken.

        > So, it's ok to pay a price each and every time code is compiled, just so *debugging* during development may become easier? It also means you're paying the price even if you do know your patterns and write them correctly the first time.

        It's a very small price to pay.

        And it's not a price at all if you're using mod_perl, where compile time is not a consideration.

        And if it is a price, then just change your qr{}'s to qq{}'s and you're done.
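One caveat when swapping quote operators, sketched below with an invented pattern: q{} leaves regex escapes alone, while qq{} processes backslash escapes (and interpolates variables) before the regex engine ever sees the string. So a mechanical qr-to-qq change is only safe if the backslashes are doubled, or if q{} is used instead.

```perl
use strict;
use warnings;

my $as_q  = q{\bfoo\d+};      # single quotes: \b and \d reach the regex engine intact
my $as_qq = qq{\\bfoo\\d+};   # double quotes: each backslash has to be doubled
my $naive = do {
    no warnings 'misc';       # silence the "Unrecognized escape" warning for \d
    qq{\bfoo\d+};             # \b is now a backspace character, \d is a plain "d"
};

my @hits = map { "foo42" =~ /$_/ ? 1 : 0 } ($as_q, $as_qq, $naive);
print "same string: ", ($as_q eq $as_qq ? "yes" : "no"), "\n";
print "matches: @hits\n";
```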

      A list, optional parts, and referring to (?&name) parts that will be in the final expression are all problems with composing it as individual (working) qr's.

      I have all of those.
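The (?&name) case can be sketched as follows, assuming Perl 5.10 or later and an invented group name num: a fragment that refers to a group defined elsewhere is not a complete pattern, so it cannot be compiled as a standalone qr, but it works fine as a string that is only compiled as part of the final expression.

```perl
use strict;
use warnings;

# Hypothetical fragment that refers to a named group defined elsewhere.
my $pair = q{(?:(?&num),(?&num))};

# On its own the fragment is incomplete, so compiling it is expected to die
# (the named group does not exist yet).
my $standalone_ok = eval { my $re = qr/$pair/; 1 } ? 1 : 0;

# Combined with a (?(DEFINE)...) block in the final pattern, it compiles.
my $define = q{(?(DEFINE)(?<num>\d+))};
my $full   = qr/\A$pair\z$define/;

my $good = ("12,34" =~ $full) ? 1 : 0;
my $bad  = ("12,x"  =~ $full) ? 1 : 0;
print "standalone compiles: $standalone_ok, '12,34': $good, '12,x': $bad\n";
```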