in reply to Composing regex's dynamically

$r = qr/this-stuff/x;
$r = qr/this-stuff $more/x if $more;
where $more is either "| more-stuff", or "(*FAIL)". It doesn't matter whether $more is a plain string or a qr// construct.
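As a rough sketch of the conditional assembly, using a plain string for $more: the build_regex() helper and the foo/bar patterns are made up for illustration and stand in for "this-stuff" and "more-stuff".

```perl
use strict;
use warnings;

# build_regex() is a hypothetical helper, not something from the thread.
sub build_regex {
    my ($more) = @_;
    my $r = qr/ foo /x;
    # $more is interpolated into the pattern text before it is compiled.
    $r = qr/ foo $more /x if $more;
    return $r;
}

my $plain    = build_regex('');        # just "foo"
my $extended = build_regex('| bar');   # "foo | bar" under /x

for my $input (qw(foo bar baz)) {
    printf "%-3s plain=%d extended=%d\n",
        $input,
        ($input =~ $plain    ? 1 : 0),
        ($input =~ $extended ? 1 : 0);
}
```

An empty $more is false, so the second assignment is skipped and the base pattern is kept.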

Now, I often assemble regexes from subparts, and I strongly prefer the subparts to be strings rather than compiled regexes. Compiled regexes carry an extra set of parens and a setting of modifiers. I've made regexes long enough that the additional overhead of having every subpart be a qr// construct made the difference between "slow" and "just takes too long".
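To see the extra parens and modifiers, you can print the assembled patterns. This is only a sketch with made-up subparts; the exact wrapper text depends on the Perl version, so no output is shown.

```perl
use strict;
use warnings;

# Illustrative subparts (assumed names): the same character class
# once as a plain string and once as a compiled regex.
my $digit_str = '[0-9]';
my $digit_qr  = qr/[0-9]/;

# Assemble the same pattern from each kind of subpart.
my $from_strings = qr/$digit_str$digit_str-$digit_str/;
my $from_qr      = qr/$digit_qr$digit_qr-$digit_qr/;

# Stringifying a qr// object shows the pattern text it carries.
# Each interpolated qr// brings its own group and modifier wrapper.
print "from strings: $from_strings\n";
print "from qr:      $from_qr\n";
```

Both versions match the same input here; the one built from qr// parts is simply a longer pattern for the engine to compile.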

You need a few fewer backslashes when using qr// instead of qq//. There are also some edge cases where the heuristic parsing of quoted constructs decides differently, but they're obscure enough that I can't even remember them.
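A small illustration of the backslash difference, with a made-up decimal-number pattern: in a double-quoted string every backslash meant for the regex engine has to be doubled.

```perl
use strict;
use warnings;

# Same pattern written both ways (variable names are illustrative).
my $from_qr = qr/\d+\.\d+/;      # backslashes go straight to the regex
my $from_qq = "\\d+\\.\\d+";     # doubled so the string keeps one each

for my $input ('3.14', '3x14') {
    printf "%-5s qr=%d qq=%d\n",
        $input,
        ($input =~ /^$from_qr$/ ? 1 : 0),
        ($input =~ /^$from_qq$/ ? 1 : 0);
}
```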

Re^2: Composing regex's dynamically
by John M. Dlugosz (Monsignor) on Apr 28, 2011 at 20:57 UTC
    Interesting. Does the wrapping of a compiled regex with the remembered modifiers cost more than adding plain non-capturing parens that you would need anyway? Or in your situation did you not need parens at all?

    I wonder if composing actual pre-compiled regexes is any better than converting to a string and re-compiling?

      I've never researched that, but if I make building blocks to assemble regexes, I typically surround that with (?:), unless it's not going to matter. So, I'd write:
      my $bblock1 = "[a-f]";
      my $bblock2 = "foo";
      my $bblock3 = "(?:$bar|$baz)";
      And then there's:
      my $bblock4 = "[a-i]";
      my $bblock5 = "[A-I]";
      my $bblock6 = "[0-9]";
      my $bblock7 = "$bblock4|$bblock5|$bblock6";
      $bblock7 =~ s/\Q]|[\E//g; # Make one range
        Why are the dollar signs doubled?

        I would not expect the -xism part to cost anything over the (?: ) without that. Again, that makes me wonder if re-compiling the string is worse than incorporating a compiled regex in another qr. Naturally, being able to leave off the parens completely simplifies things.

        The regex engine has improved its optimizations in successive versions of Perl too; I think I read something about that in either the 5.10 or 5.12 notes.