in reply to Re^6: Reusing a complex regexp in multiple spots, escaping the regexp
in thread Reusing a complex regexp in multiple spots, escaping the regexp

I'm still not convinced that nesting qr// snippets into larger qr// parts is such a show stopper.

Are they stored as nested objects causing overhead? (Hmm probably... I read something in the docs about them acting like closures.)

But what if I stringify (flatten) the result and recompile it, possibly even with the /o ("once") modifier? That should reset most of those effects.

Unless switching between differently scoped nested modifiers such as /i causes new overhead...(?)

DB<1> p qr/[0-9a-f]/i
(?^ui:[0-9a-f])
DB<2> $x=qr/[0-9a-f]/i
DB<3> $hex_range= qr/$x+ - $x+/x
DB<4> say $hex_range
(?^ux:(?^ui:[0-9a-f])+ - (?^ui:[0-9a-f])+)
DB<5>
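The same experiment as a standalone script, as a rough sketch: it flattens the nested pattern to a string, recompiles that string, and checks that both versions agree on a few inputs. The variable names `$flat` and `$recompiled` and the test inputs are made up for illustration. The exact flags shown in the stringified form (for example whether a `u` appears) depend on the Perl version and the features in effect, so the script prints the string instead of assuming it.

```perl
use strict;
use warnings;

# Building block that carries its own /i flag.
my $x = qr/[0-9a-f]/i;

# Larger pattern built from it; under /x the spaces around "-" are ignored.
my $hex_range = qr/$x+ - $x+/x;

# Stringifying yields the flattened source, inner flags preserved inline.
my $flat = "$hex_range";
print "flattened: $flat\n";

# Recompile from the flat string (hypothetical name, for illustration only).
my $recompiled = qr/$flat/;

# Hypothetical inputs: both patterns should agree on every one of them.
for my $input ('DEAD-beef', '12 - 34', 'xyz-123') {
    my $nested_ok = $input =~ $hex_range  ? 1 : 0;
    my $flat_ok   = $input =~ $recompiled ? 1 : 0;
    print "$input: nested=$nested_ok recompiled=$flat_ok\n";
}
```

This only shows that the two patterns behave the same; it says nothing about speed.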

I'll put this on my to-do list and will try to dive deeper next weekend, e.g. by running benchmarks and looking into the op-tree.

Cheers Rolf
(addicted to the Perl Programming Language :)

Re^8: Reusing a complex regexp in multiple spots, escaping the regexp
by tobyink (Canon) on Apr 16, 2026 at 13:00 UTC

    Pretty sure they're not kept as nested objects. I believe when the larger regexp is being compiled, any nested regexps are simply stringified and interpolated into it.

    In the original post that could be a problem as the regexp would need to be compiled each time around the while loop. As long as you're compiling it outside the loop, I don't see it being much of an issue.
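A minimal sketch of "compile it outside the loop", assuming the input is a plain list of strings (the `@lines` data and the capture groups are invented for the example, not taken from the original post):

```perl
use strict;
use warnings;

my $hex = qr/[0-9a-f]/i;

# Compiled once, before the loop starts.
my $hex_range = qr/($hex+) - ($hex+)/x;

my @lines = ('00-ff', 'nothing here', 'A0-AF');    # hypothetical input
my @ranges;

for my $line (@lines) {
    # The qr// object is used as the whole pattern, so nothing is
    # interpolated into a new pattern inside the loop.
    if ( my ($from, $to) = $line =~ $hex_range ) {
        push @ranges, "$from..$to";
    }
}

print join(', ', @ranges), "\n";
```

The point is only that the nested pieces are combined a single time; the loop body just reuses the finished object.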

      > Pretty sure they're not kept as nested objects.

      That was my first intuition too, and it looks like we were right: the final program is identical when the flattened string is compiled directly.

      $ perl -Mre=debug -E'$x=qr/[X0-9a-f]/i; $hr=qr/$x+ - $x+/x;'
      Compiling REx "[X0-9a-f]"
      Final program:
         1: ANYOF[0-9A-FXa-fx] (11)
        11: END (0)
      stclass ANYOF[0-9A-FXa-fx] minlen 1
      Compiling REx "(?^ui:[X0-9a-f])+ - (?^ui:[X0-9a-f])+"
      Final program:
         1: PLUS (12)
         2:   ANYOF[0-9A-FXa-fx] (0)
        12: EXACT <-> (14)
        14: PLUS (25)
        15:   ANYOF[0-9A-FXa-fx] (0)
        25: END (0)
      floating "-" at 1..9223372036854775807 (checking floating) stclass ANYOF[0-9A-FXa-fx] plus minlen 3
      Freeing REx: "(?^ui:[X0-9a-f])+ - (?^ui:[X0-9a-f])+"
      Freeing REx: "[X0-9a-f]"
      $

      $ perl -Mre=debug -E'$hr=qr/(?^ui:[X0-9a-f])+ - (?^ui:[X0-9a-f])+/x;'
      Compiling REx "(?^ui:[X0-9a-f])+ - (?^ui:[X0-9a-f])+"
      Final program:
         1: PLUS (12)
         2:   ANYOF[0-9A-FXa-fx] (0)
        12: EXACT <-> (14)
        14: PLUS (25)
        15:   ANYOF[0-9A-FXa-fx] (0)
        25: END (0)
      floating "-" at 1..9223372036854775807 (checking floating) stclass ANYOF[0-9A-FXa-fx] plus minlen 3
      Freeing REx: "(?^ui:[X0-9a-f])+ - (?^ui:[X0-9a-f])+"
      $

      > the regexp would need to be compiled each time around the while loop

      I think that's what /o was invented for

      Update:

      So any performance penalty that comes with qr// could be mitigated by flattening it first and using the resulting string with the /o modifier. 🤔
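Roughly what that would look like, as an untested-for-speed sketch (the `$flat` and `$count` names and the input lines are invented here). Note that perlop says Perl already skips recompilation when the interpolated variables have not changed, so whether /o buys anything measurable on top of that is exactly what a benchmark would have to show.

```perl
use strict;
use warnings;

my $x         = qr/[0-9a-f]/i;
my $hex_range = qr/$x+ - $x+/x;

# Flatten: from here on $flat is an ordinary string, not a regexp object.
my $flat = "$hex_range";

my $count = 0;
for my $line ('1a-2b', 'no range', 'FF-00') {    # hypothetical input
    # /o: $flat is interpolated and compiled only the first time this
    # particular match operator is executed.
    $count++ if $line =~ /$flat/o;
}

print "matched $count of 3 lines\n";
```

This is only safe as long as `$flat` really never changes for the lifetime of the program.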

      Cheers Rolf

        > I think that's what /o was invented for

        If a regexp contains something interpolated, I don't know if /o can do much for it. Consider:

        my $foo = qr/foo/;
        while ( get_line($fh) =~ m/($foo)(bar)/ ) {
            ...;
            $foo = qr/FOO/;    # $foo gets changed
        }

        Even if $foo doesn't change, the fact that it has the potential to change makes any attempt at avoiding recompiling the larger regexp broken.

        perldoc perlre does note that /o is pretty broken.
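A small self-contained sketch of the pitfall, with made-up input strings standing in for `get_line($fh)`: the same match is run with and without /o while `$foo` changes after the first iteration.

```perl
use strict;
use warnings;

my ( @with_o, @without_o );
my $foo = qr/foo/;

for my $line ( 'foobar', 'FOObar', 'FOObar' ) {    # hypothetical input
    # With /o the pattern is built from the first value of $foo only.
    push @with_o, $line =~ m/($foo)(bar)/o ? 1 : 0;

    # Without /o the current value of $foo is used on every pass.
    push @without_o, $line =~ m/($foo)(bar)/ ? 1 : 0;

    $foo = qr/FOO/;    # $foo gets changed
}

print "with /o:    @with_o\n";
print "without /o: @without_o\n";
```

The /o version keeps matching against the original lowercase pattern and so misses the later lines, while the plain version follows the change to `$foo`.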