in reply to Re^5: Reusing a complex regexp in multiple spots, escaping the regexp
in thread Reusing a complex regexp in multiple spots, escaping the regexp

Note that the (?(DEFINE)...) construct has its own performance issue. From the documentation for (DEFINE):
"patterns defined this way probably will not be as efficient, as the optimizer is not very clever about handling them."
Also, all DEFINE'd patterns are capturing, even if you don't need the captures, so that is another minor hit against performance. This can also be confusing if you refer to capture groups by absolute number instead of by name.
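
To make the numbering point concrete, here is a minimal sketch. The pattern, the group name hex, and the sample string are made up for illustration and are not taken from the original post.

```perl
use strict;
use warnings;

# Hypothetical pattern: one DEFINE'd helper plus one capture we care about.
my $re = qr/
    (?(DEFINE) (?<hex> [0-9a-f]+ ) )   # helper, but it still occupies group 1
    ( (?&hex) ) - (?&hex)              # so the capture we want is group 2
/x;

my ($first, $second);
if ('dead-beef' =~ $re) {
    ($first, $second) = ($1, $2);
}

printf "group 1: %s\n", defined $first  ? $first  : '(undef)';
printf "group 2: %s\n", defined $second ? $second : '(undef)';
```

Group 1 should come back undefined here, since the DEFINE block itself is never executed and captures set inside a (?&hex) call are not visible after the call returns. The text we wanted ends up in group 2, which is the off-by-one that bites when counting groups by hand.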

Re^7: Reusing a complex regexp in multiple spots, escaping the regexp
by LanX (Saint) on Apr 16, 2026 at 12:27 UTC
    I'm still not convinced that nesting qr// snippets into larger qr// parts is such a show stopper.

    Are they stored as nested objects causing overhead? (Hmm probably... I read something in the docs about them acting like closures.)

    But what if I stringify (flatten) the result and recompile it, perhaps even with the /o ("compile once") modifier? That should reset most of those effects.

    Unless switching context between differently nested modifiers such as /i causes new overhead...(?)

    DB<1> p qr/[0-9a-f]/i
    (?^ui:[0-9a-f])
    DB<2> $x=qr/[0-9a-f]/i
    DB<3> $hex_range= qr/$x+ - $x+/x
    DB<4> say $hex_range
    (?^ux:(?^ui:[0-9a-f])+ - (?^ui:[0-9a-f])+)
    DB<5>

    I'll put this on my to-do list, and will try to dive deeper next weekend.

    Like running benchmarks and looking into the op-tree.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    see Wikisyntax for the Monastery

      Pretty sure they're not kept as nested objects. I believe when the larger regexp is being compiled, any nested regexps are simply stringified and interpolated into it.

      In the original post that could be a problem as the regexp would need to be compiled each time around the while loop. As long as you're compiling it outside the loop, I don't see it being much of an issue.
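
      A minimal sketch of "compile it outside the loop", reusing the hex-range idea from upthread. The variable names and the sample lines are assumptions for illustration only.

      ```perl
      use strict;
      use warnings;

      my $hex       = qr/[0-9a-f]/i;
      my $hex_range = qr/$hex+ \s* - \s* $hex+/x;   # built once, before the loop

      my @lines = ('00-ff', 'nope', 'A0 - b7');     # made-up sample input
      my @hits;
      for my $line (@lines) {
          # Matching against the qr// object directly, with nothing else
          # interpolated around it, should not trigger a fresh compile.
          push @hits, $line if $line =~ $hex_range;
      }

      print "$_\n" for @hits;
      ```

      The nested $hex is interpolated a single time, when $hex_range is built, so whatever that interpolation costs is paid once rather than on every pass through the loop.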

        > Pretty sure they're not kept as nested objects.

        That was my first intuition, and it looks like we were right: the final program is the same for the flattened string.

        $ perl -Mre=debug -E'$x=qr/[X0-9a-f]/i; $hr=qr/$x+ - $x+/x;'
        Compiling REx "[X0-9a-f]"
        Final program:
           1: ANYOF[0-9A-FXa-fx] (11)
          11: END (0)
        stclass ANYOF[0-9A-FXa-fx] minlen 1
        Compiling REx "(?^ui:[X0-9a-f])+ - (?^ui:[X0-9a-f])+"
        Final program:
           1: PLUS (12)
           2:   ANYOF[0-9A-FXa-fx] (0)
          12: EXACT <-> (14)
          14: PLUS (25)
          15:   ANYOF[0-9A-FXa-fx] (0)
          25: END (0)
        floating "-" at 1..9223372036854775807 (checking floating) stclass ANYOF[0-9A-FXa-fx] plus minlen 3
        Freeing REx: "(?^ui:[X0-9a-f])+ - (?^ui:[X0-9a-f])+"
        Freeing REx: "[X0-9a-f]"
        $

        $ perl -Mre=debug -E'$hr=qr/(?^ui:[X0-9a-f])+ - (?^ui:[X0-9a-f])+/x;'
        Compiling REx "(?^ui:[X0-9a-f])+ - (?^ui:[X0-9a-f])+"
        Final program:
           1: PLUS (12)
           2:   ANYOF[0-9A-FXa-fx] (0)
          12: EXACT <-> (14)
          14: PLUS (25)
          15:   ANYOF[0-9A-FXa-fx] (0)
          25: END (0)
        floating "-" at 1..9223372036854775807 (checking floating) stclass ANYOF[0-9A-FXa-fx] plus minlen 3
        Freeing REx: "(?^ui:[X0-9a-f])+ - (?^ui:[X0-9a-f])+"
        $

        > the regexp would need to be compiled each time around the while loop

        I think that's what /o was invented for

        Update

        So, consequently, any performance penalty that comes with nested qr// could be mitigated by flattening it first and using the resulting string with the /o modifier. 🤔
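
        A rough sketch of that flatten-and-recompile idea. It uses a second qr// rather than /o to hold the recompiled pattern, which is a substitution on my part; the variable names and the test string are likewise assumptions.

        ```perl
        use strict;
        use warnings;

        my $x         = qr/[0-9a-f]/i;
        my $hex_range = qr/$x+ - $x+/x;

        my $flat    = "$hex_range";   # stringify: from here on it is plain text
        my $rebuilt = qr/$flat/;      # compile that text again from scratch

        print "flattened: $flat\n";
        print "rebuilt:   $rebuilt\n";

        my $both_match = ('a0-FF' =~ $hex_range && 'a0-FF' =~ $rebuilt) ? 1 : 0;
        print "both match: $both_match\n";
        ```

        The embedded (?^...) groups carry the /i and /x flags along inside the string, so the rebuilt pattern should behave like the nested one. Whether it is measurably faster is exactly what a benchmark would still have to show.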

        Cheers Rolf