in reply to Reusing a complex regexp in multiple spots, escaping the regexp

Use qr// to precompile a regex into a form that can be used in place of a regex on the right-hand side of =~ and can also be interpolated into other regexen. Use named capture groups and %+ instead of $1..$9. Use (DEFINE) to reduce repetition within a single regex.
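
To make those three suggestions concrete, here is a minimal sketch. The input string and the names $hex4, $call, $call_define, seg and off are made up for illustration and are not taken from the original question.

```perl
use strict;
use warnings;

my $text = 'far call to @F000:FFF0 on reset';    # made-up sample input

# 1. qr// precompiles a pattern that can be interpolated into a larger one.
my $hex4 = qr/[0-9A-Fa-f]{1,4}[Hh]?/;             # hypothetical building block
my $call = qr/\@(?<seg>$hex4):(?<off>$hex4)/;

# 2. Named captures are read from %+ rather than $1, $2, ...
if ($text =~ $call) {
    print "qr//:   segment=$+{seg} offset=$+{off}\n";
}

# 3. (DEFINE) declares a named subpattern once; (?&HEX4) invokes it.
my $call_define = qr{
    \@ (?<seg> (?&HEX4) ) : (?<off> (?&HEX4) )
    (?(DEFINE)
        (?<HEX4> [0-9A-Fa-f]{1,4} [Hh]? )
    )
}x;

if ($text =~ $call_define) {
    print "DEFINE: segment=$+{seg} offset=$+{off}\n";
}
```

Note that whatever a group captures while it is being invoked via (?&NAME) is not visible after the call returns, which is why the calls are wrapped in their own named groups (seg, off) here.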

You didn't provide any sample input to really write a test program against, but here's a quick-and-dirty attempt to rewrite your regex that at least compiles. Quite a few more simplifications are probably possible; this is just to get you started.

my $regex = qr{
      (?<intplus>    (?&INT) (?&SOMETHING)+ (?&QUOT)? )
    | (?<intonly>    (?&INT) (?&QUOT)? )
    | (?<regonly>    \b(?:E?A[XHL])=[0-9A-Fa-f]{2,}[Hh]? (?&SOMETHING)* (?&QUOT)? )
    | (?<table>      \#[0-9A-Z][0-9]{4}\b)
    | (?<mem_16_16>  \bMEM\s?(?&HEX4):(?&HEX4) (?&QUOT)? )
    | (?<mem_32>     \bMEM\s?[0-9A-Fa-fXx]{1,8}[Hh]? (?&QUOT)? )
    | (?<call>       \@(?&HEX4):(?&HEX4) (?&QUOT)? )
    | (?<portrange>  \bPORT\s?(?&HEX4)-(?&HEX4) (?&QUOT)? )
    | (?<portsingle> \bPORT\s?(?&HEX4) (?&QUOT)? )
    (?(DEFINE)
        (?<INT>       \bINT\s?[0-9A-Fa-f]{2}[Hh]? )
        (?<HEX4>      [0-9A-Fa-fXx]{1,4}[Hh]? )
        (?<QUOT>      (?:\"[^"]+\") )
        (?<SOMETHING> \/(?:E?[ABCD][XHL]|E?[SD]I|E?[SB]P|[DESC]S)=[0-9A-Fa-f]{2,}[Hh]? )
    )
}x;
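
Since no sample input was given, here is a rough sketch of how such a regex might be used to find out which alternative matched. It uses a reduced pattern with only three of the alternatives, and the input line is invented purely for illustration.

```perl
use strict;
use warnings;

# Reduced sketch: three of the alternatives, same (DEFINE) idea.
my $regex = qr{
      (?<table>      \#[0-9A-Z][0-9]{4}\b )
    | (?<portrange>  \bPORT\s?(?&HEX4)-(?&HEX4) )
    | (?<portsingle> \bPORT\s?(?&HEX4) )
    (?(DEFINE)
        (?<HEX4> [0-9A-Fa-fXx]{1,4}[Hh]? )
    )
}x;

my $line = 'see #01234 and PORT 03F8-03FF, also PORT 0060';   # made-up input

my @found;
while ($line =~ /$regex/g) {
    # Exactly one of the top-level named groups is defined per match.
    my ($kind) = grep { defined $+{$_} } qw(table portrange portsingle);
    push @found, "$kind: " . $+{$kind};
}
print "$_\n" for @found;
```

Checking the top-level group names explicitly, rather than relying on the order of keys %+, keeps the dispatch predictable if more groups are added later.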

Re^2: Reusing a complex regexp in multiple spots, escaping the regexp
by LanX (Saint) on Apr 12, 2026 at 22:40 UTC
    On a tangent, what is the benefit of (?(DEFINE) ...) constructs for repeated patterns here?

    Intuitively, I would have opted for interpolating nested $variables holding qr// snippets, especially since I can make them more readable with /x and can unit test them individually.
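
    For comparison, the interpolation approach described above might look roughly like the following sketch. The variable names, the group names from/to/note and the sample strings are all hypothetical.

    ```perl
    use strict;
    use warnings;

    # Each snippet is its own qr// with /x, so it can be tested in isolation.
    my $hex4 = qr/ [0-9A-Fa-fXx]{1,4} [Hh]? /x;
    my $quot = qr/ "[^"]+" /x;

    # Larger pattern assembled by interpolating the snippets.
    my $port = qr/
        \b PORT \s?
        (?<from> $hex4 )
        (?: - (?<to> $hex4 ) )?
        (?<note> $quot )?
    /x;

    # "Unit tests" for a single building block.
    print "hex4 accepts 03F8h\n" if '03F8h' =~ /\A$hex4\z/;
    print "hex4 rejects PORT\n"  if 'PORT'  !~ /\A$hex4\z/;

    # The assembled pattern on a made-up input.
    if ('PORT 03F8-03FF"COM1"' =~ $port) {
        print "from=$+{from} to=$+{to} note=$+{note}\n";
    }
    ```

    One difference worth keeping in mind: any capture groups inside an interpolated snippet become part of the enclosing regex, whereas groups matched inside a (?&NAME) call are not visible after the call returns.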

    But I'm curious to learn why you chose this way.

    Is it about the handling of capture groups?

    I tried to read the relevant docs, but they constantly mention recursion and I can't spot any here...

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    see Wikisyntax for the Monastery