I used your suggestions as the basis for my change. The occasional H was missing, and I added some /i flags to the numeric patterns. I also renamed things to stay closer to what I had/want.

I was surprised that I had to escape the slashes in a comment inside the big regexp; otherwise they were taken as the end of the regexp.

    my $pathex2       = qr/[0-9A-F]{2}/i;        # exactly 2 hex digits
    my $pathex2p      = qr/[0-9A-F]{2,}/i;       # 2 or more hex digits
    my $pathexX1t4    = qr/[0-9A-FX]{1,4}/i;     # 1-4 hex/X digits
    my $pathexX1t8    = qr/[0-9A-FX]{1,8}/i;     # 1-8 hex/X digits
    my $patopt_h      = qr/h?/i;                 # optional trailing H/h
    my $patoptname    = qr/(?:\"[^\"]+\")?/;     # optional quoted literal (name)

    my $patint        = qr/\bINT\s?$pathex2$patopt_h/;                          # INT <byte>
    my $patmem_16_16  = qr/\bMEM\s?$pathexX1t4$patopt_h:$pathexX1t4$patopt_h/;  # MEM <addr>:<addr>
    my $patmem_32     = qr/\bMEM\s?$pathexX1t8$patopt_h/;                       # MEM <addr>
    my $patportrange  = qr/
        \bPORT\s?$pathexX1t4$patopt_h\-$pathexX1t4$patopt_h/x;                  # PORT <range>
    my $patportsingle = qr/\bPORT\s?$pathexX1t4$patopt_h/;                      # PORT <single>
    my $pattable      = qr/\#[0-9A-Z][0-9]{4}\b/;                               # #<letter><4digits>
    my $pataccureg    = qr/\bE?A[XHL]=$pathex2p$patopt_h/;                      # EAX/AH/AL assignment
    my $patregequ     = qr/
        (?:E?[ABCD][XHL]|E?[SD]I|E?[SB]P|[DESC]S)=$pathex2p$patopt_h
    /x;                                          # a single register entry
    my $patreglists   = qr/
        (?:\/$patregequ)+
    /x;                                          # one or more "/<reg>" entries

    our $hyperlinkpattern = qr/
        # 1. INT with register list and optional name
        (?<intplus>
            $patint
            $patreglists
            $patoptname
        )
      |
        # 2. INT with only optional name
        (?<intonly>
            $patint
            $patoptname
        )
      |
        # 3. EAX\/AH\/AL equation with optional register list and optional name
        (?<regonly>
            $pataccureg
            (?:\/$patregequ)*
            $patoptname
        )
      |
        # 4. Table reference
        (?<table>
            $pattable
        )
      |
        # 5. MEM far address (addr:addr) with optional name
        (?<mem_16_16>
            $patmem_16_16
            $patoptname
        )
      |
        # 6. MEM single address with optional name
        (?<mem_32>
            $patmem_32
            $patoptname
        )
      |
        # 7. @ reference (addr:addr) with optional name
        (?<call>
            \@$pathexX1t4$patopt_h:$pathexX1t4$patopt_h
            $patoptname
        )
      |
        # 8. PORT range with optional name
        (?<portrange>
            $patportrange
            $patoptname
        )
      |
        # 9. PORT single value with optional name
        (?<portsingle>
            $patportsingle
            $patoptname
        )
    /x;
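
Since every alternative is a named group, the %+ hash shows which branch matched. Here is a cut-down sketch of how that could be used, assuming only the INT and table branches; the sample text, the variable $pattern and the loop are made up for illustration and are not part of my real script.

```perl
use strict;
use warnings;

# Cut-down helpers: only two of the nine alternatives, for illustration.
my $pathex2    = qr/[0-9A-F]{2}/i;
my $patopt_h   = qr/h?/i;
my $patoptname = qr/(?:"[^"]+")?/;
my $patint     = qr/\bINT\s?$pathex2$patopt_h/;
my $pattable   = qr/\#[0-9A-Z][0-9]{4}\b/;

my $pattern = qr{
    (?<intonly> $patint $patoptname )
  |
    (?<table>   $pattable )
}x;

# Hypothetical sample text.
my $text = 'See INT 21h"DOS" and #01234 for details';

my @found;
while ( $text =~ /$pattern/g ) {
    # %+ only has keys for the named groups that captured in this match,
    # so with one named group per alternative there is exactly one key.
    my ($kind) = keys %+;
    push @found, $kind . '=' . $+{$kind};
}
print "$_\n" for @found;
# prints:
#   intonly=INT 21h"DOS"
#   table=#01234
```

The group name can then serve as a dispatch key for building the matching kind of hyperlink.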

Re^4: Reusing a complex regexp in multiple spots, escaping the regexp
by LanX (Saint) on Apr 13, 2026 at 22:29 UTC
    > I added some /i flags to the numeric patterns.

    Please note that you can also use qr//x to make the sub-terms more readable.

    > I was surprised that I had to escape the slashes

    Well, with quote-like operators you can choose the delimiter freely, like qr~~ or qr{}.
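
    The background, as far as I understand it: Perl looks for the closing delimiter of a quote-like construct before it parses the contents, so at that point it does not yet know that /x turns the rest of a line after # into a comment. With another delimiter the slashes need no escaping anywhere. A small sketch; the pattern and the sample string are made up for illustration and are much simpler than the real thing.

    ```perl
    use strict;
    use warnings;

    # With braces as delimiters, "/" is an ordinary character,
    # both in the pattern itself and in /x comments.
    my $regs = qr{
        # EAX/AH/AL assignment, unescaped slashes in this comment
        \b E? A[XHL] = [0-9A-F]{2,} h?
        (?: / [A-D][XHL] = [0-9A-F]{2,} h? )*   # optional "/<reg>" entries
    }xi;

    # Hypothetical sample string.
    my $sample = 'AX=4C00h/BX=1234h';
    my ($hit) = $sample =~ /($regs)/;
    print "matched: $hit\n";   # matched: AX=4C00h/BX=1234h
    ```

    The trade-off is that the new delimiter becomes the character to watch: balanced braces like {2,} are fine inside qr{}, but an unbalanced brace or a literal ~ inside qr~~ would need a backslash.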

    > I used your suggestions

    Well ... mainly AI suggestions ;-)

    Some were good, others not to my taste.

    TIMTOWTDI ...

    Here is my take:

        {
            my $HEX   = qr/ [0-9A-F]  /xi;
            my $HEX_X = qr/ [0-9A-FX] /xi;
            my $H_opt = qr/ [h]?      /xi;

            my $INT   = qr/ \b INT \s? $HEX{2} [hH]? /x;   # is longer $H_opt better ???

            our $hyperlinkpattern = qr~ ... $INT ... ~x;
        }

    Please note how the helper variables are restricted to the enclosing block, and how $HEX{2} is NOT interpreted as a hash lookup (that surprised me!).
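
    A runnable sketch of that scoping. Two things differ from the snippet above and are my substitutions for the sake of a self-contained example: the quantifier is written as (?:$HEX){2}, which leaves Perl nothing to guess about hashes, and the our declaration sits outside the block so the test loop can see the name under strict. The test strings are invented.

    ```perl
    use strict;
    use warnings;

    our $hyperlinkpattern;
    {
        # Lexical helpers: visible only inside this block.
        my $HEX = qr/ [0-9A-F] /xi;

        # The explicit group makes it unambiguous that {2} is a quantifier.
        my $INT = qr/ \b INT \s? (?:$HEX){2} [hH]? /x;

        $hyperlinkpattern = qr~ (?<int> $INT ) ~x;
    }

    # Each qr// keeps its own flags when interpolated: $HEX is
    # case-insensitive, but the literal INT in $INT is not.
    my @results = map { $_ =~ $hyperlinkpattern ? 1 : 0 }
        'INT 21h', 'INT 2fH', 'int 21h';
    print "@results\n";   # 1 1 0
    ```

    With our inside the block, as in my take above, the short name is only usable inside that block under strict; code elsewhere needs its own our declaration or the fully qualified name.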

    I hope my suggestions help you get more maintainable code :-)

    Many more improvements come to mind, but I'm prone to over-engineering, and you're in a better position to decide what works best for you =)

    Cheers Rolf
    (addicted to the Perl Programming Language :)