in reply to Alternations and anchors
> Is there some weirdness with alternations and anchors?
That's due to a feature called "trie optimization"¹, which only works on alternations of literal strings and can't optimize metacharacters such as anchors.
> (I could rewrite ^aaa|^bbb|^ccc as ^(aaa|bbb|ccc) , but it might be that only some of the inputs are anchored.)
So build two alternation lists, one with a preceding anchor and one without, and join them with |. Both will be trie-optimized, and this should be at max speed again.
Like:
((?:^(?:aaa|bbb|ccc))|(?:ddd|eee|ccc))
*untested!!!*
(?:...) is for grouping without capturing.
(Not sure if the one around the anchored part is necessary... I don't think so, just playing safe)
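If the lists aren't known up front, the same pattern can be assembled at run time. This is only a minimal sketch: build_regex and the two word lists are hypothetical names made up for illustration, and it assumes both lists are non-empty (an empty list would yield an empty group that matches everywhere).

```perl
use strict;
use warnings;

# Hypothetical word lists: the first may only match at the start of the
# string, the second may match anywhere.
my @anchored   = qw(aaa bbb ccc);
my @unanchored = qw(ddd eee ccc);

# Build (?:^(?:aaa|bbb|ccc)|(?:ddd|eee|ccc)) from two array refs.
# quotemeta keeps every alternative a plain literal string.
sub build_regex {
    my ($anchored, $unanchored) = @_;
    my $head = join '|', map { quotemeta } @$anchored;
    my $tail = join '|', map { quotemeta } @$unanchored;
    return qr/(?:^(?:$head)|(?:$tail))/;
}

my $re = build_regex(\@anchored, \@unanchored);

# "aaa" is only accepted at the start; "ddd" and "ccc" anywhere.
for my $input ('aaa x', 'x aaa', 'x ddd', 'x ccc') {
    printf "%-6s %s\n", $input, $input =~ $re ? 'match' : 'no match';
}
```

To check whether the two lists really end up as tries on your perl, compile the pattern under use re 'debug' and look for TRIE nodes in the dump.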
Cheers Rolf
(addicted to the Perl Programming Language :)
see Wikisyntax for the Monastery
¹) compare Re^2: Looking for a cleaner regex ( trie since 5.10 ! )
Tested in the Perl debugger:
perl -de0
...
DB<11> @matches = ( "aaa x ddd y ccc" =~ /(^ (?:aaa|bbb|ccc) | (?:ddd|eee|ccc) )/xg );
DB<12> print "@matches"
aaa ddd ccc
DB<13>