in reply to Re^9: Parser Performance Question (Atomic grouping)
in thread Parser Performance Question

I had completely forgotten about the atomic match :). I reread the documentation (because I wasn't sure exactly what it does), and it shows that atomic grouping is equivalent to the possessive quantifiers. So in the spirit of TIMTOWTDI:

use strict;
use warnings;
use feature 'say';
use Data::Dump qw( pp );

my @strs = qw( "..\\".. "abc" "a\"bc" "a\\\\bc" "a\" );
my %re = (
    LanX => qr/ " (?> \\\\ | \\" | [^"] )* " /x,
    Eily => qr/ " (?: [^"\\] | \\. )* " /x,
    Poss => qr/ " (?: \\\\ | \\" | [^"] )*+ " /x,
);

for my $str (@strs) {
    say "\nTesting: <$str> = ", pp ($str);
    $str =~ /$re{$_}/ and say "$_ found $&" or say "$_ found nothing" for keys %re;
}
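To see what the atomic group and the possessive quantifier actually buy, here is a minimal sketch that adds a plain backtracking variant (my addition, not one of the regexes proposed in the thread) and runs all three against the unterminated string `"a\"`. The hash keys and the `%result` bookkeeping are illustrative names only. Note also that, per perlre, the strict equivalent of `X*+` is `(?>X*)`; `(?>X)*` behaves the same for this input but is not the identical construct.

```perl
use strict;
use warnings;
use feature 'say';

# Unterminated string: quote, a, backslash, quote.
# The backslash escapes the final quote, so there is no closing quote.
my $str = '"a\"';

my %re = (
    # Plain group: may retry the backslash as [^"] after \\" was tried.
    backtracking => qr/ " (?: \\\\ | \\" | [^"] )*  " /x,
    # Atomic group: once an alternative matched, it is never revisited.
    atomic       => qr/ " (?> \\\\ | \\" | [^"] )*  " /x,
    # Possessive quantifier: the loop never gives anything back.
    possessive   => qr/ " (?: \\\\ | \\" | [^"] )*+ " /x,
);

my %result;
for my $name (sort keys %re) {
    $result{$name} = $str =~ $re{$name} ? 1 : 0;
    say "$name: ", $result{$name} ? "matched" : "no match";
}
```

The plain group wrongly accepts the input, because after failing to find a closing quote it goes back and consumes the backslash as an ordinary character. The atomic and possessive forms refuse to do that, so they report no match.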

Re^11: Parser Performance Question (Atomic grouping)
by LanX (Saint) on Oct 06, 2017 at 13:40 UTC
    Great discussion, we (re)learned a lot! :)

    edit

    I saw atomic grouping discussed in Friedl's book, but this usage example is very instructive.

    And inhibiting backtracking has a great performance benefit; I think I have to revisit some older projects of mine.

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Je suis Charlie!

      Ah yes, I have a great potential for relearning :P

Re^11: Parser Performance Question (Atomic grouping)
by songmaster (Beadle) on Oct 20, 2017 at 01:37 UTC

    I appreciate reading the conversation you guys had, sorry I wasn't able to take part. I'm now using a slightly modified version of Eily's regex (proven using the above framework and in my own tests):

    our $RXdqs = qr/ " (?> \\. | [^"\\] )* " /x;

    Note that all of my $RX... regex variables are used inside other regexes, surrounded on both sides by \s* and various specific characters like parentheses and commas. This is a parser for a formally defined syntax that proceeds through the input text serially; I'm not trying to find $needle inside some giant $haystack. I now have individual tests for these variables; previously I was only testing the parser at a higher level.

    - Andrew
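
For readers wondering what "used inside other regexes" and "proceeds through the input text serially" might look like, here is a rough sketch. Only `$RXdqs` comes from the post above. The `name("...", "...")` grammar, the sample input, and the use of `\G` with `/gc` to step through the text are assumptions made for illustration, not Andrew's actual parser.

```perl
use strict;
use warnings;
use feature 'say';

# Double-quoted string pattern from the post above.
our $RXdqs = qr/ " (?> \\. | [^"\\] )* " /x;

# Hypothetical input in a made-up syntax: name( "string" , "string" )
my $input = q{ menu( "a\"bc" , "x,y" ) };

my ($name, @args, $ok);
pos($input) = 0;

# \G anchors each match where the previous one ended; /c keeps pos() on failure.
if ( $input =~ / \G \s* (\w+) \s* \( /xgc ) {
    $name = $1;
    while ( $input =~ / \G \s* ($RXdqs) \s* /xgc ) {
        push @args, $1;
        last unless $input =~ / \G , /xgc;   # another argument follows a comma
    }
    # The argument list must end with a closing parenthesis.
    $ok = $input =~ / \G \) \s* \z /xgc;
    say "$name: ", join(' | ', @args), $ok ? " (ok)" : " (syntax error)";
}
```

Because every match is anchored at the current position, the string regex is never asked to hunt for a quote somewhere later in the input, which is the situation where the atomic group matters most.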