in reply to Re: How do I avoid regex engine bumping along inside an atomic pattern?
in thread How do I avoid regex engine bumping along inside an atomic pattern?

Hi. If there were no x in the comment, the second test would still fail because there is no ' x' to match (note the blank space before the x). But I agree with you that I am trying to do too much with a single regular expression. I believe I can do the following:
my $c = qr/(?>\s|--[^\n]*(?:\n|\z))/; # one whitespace or one comment

# later on, when parsing...
pos($str) = 0;
if ($str =~ m/a/gc) { print "found a\n" } else { print "missing a\n" }
$str =~ m/$c*/gc; # skip any comments, whitespaces
if ($str =~ m/x/gc) { print "found x\n" } else { print "missing x\n" }
I am not sure whether I need to set pos($str) to 0 at the beginning. I am also not sure whether I need to use \G when parsing.

But again, thanks for your ideas!

Re^3: How do I avoid regex engine bumping along inside an atomic pattern?
by tilly (Archbishop) on Aug 24, 2008 at 18:54 UTC
    You don't need to set pos($str) to 0 at the beginning - it is undef by default, which has the same effect. However, a failed match resets it unless the match uses the /c modifier (as yours do), so without /c you would need to restore it after every failed match before you try to match again.
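
    A small sketch of how pos() behaves may help here. The sample string "ab" and the variable names are assumptions made up for illustration; the behavior shown (a failed /g match clears pos(), while /gc keeps it) is what perlop describes.

```perl
use strict;
use warnings;

my $str = "ab";    # hypothetical sample input

# pos() is undef until a /g match succeeds, so matching starts at offset 0.
my $initial = defined pos($str) ? pos($str) : 'undef';

$str =~ m/a/g;     # succeeds, pos() is now 1
my $after_success = pos($str);

$str =~ m/z/g;     # fails without /c, so pos() is reset to undef
my $after_plain_fail = defined pos($str) ? pos($str) : 'undef';

$str =~ m/a/g;     # succeeds again from the start, pos() is 1
$str =~ m/z/gc;    # fails with /c, so pos() is kept
my $after_c_fail = defined pos($str) ? pos($str) : 'undef';

print "initial:               $initial\n";
print "after success:         $after_success\n";
print "after failure (no /c): $after_plain_fail\n";
print "after failure (/c):    $after_c_fail\n";
```

    So with /gc on every match, as in your snippet, the position should survive a failed attempt and the next alternative can be tried from the same spot.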

    But you do need to use \G, or else you get your original problem. Using a \G at the start of your RE says, "Does this match right where I left off?" Leaving it out says, "Search forward from where I left off until it matches." So the unanchored form will search ahead and find matches inside comments. With \G, your parser decides at each position whether it is looking at a comment or a token; without it, the regex engine skips ahead on its own and that knowledge is lost.
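
    To make the difference concrete, here is a minimal sketch that runs the same three steps with and without \G. The input string and the two sub names are hypothetical, chosen so that an unexpected token ("b") comes before a comment containing an "x".

```perl
use strict;
use warnings;

# One whitespace character or one "--" comment running to end of line.
my $c = qr/(?>\s|--[^\n]*(?:\n|\z))/;

# Hypothetical input: the only "x" is inside the comment.
my $str = "a b -- x\n";

# Without \G: each match searches forward from pos().
sub scan_unanchored {
    my ($s) = @_;
    return 'missing a' unless $s =~ m/a/gc;
    $s =~ m/$c*/gc;                     # skip whitespace and comments
    return $s =~ m/x/gc ? 'found x' : 'missing x';
}

# With \G: each match must start exactly at pos().
sub scan_anchored {
    my ($s) = @_;
    return 'missing a' unless $s =~ m/\Ga/gc;
    $s =~ m/\G$c*/gc;                   # skip whitespace and comments
    return $s =~ m/\Gx/gc ? 'found x' : 'missing x';
}

print "unanchored: ", scan_unanchored($str), "\n";   # found x (inside the comment)
print "anchored:   ", scan_anchored($str), "\n";     # missing x (next token is "b")
```

    The unanchored version steps over the "b" and lands on the "x" inside the comment, while the anchored version stops at the "b" and reports the failure at the position where it happened.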

    About the second test, I suspect you didn't say exactly what you meant to say in the original question...