in reply to String order in regex match - left to right, or right to left?

The difference is that you have to remember that in:
my $ua_name = $1 if ($ua=~ m/(opera|netscape|gecko|msie)/i);
[struck out: The whole regex is checked at every position in your string before moving on to the next position.] The whole regex is checked at the current position in the string before moving on to the next position. So at every position in the string, the engine first checks whether it can match opera; failing that, it THEN checks whether it can match netscape; failing that, it THEN checks whether it can match gecko; and finally, failing everything else, it checks whether it can match msie. If all of these fail, it moves to the next position in the string and checks again, and again, until the regex matches or the string runs out. (Remember: the leftmost match wins.)

In the second piece of code:

my $ua_name = $1 if ( ($ua=~ m/(opera)/i) || ($ua=~ m/(netscape)/i) || ($ua=~ m/(gecko)/i) || ($ua=~ m/(msie)/i) );
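What happens here is that each regex is run against the whole string in turn: the first one scans the entire string for opera, and only if that fails does the next regex scan the entire string for netscape, then gecko, and so on. The first regex that matches anywhere in the string wins, so the order of the regexes decides the result, not the order of the words in the string. I can elaborate if need be.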
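
To make the difference concrete, here is a minimal sketch that runs both approaches on the same input. The $ua value is an assumed sample built from the fragments quoted in this thread, not the original poster's actual string.

```perl
use strict;
use warnings;

# Assumed sample user-agent string (illustrative only).
my $ua = 'Mozilla/5.0 (X11; U) Gecko/20011019 Netscape6/6.2';

# One regex with alternation: the leftmost position where ANY alternative
# matches wins, so the word that appears first in $ua is captured.
my ($alt) = $ua =~ m/(opera|netscape|gecko|msie)/i;

# Separate regexes tried one after another: each scans the whole string
# before the next is tried, so the order of the regexes decides the result.
my $seq;
if    ($ua =~ m/(opera)/i)    { $seq = $1 }
elsif ($ua =~ m/(netscape)/i) { $seq = $1 }
elsif ($ua =~ m/(gecko)/i)    { $seq = $1 }
elsif ($ua =~ m/(msie)/i)     { $seq = $1 }

print "alternation: $alt\n";   # prints "alternation: Gecko"
print "sequential:  $seq\n";   # prints "sequential:  Netscape"
```

With this input, the alternation reports Gecko because it occurs earlier in the string, while the chained regexes report Netscape because the netscape regex is tried before the gecko one.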

update: perhaps an example would help better explain how the regex engine is working, at the command line try this one-liner:

perl -Mre=debug -le "$s = 'a man a plan a camel';print $1 if $s =~ /(camel|plan|monkey)/;"
Also, I struck out the first sentence, which was causing confusion, and replaced it with a new first sentence. Thanks, Zero_Flop, for pointing out my poor wording. -enlil

Re: (2) String order in regex match - left to right, or right to left?
by Zero_Flop (Pilgrim) on Jun 07, 2003 at 02:56 UTC
    Enlil stated "The difference is that you have to remember that in:
    my $ua_name = $1 if ($ua=~ m/(opera|netscape|gecko|msie)/i);

    The whole regex is checked at every position in your string before moving on to the next position."

    I would follow this train of thought too, but it does not actually explain what is happening. If this were the case, Netscape would be found first, going left to right.

    Does it instead go right to left, or is the comparison Alex posted different from what he is running, or is it something completely different?

    Thanks

      Gecko comes first in the string that is being searched. When the regex engine is trying to match a regex in a string, it starts at the first character of the string and tries to match from the beginning of the regex. If it can't match at that position, it moves to the next character of the string and starts trying to match again (unless you anchor the regex). In this case, the match keeps failing until it gets to the point in the string where 'Gecko' appears. At that point, the regex engine says to itself: "Can this match 'opera'? No. Can this match 'netscape'? No. Can this match 'gecko'? Yes. Return captured string 'gecko'. Done." Which happens before it gets to the point in the string where 'Netscape' appears.
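
      The walk described above can be sketched as a hand-rolled loop. This is a simplified model for illustration, not how the engine is implemented (the real engine applies optimizations, as the -Mre=debug output shows), and the $ua value is an assumed sample rather than the original poster's string.

```perl
use strict;
use warnings;

# Assumed sample user-agent string (illustrative only).
my $ua   = 'Mozilla/5.0 (X11; U) Gecko/20011019 Netscape6/6.2';
my @alts = qw(opera netscape gecko msie);

my ($found, $at);
POSITION:
for my $pos (0 .. length($ua) - 1) {
    # At each position, try the alternatives in the order they are written.
    for my $alt (@alts) {
        my $chunk = substr($ua, $pos, length $alt);
        if (lc($chunk) eq $alt) {          # case-insensitive, like /i
            ($found, $at) = ($chunk, $pos);
            last POSITION;                 # first success ends the search
        }
    }
}
print "simulated: '$found' at offset $at\n";

# Cross-check against the real regex engine; $-[1] is the start offset
# of the first capture group.
my ($engine_found, $engine_at);
if ($ua =~ m/(opera|netscape|gecko|msie)/i) {
    ($engine_found, $engine_at) = ($1, $-[1]);
}
print "engine:    '$engine_found' at offset $engine_at\n";
```

      Both lines should report the same word and offset: the outer loop over positions, not the order of the alternatives, is what makes the word nearest the start of the string win.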

      kelan


      Perl6 Grammar Student

        Thanks for clearing that up!

        So...

        It takes the first word of the string ("Mozilla" in this case) and compares it to each regex, opera then netscape then gecko ...

        This is the reverse of how I interpreted it. I guess I think about it backwards, as taking opera and comparing that to each word in the string.
        Kelan

        Thank you very much, that does indeed explain it. I think Enlil tried to explain this to me as well, but I didn't get it. Thank you both for your help!

        Alexander Garcia

      I also have problems with this left/right thing :)

      left           right
      |              |
      Gecko/20011019 Netscape6/6.2