in reply to Re: Re: Re: Another regexp question
in thread Another regexp question

Ah.. thank you.

So what I am reading between the lines here is the OR ability in regex.. which I either never knew, or knew and forgot.

So the next question (and you don't have to answer it, I'm just pondering out loud) is: where does the OR finish and the next thing begin?

# sample strings..
$string = 'this is a bing';
$string = 'bing is my name';
$string = 'cows go bonging';
$string = 'cows go bang99';
$string =~ m/^bing|bong|bang\d\d/;

Would I need to put ^ in front of each OR case if I want them to match at the beginning of the line only?
Similarly, if I want all of them to match only when they end with \d\d, do I include it once at the end or in each case?
How does it know where the first OR case starts and where the last OR case ends?
___ /\__\ "What is the world coming to?" \/__/ www.wolispace.com

Re: Re: Re: Re: Re: Another regexp question
by Roger (Parson) on Nov 21, 2003 at 03:53 UTC
    Hi wolis, you have to be more specific with what you are searching for in your regular expression.

    Your regular expression will look for ^bing (bing at the start), bong and bang\d\d (anywhere on the line). If you want to search for all these at the beginning of the line, you can add brackets - ( ... ) -
    m/^(?:bing|bong|bang\d\d)/
    Note that I also added '?:' at the start of the group to tell Perl not to capture what it matches, which makes the match just a bit faster.
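
    As a rough illustration, here is how the two patterns behave on the four sample strings from the question. The variable names $ungrouped and $grouped are made up for this sketch.

```perl
use strict;
use warnings;

# The four sample strings from the question.
my @samples = (
    'this is a bing',
    'bing is my name',
    'cows go bonging',
    'cows go bang99',
);

# Names below are illustrative only.
my $ungrouped = qr/^bing|bong|bang\d\d/;      # ^ applies to "bing" only
my $grouped   = qr/^(?:bing|bong|bang\d\d)/;  # ^ applies to every alternate

for my $string (@samples) {
    my $u = $string =~ $ungrouped ? 'yes' : 'no';
    my $g = $string =~ $grouped   ? 'yes' : 'no';
    print "'$string'  ungrouped: $u  grouped: $g\n";
}
```

    The ungrouped pattern matches the last three strings, because bong and bang\d\d are allowed anywhere on the line. The grouped pattern matches only 'bing is my name'.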

Re: Re: Re: Re: Re: Another regexp question
by davido (Cardinal) on Nov 21, 2003 at 04:01 UTC
    The regex "or" (Alternation, it's called) has fairly low precedence. That means that the ^ binds more closely than the |. The result is that you've got this going on:

    m/
        ^bing
      | bong
      | bang\d\d
    /x;

    I used an "extended regular expression" so that I could put each subexpression (each alternate) on its own line. If you want the ^ to bind to all three, and the \d\d to bind to all three, you must use parentheses to constrain the alternation. And if you aren't trying to capture, use non-capturing parens:

    m/^b(?:i|o|a)ng\d\d/;

    (Note: I factored out everything that is common to all three alternates. That step is unnecessary. You could use (?:bing|bong|bang) too.)

    Alternation may be the best route to follow. But sometimes when you see it factored down as the previous example, you might suddenly realize, hey, I can do this with a character class too:

    m/^b[ioa]ng\d\d/;
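
    A small sketch to check that the three spellings agree. The test strings and the @patterns / @samples names are invented for this example and are not from the thread.

```perl
use strict;
use warnings;

# Three spellings of "starts with bing, bong or bang, followed by two digits".
my @patterns = (
    [ 'alternation'     => qr/^(?:bing|bong|bang)\d\d/ ],
    [ 'factored'        => qr/^b(?:i|o|a)ng\d\d/       ],
    [ 'character class' => qr/^b[ioa]ng\d\d/           ],
);

# Hypothetical test strings.
my @samples = (
    'bing42 is my name',
    'bong07',
    'bang99 cows',
    'cows go bang99',    # right word and digits, but not at the start
    'bing is my name',   # at the start, but no digits
    'bung12',            # 'u' is not one of the allowed vowels
);

for my $string (@samples) {
    # One yes/no per pattern, in the order listed above.
    my @results = map { $string =~ $_->[1] ? 'yes' : 'no' } @patterns;
    print "'$string': @results\n";
}
```

    Each line should show the same answer three times: yes for the first three strings, no for the last three.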


    Dave


    "If I had my life to live over again, I'd be a plumber." -- Albert Einstein
      Thank you,

      you explained that so very clearly!

      I find the Perl pod fun and useful and a great read, but it's little things like "which binds more strongly, ^ or |?" that I seem to find out only from helpful people on PerlMonks.
