bedanta has asked for the wisdom of the Perl Monks concerning the following question:

I have to do pattern matching against a string, let's say for the following pattern:

tom and (dick or harry or john)

I'm trying with the following regex:

/tom&&(dick||harry||john)/

which doesn't work. Can you suggest an answer? Thanks.

Re: Pattern matching
by Joost (Canon) on Jan 05, 2005 at 12:34 UTC
    tom and (dick or harry or john)
    I'll assume you want to match a string containing "tom" and one (or more) of "dick","harry" or "john" in any order.

    This can't be done easily in one regular expression, but you can use two (or more):

    /tom/ && /dick|harry|john/;
    or
    /tom/ && ( /dick/ || /harry/ || /john/ );
    Note that && and || are Perl operators, which have no corresponding meaning inside a regex. (You can use | for alternation in a regex, but there is no & operator, because the consecutive parts of a regex must already all match, one after another.)
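A minimal sketch of the two-regex test above, wrapped in a helper so it can be reused. The name has_tom_and_friend and the sample strings are made up for illustration, and the match is a plain substring match, so "tomato" would also count as containing "tom".

```perl
use strict;
use warnings;

# Hypothetical helper: true if $str contains "tom" and at least one
# of "dick", "harry" or "john", in any order (substring match).
sub has_tom_and_friend {
    my ($str) = @_;
    return ( $str =~ /tom/ && $str =~ /dick|harry|john/ ) ? 1 : 0;
}

for my $s ( 'tom met harry', 'john saw tom', 'tom alone', 'dick and harry' ) {
    printf "%-16s %s\n", $s, has_tom_and_friend($s) ? 'match' : 'no match';
}
```

The boolean logic stays in Perl, where && and || mean what the OP expected, and each regex only has to describe one thing.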

Re: Pattern matching
by Hena (Friar) on Jan 05, 2005 at 12:51 UTC
    If you mean that a line like 'tom blah blah john' should match (where 'blah blah' can be anything), then this should do it.
    m/(?<=tom).*(dick|harry|john)/
    But I would think Joost's suggestions are more readable.

      That doesn't work when one of the other names comes before "tom". Since you don't capture .*, your look-behind is identical to m/tom.*(dick|harry|john)/. Look-behind also only accepts fixed-width patterns, IIRC.

      ' tom harry ' =~ m/(?<=tom).*(dick|harry|john)/ && print(1, $/); # 1
      ' harry tom ' =~ m/(?<=tom).*(dick|harry|john)/ && print(2, $/); # (nothing)

      Lookahead works, though

      ' tom harry ' =~ m/(?=.*tom).*(dick|harry|john)/ && print('A', $/); # A
      ' harry tom ' =~ m/(?=.*tom).*(dick|harry|john)/ && print('B', $/); # B
        True. Didn't realize that you could get both sides with lookahead. But I was just trying to get the A part working :).
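The lookahead idea can be taken one step further by putting both conditions into lookaheads anchored at the start of the string, so that neither depends on the order of the names. This is only a sketch: the names $both_re and matches_both are invented here, and the /s modifier is an assumption so that . also crosses newlines.

```perl
use strict;
use warnings;

# Both conditions are zero-width lookaheads anchored at the start,
# so each one scans the whole string independently of the other.
my $both_re = qr/^(?=.*tom)(?=.*(?:dick|harry|john))/s;

# Hypothetical wrapper around the regex above.
sub matches_both {
    my ($str) = @_;
    return $str =~ $both_re ? 1 : 0;
}

for my $s ( ' tom harry ', ' harry tom ', ' tom ', ' harry ' ) {
    printf "'%s' => %d\n", $s, matches_both($s);
}
```

Whether this is clearer than two separate regexes joined with && is a matter of taste.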
      What is the significance of "?<=" in your code?
        Look at perldoc perlre:

        "(?<=pattern)"
        A zero-width positive look-behind assertion. For example, "/(?<=\t)\w+/" matches a word that follows a tab, without including the tab in $&. Works only for fixed-width look-behind.
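To see the perldoc example in action, here is a small sketch. The helper name word_after_tab and the sample strings are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first word that directly follows a tab,
# or undef if there is none. The tab is checked but not consumed, so it is
# not part of $& (the matched text).
sub word_after_tab {
    my ($s) = @_;
    return $s =~ /(?<=\t)\w+/ ? $& : undef;
}

for my $line ( "key\tvalue", "no tab here" ) {
    my $word = word_after_tab($line);
    print defined $word ? "matched '$word'\n" : "no match\n";
}
```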
Re: Pattern matching
by prasadbabu (Prior) on Jan 05, 2005 at 12:35 UTC

    First, go through the perldoc for regular expressions.

    $a=~ /(tom)(dick|harry|john)/;

    Prasad

      $a=~ /(tom)(dick|harry|john)/;

      I really doubt that the OP wanted to match only "tomdick", "tomharry" and "tomjohn". Shouldn't there be some provision for whitespace in the string too? ;) The other problem with that solution is: what if the string contains "john tom"? I think the OP wants to match even if the names are in a different order, as long as tom and any of the other three names exist. That's why the two-regexp approach works so elegantly.


      Dave
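The difference is easy to check side by side. The sketch below compares the concatenated pattern with the two-regex test on three sample strings; the sub names concatenated and two_step are invented for this comparison.

```perl
use strict;
use warnings;

# The single concatenated pattern: "tom" immediately followed by a name.
sub concatenated {
    my ($s) = @_;
    return $s =~ /(tom)(dick|harry|john)/ ? 1 : 0;
}

# The two-regex test: both parts present, anywhere, in any order.
sub two_step {
    my ($s) = @_;
    return ( $s =~ /tom/ && $s =~ /dick|harry|john/ ) ? 1 : 0;
}

for my $s ( 'tomharry', 'tom harry', 'john tom' ) {
    printf "%-10s concatenated=%d two-step=%d\n", $s, concatenated($s), two_step($s);
}
```

Only 'tomharry' satisfies the concatenated pattern, while all three strings pass the two-regex test.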

      I assumed the string would be tom followed by any one of the others (dick, harry, john).

      Prasad

Re: Pattern matching
by CountZero (Bishop) on Jan 05, 2005 at 16:55 UTC
    Do not confuse pattern matching with logical (boolean) constructs.

    If you want to match this pattern (and only this one) tom and (dick or harry or john), the regex to use is
    m/^tom and \(dick or harry or john\)$/ (the parentheses must be escaped to match literally; unescaped, they only group).

    If you want to match a pattern which consists of an arbitrary string containing "tom" and (one or more of) "dick", "harry" or "john", with "tom" coming earlier in the string than "dick", "harry" or "john", you could think of using
    m/\b(tom)\b(.*?)\b(dick|harry|john)\b/
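A quick way to see what this ordered pattern captures, and what the \b word boundaries rule out, is sketched below. The helper name tom_then_friend and the test strings are assumptions.

```perl
use strict;
use warnings;

my $ordered_re = qr/\b(tom)\b(.*?)\b(dick|harry|john)\b/;

# Hypothetical helper: returns the three captures on success,
# or an empty list when the string does not match.
sub tom_then_friend {
    my ($s) = @_;
    return $s =~ $ordered_re ? ( $1, $2, $3 ) : ();
}

for my $s ( 'tom called harry', 'harry called tom', 'tomato and john' ) {
    my @parts = tom_then_friend($s);
    print "$s: ", ( @parts ? join( '|', @parts ) : 'no match' ), "\n";
}
```

The second string fails because "tom" must come first, and the third fails because \b will not accept "tom" inside "tomato".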

    The matter suddenly becomes much more difficult if the "or" is to be understood as "xor". This, however, is left as an exercise for the reader.

    CountZero
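For readers who want a starting point on the "xor" exercise, one possible sketch follows. It assumes "xor" means exactly one of the three names is present (a strict three-way xor would also be true when all three appear), it matches whole words only, and the name tom_and_exactly_one is made up. It sidesteps the single-regex difficulty by counting matches in Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: true if "tom" is present as a word and exactly one
# of "dick", "harry", "john" is present as a word (assumed reading of "xor").
sub tom_and_exactly_one {
    my ($str) = @_;
    return 0 unless $str =~ /\btom\b/;
    # grep in scalar context returns how many names matched.
    my $count = grep { $str =~ /\b\Q$_\E\b/ } qw(dick harry john);
    return $count == 1 ? 1 : 0;
}

for my $s ( 'tom and dick', 'tom, dick and harry', 'tom', 'dick' ) {
    printf "%-20s %d\n", $s, tom_and_exactly_one($s);
}
```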

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law