Sameet has asked for the wisdom of the Perl Monks concerning the following question:

Dear all,
I have the following question:
I have a string, and I am using regular expressions to find patterns in it. There are actually 3 patterns that I want to find; they occur in order, but each may be repeated a number of times. Is there a way in Perl regular expressions to get all the occurrences of a pattern? E.g. my string is something like this:
abcdefghijklfghijklabcdefg
Now, to find each occurrence of 'e' followed by 'fg', I need to write a regular expression. So far I have tried 'e.*fg', but that picks up only one occurrence.
Is there a way out?
Regards
Sameet

Replies are listed 'Best First'.
Re: Regarding Regular Expressions
by borisz (Canon) on Aug 17, 2004 at 11:12 UTC
    $string = 'abcdefghijklfghijklabcdefg';
    $c = 0;
    while ( ++$c && $string =~ m/(e(.*?fg){$c})/ ) {
        print "$1\n";
    }
    __OUTPUT__
    efg
    efghijklfg
    efghijklfghijklabcdefg
    Boris
Re: Regarding Regular Expressions
by ccn (Vicar) on Aug 17, 2004 at 10:54 UTC

    if ( $string =~ /e/g ) {
        my $start_pos = pos($string) - 1;
        while ( $string =~ /fg/g ) {
            print substr( $string, $start_pos, pos($string) - $start_pos ), "\n";
        }
    }
    # outputs
    # efg
    # efghijklfg
    # efghijklfghijklabcdefg

    Or if you want to catch 'e.*fg' for every 'e':

    $string = 'abcdefghijklfghijklabcdefg';
    while ( $string =~ /e/g ) {
        my $start_pos = pos($string) - 1;
        while ( $string =~ /fg/g ) {
            print substr( $string, $start_pos, pos($string) - $start_pos ), "\n";
        }
        pos($string) = $start_pos + 1;
    }
    # outputs
    # efg
    # efghijklfg
    # efghijklfghijklabcdefg
    # efg

    see perldoc perlretut, perldoc -f pos
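
For anyone who has not used pos before, here is a minimal sketch (not ccn's code, just an illustration on the same sample string) of how a scalar-context /g match advances pos, and how a failed match resets it. That reset is why the second snippet above has to assign pos($string) by hand after the inner loop. The variable names @ends and $reset are made up for this example.

```perl
use strict;
use warnings;

my $string = 'abcdefghijklfghijklabcdefg';

# In scalar context, /g remembers where the last match ended;
# pos() reports that offset.
my @ends;
while ( $string =~ /fg/g ) {
    push @ends, pos($string);    # offset just past each 'fg'
}

# The loop ended on a failed match, which resets pos() to undef,
# so the next /g search would start from the beginning again.
my $reset = defined pos($string) ? 'no' : 'yes';

print "fg ends at: @ends\n";
print "pos reset after failure: $reset\n";
```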

      hi ccn,
      Thanks a ton. I got the solution.

      Regards
      Sameet
Re: Regarding Regular Expressions
by Random_Walk (Prior) on Aug 17, 2004 at 11:01 UTC
    It depends on what you mean by 'e' followed by 'fg'.

    your string: abcdefghijklfghijklabcdefg

    Here is one example: s/e.*fg/eXfg/;
    abcdeXfg
    As you see, it took e, followed by anything, followed by fg, and it is greedy: it ran all the way to the last fg.

    This is a little more like what I think you want: s/efg/eXfg/g;
    abcdeXfghijklfghijklabcdeXfg
    This looks for efg; the g modifier tells it to keep looking (global).

    Or perhaps you meant this: s/e[^e]*fg/eXfg/g;
    abcdeXfghijklabcdeXfg
    This is e followed by zero or more non e characters followed by fg, searched for multiple times.
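
To compare the three substitutions side by side, here is a small sketch that runs each one on its own copy of the sample string. The variable names ($greedy, $literal, $no_e) are only illustrative.

```perl
use strict;
use warnings;

my $original = 'abcdefghijklfghijklabcdefg';

# Each substitution works on its own copy so the results can be compared.
( my $greedy  = $original ) =~ s/e.*fg/eXfg/;        # greedy, first match only
( my $literal = $original ) =~ s/efg/eXfg/g;         # literal 'efg', all matches
( my $no_e    = $original ) =~ s/e[^e]*fg/eXfg/g;    # no other 'e' in between

print "$greedy\n$literal\n$no_e\n";
```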

    Here is a generic search for p1 followed by p2, then p3, with things that are not the patterns in between: s/p1(?:(?!p1|p2|p3).)*p2(?:(?!p1|p2|p3).)*p3/X/g;
    then I guess you just count the X's or use them as markers for further processing or whatever else you needed.
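
A minimal sketch of that generic search, assuming three hypothetical stand-in patterns (abc, fg and kl) since the real ones were never posted. It expresses "things that are not the patterns" with a negative lookahead, which excludes whole patterns; a bracketed character class can only exclude single characters. The name $gap is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the three patterns.
my ( $p1, $p2, $p3 ) = ( qr/abc/, qr/fg/, qr/kl/ );

# Any run of characters at which none of the patterns begins.
my $gap = qr/(?:(?!$p1|$p2|$p3).)*/s;

my $string = 'abcdefghijklfghijklabcdefg';

# Mark every p1 .. p2 .. p3 sequence with an X, working on a copy.
( my $marked = $string ) =~ s/$p1$gap$p2$gap$p3/X/g;

# Count the sequences without modifying the string.
my $count = () = $string =~ /$p1$gap$p2$gap$p3/g;

print "$marked ($count match)\n";
```

On this string only the leading abc..fg..kl run qualifies, because the later abc is followed by fg but no kl.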

    cheers.

Re: Regarding Regular Expressions
by PerlingTheUK (Hermit) on Aug 17, 2004 at 10:52 UTC
    This should do it:
    $mystr =~ s/e.*?fg/whatever/g;
    The g looks for all occurrences.
    Yet once the occurrences are replaced, the result would be
    mystr: abcdwhateverhijklfghijklabcdwhatever
    which produces a new match (an 'e' from "whatever" followed by the remaining fg). If this is not what you want, I'd suggest:
    $mystr =~ s/e.*?fg/whatever/g while ( $mystr =~ /e.*?fg/ );
    Is there anyone around who can reduce this to only one regexp?
    The revolution of the world will not be stopped anytime soon!

      Solutions involving substitution are not what the OP wants. How about this:

      my @results = $mystr =~ /(efg)/g;
      Taken literally, "e followed by fg" is efg. If other characters may come between, try:
      my @results = $mystr =~ /(e.*?fg)/g;
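
For reference, a quick sketch of what those two list-context matches return on the sample string from the question (assuming $mystr holds it). Matches found with /g do not overlap, so each search resumes after the previous hit. The array names are illustrative.

```perl
use strict;
use warnings;

my $mystr = 'abcdefghijklfghijklabcdefg';

# In list context, /g returns every captured substring at once.
my @literal = $mystr =~ /(efg)/g;
my @lazy    = $mystr =~ /(e.*?fg)/g;

print scalar(@literal), " literal: @literal\n";
print scalar(@lazy),    " lazy:    @lazy\n";
```

On this particular string both forms return the same two hits, one per 'e'.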



      pbeckingham - typist, perishable vertebrate.
        That's wrong: the example has more 'fg's than 'e's, so every hit starts at the first 'e' and ends at the x-th 'fg'.
        $input = 'xxxxefghijklfghijklabcdefg';
        __OUTPUT__
        efg
        efghijklfg
        efghijklfghijklabcdefg
        Boris