in reply to Keeping lookahead assertion from looking to the end of the string?

I am not sure that you are making this harder than it is. If an end marker always follows a start marker, then it is trivial. If, on the other hand, you may see something like:

"start b2 start b2 end start b2 b2 end"

then the only way I see to do it is to iterate over the string with functions like index and substr, removing pieces of it until it is in the form you want, and only then regex'ing it.
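A minimal sketch of that index/substr idea, under assumptions: the helper name innermost_segments is hypothetical, the markers are the literal strings "start" and "end", and a start marker that is followed by another start marker before any end marker is treated as stray and dropped. It pairs each end marker with the nearest start marker before it, so the regex only ever sees clean segments.

```perl
use strict;
use warnings;

# Hypothetical helper: for every close marker, return the text between it
# and the nearest open marker that precedes it. Open markers that were
# already paired, or that never get a close marker, are ignored.
sub innermost_segments {
    my ($str, $open, $close) = @_;
    my @segments;
    my $from = 0;    # offset where the search for the next close marker resumes

    while ((my $close_at = index($str, $close, $from)) >= 0) {
        # Last open marker that ends at or before the close marker.
        my $open_at = rindex($str, $open, $close_at - length($open));

        # Accept it only if it lies in the part of the string not yet consumed.
        if ($open_at >= $from && $open_at + length($open) <= $close_at) {
            my $text_at = $open_at + length($open);
            push @segments, substr($str, $text_at, $close_at - $text_at);
        }
        $from = $close_at + length($close);
    }
    return @segments;
}

my $input    = "start b2 start b2 end start b2 b2 end";
my @segments = innermost_segments($input, "start", "end");

# Only now apply a regex: keep segments that contain "b2" at least twice.
my @wanted = grep { my $n = () = /b2/g; $n >= 2 } @segments;

print "[$_]\n" for @wanted;    # prints "[ b2 b2 ]"
```

For the sample string this yields two segments, " b2 " and " b2 b2 ", and only the second survives the b2 check. Whether dropping the unmatched start marker is the right policy depends on what the data really looks like.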

If, on the off chance, you will always see a start marker, then some text that does NOT include another start marker, and then an end marker, you can simply do this:

my $str = "start b2 end start b2 b2 end start b2 end";
if ($str =~ /.*start((.*?b2.*?b2).*?)end/) {
    print "$1 is between \"start\" and \"end\"\n";
}

Re: Keeping lookahead assertion from looking to the end of the string?
by swackerl (Initiate) on Sep 05, 2002 at 18:22 UTC
    Limbic~Region, thanks. I think that the solution you provided will work the best. I don't know why I didn't think earlier of putting ".*" at the beginning of the pattern to make it match minimally over "start .. end". Thanks for all of the help!
      After taking another look, I think that the original reason why I wanted to use lookahead expressions was that the "start ... end" strings may be nested, as in the example provided by Limbic~Region. It looks like I'll need to use looping and string manipulation in place of a single regular expression. *sigh*
Re: Re: Keeping lookahead assertion from looking to the end of the string?
by jsprat (Curate) on Sep 05, 2002 at 22:06 UTC
    Doesn't always work:

    my $str = "start b2 end start b2 end start b2 end"; #only one b2 between each start and end
    if ($str =~ /.*start((.*?b2.*?b2).*?)end/) {
        print "$1\n";
    }
    __END__
    Output:
    b2 end start b2
    It shouldn't match when there is only one 'b2' between start and end, but it does.

    IMO, the most robust solution is to use a parser (Parse::RecDescent) or multiple regexes.
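One way the "multiple regexes" route might look, as a sketch rather than a definitive solution: the first regex isolates each start ... end pair whose body contains neither marker, and the second checks each body for two occurrences of b2. The function name segments_with_two_b2 is hypothetical, and the sketch assumes the markers are the literal words "start" and "end".

```perl
use strict;
use warnings;

# Hypothetical helper: return the bodies of all "start ... end" pairs that
# contain "b2" at least twice.
sub segments_with_two_b2 {
    my ($str) = @_;

    # Regex 1: capture each body, consuming one character at a time and
    # refusing to step onto another "start" or "end" marker.
    my @segments = $str =~ /start((?:(?!start|end).)*)end/gs;

    # Regex 2: keep only the bodies with two occurrences of "b2".
    return grep { /b2.*b2/s } @segments;
}

for my $str (
    "start b2 end start b2 end start b2 end",       # one b2 per pair
    "start b2 end start b2 b2 end start b2 end",    # middle pair has two
) {
    my @hits = segments_with_two_b2($str);
    print scalar(@hits), " hit(s) in: $str\n";
}
```

Because the first regex cannot cross a marker, the single-b2 string above produces no hits, while the second string produces exactly one. Truly nested pairs would still need a parser such as Parse::RecDescent, since this sketch only ever sees the innermost pair.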