There are a number of replies to your question proposing a variety of solutions, all sticking with variations on your original regular expression that attempt to match everything, capturing different parts of the match with capturing parentheses.

Eventually someone will hit on the right technique, one that isn't plagued by lazy regexp engines, greedy matching, etc. But there's another possibility...

You could make it easier on yourself, not worrying about trying to match ^(.*?) nongreedily, or about the lazy engine, or about (.*?)$ slurping everything up. Do it like this:

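my $pattern = "AB";
print "pattern is $pattern\n";
# Note: with $pattern = "AB" this is /(AB+)/, so the + repeats only the "B".
# Use /((?:$pattern)+)/ if the whole pattern should repeat.
my ( $middle ) = $string =~ /($pattern+)/;
# $` is everything before the match, $' is everything after it.
my ( $start, $end ) = ( $`, $' );
#.... and so on....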

You take a performance hit in every regexp in the program for using $` and $', but as I understand it, introducing capturing parens also introduces a similar performance hit for the current regular expression. And in non-time-critical operations (anything outside of tight loops) you don't really need to worry about the performance anyway, right? So just do it the easy way.

If it turns out that you can't live with the speed hit you take in exchange for programmer convenience, you can dig into other solutions. But the fact is that $`, $', and $& are there to be used, as long as you understand the ramifications of their use. To my knowledge, their use isn't deprecated, and it would seem that newer releases of Perl have even taken steps to make the use of those special variables more speed-friendly.
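If you do end up digging into those other solutions, here is a minimal sketch of two that avoid $` and $' altogether. The first uses the /p modifier with ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH}, which arrived in Perl 5.10; the second takes the match offsets from @- and @+ and slices with substr. The sub names and the sample string are my own inventions for illustration, and I'm assuming the pattern is literal text (hence the \Q...\E) and that the whole pattern should repeat. As I read perlvar, Perl 5.20 and later no longer penalize $`, $& and $' at all, so on a modern perl this is mostly a matter of taste.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper: split a string around the first run of $pattern
# using /p and the ${^PREMATCH}-style variables (Perl 5.10+).
sub split_with_p {
    my ( $string, $pattern ) = @_;
    # (?:...)+ repeats the whole pattern; \Q...\E treats it as literal text.
    if ( $string =~ /(?:\Q$pattern\E)+/p ) {
        return ( ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH} );
    }
    return;    # empty list when nothing matched
}

# Hypothetical helper: same result using the match offsets in @- and @+.
sub split_with_offsets {
    my ( $string, $pattern ) = @_;
    if ( $string =~ /(?:\Q$pattern\E)+/ ) {
        my ( $from, $to ) = ( $-[0], $+[0] );    # start and end of the match
        return (
            substr( $string, 0, $from ),            # before the match
            substr( $string, $from, $to - $from ),  # the match itself
            substr( $string, $to ),                 # after the match
        );
    }
    return;
}

# Sample data is made up; the original question's $string isn't shown here.
my ( $start, $middle, $end ) = split_with_p( "xxABABAByy", "AB" );
print "start=$start middle=$middle end=$end\n";
```

Both subs return an empty list when nothing matches, so you can test the result in list context before using the pieces.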

When the solution becomes so tricky that a dozen follow-up posts are still debating how to accomplish it, I think it's time to invoke Perl's credo, "There is more than one way to do it," and start looking for a simpler solution. To that end, give my example a try.

Hope this helps...

Dave

"If I had my life to do over again, I'd be a plumber." -- Albert Einstein


In reply to Re: Regex help by davido
in thread Regex help by gri6507
