if( $stdout =~ /(?:OVERVIEW|SUMMARY)(.+)(?:AFFECTED\sPRODUCTS|BACKGROUND)/s ) { print "$1\n"; }

You said you want to pick out everything between "OVERVIEW" and "AFFECTED PRODUCTS", so why does your regex also look for "SUMMARY" and "BACKGROUND"?

Here's what's happening: Your first grouping looks for OVERVIEW or SUMMARY. Your last grouping looks for AFFECTED PRODUCTS or BACKGROUND. Between those, your capture of (.+) is greedy, so it will try to match as much as possible. So your first grouping matches OVERVIEW, then your capture greedily matches all the way to the end of the string, then starts working backwards until it finds a point where your last grouping can match. Since BACKGROUND comes later in the string than AFFECTED PRODUCTS, BACKGROUND gets matched.
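A minimal sketch of that behavior, assuming a made-up $stdout (the sample text is hypothetical, since I don't have your real data):

```perl
use strict;
use warnings;

# Hypothetical sample input; your real $stdout will differ.
my $stdout = join "\n",
    'OVERVIEW',
    'A flaw was found in the widget.',
    'AFFECTED PRODUCTS',
    'Widget 1.0',
    'BACKGROUND',
    'Widgets are used everywhere.',
    '';

# Greedy (.+): it runs to the end of the string, then backtracks until
# the closing group can match. The first place that happens, counting
# from the end, is BACKGROUND, so AFFECTED PRODUCTS ends up inside $1.
my ($greedy) = $stdout =~ /(?:OVERVIEW|SUMMARY)(.+)(?:AFFECTED\sPRODUCTS|BACKGROUND)/s;

print "greedy capture:\n$greedy\n";
```

With this sample, the capture includes the "AFFECTED PRODUCTS" line and everything up to (but not including) "BACKGROUND".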

To put it another way, by matching BACKGROUND instead of AFFECTED PRODUCTS, the regex is able to give the longest possible string to your greedy capture in the middle. To fix it, the short answer is to make that captured match non-greedy by changing it to (.+?). A better answer would require understanding why you have those other words in there if you don't need to match them.
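Here is a rough sketch of the non-greedy fix, again with a hypothetical $stdout standing in for your real data. The second pattern is only a guess at what you might want if SUMMARY and BACKGROUND turn out to be unnecessary:

```perl
use strict;
use warnings;

# Hypothetical sample input; your real $stdout will differ.
my $stdout = join "\n",
    'OVERVIEW',
    'A flaw was found in the widget.',
    'AFFECTED PRODUCTS',
    'Widget 1.0',
    'BACKGROUND',
    'Widgets are used everywhere.',
    '';

# Non-greedy (.+?): it takes as little as possible, so the match stops
# at the FIRST place the closing group can match (AFFECTED PRODUCTS).
my ($lazy) = $stdout =~ /(?:OVERVIEW|SUMMARY)(.+?)(?:AFFECTED\sPRODUCTS|BACKGROUND)/s;

# Assumption: if only OVERVIEW and AFFECTED PRODUCTS matter, the
# alternations can be dropped entirely.
my ($simple) = $stdout =~ /OVERVIEW(.+?)AFFECTED\sPRODUCTS/s;

print "non-greedy capture:\n$lazy\n";
print "simplified capture:\n$simple\n";
```

On this sample both captures hold just the text between "OVERVIEW" and "AFFECTED PRODUCTS", including the surrounding newlines.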

Aaron B.
Available for small or large Perl jobs; see my home node.


In reply to Re: Regex problem by aaron_baugher
in thread Regex problem by jayto
