So... I've been working on this for a couple of days now and I'm getting nowhere. I'm a bit of a Perl novice, but here's the situation. I'm trying to parse a massive scalar that contains a log file. I need to build an expression to extract the last set of data, which is marked with a header and a footer. I've figured out how to get there by converting the scalar to an array and going through it line by line until I find my matches and verify that they are the last ones, but that winds up taking a lot of processing time. I'm hoping that the proper regular expression will be able to find my result without having to munch through the entire file so many times.

Here's the example code I've been working with to test my regular expression.

    my $string = " start: end: test code start: real 1 end: real with start: real repeating newlines and more than start: real one instance end: real of the start: real desired string end: real start: end:";

    if ($string =~ /(start: real)((.|\n)*(?!start: )(.|\n)*)*(end: real)/) {
        print "$&";
    }

The goal is for it to match

start: real desired string end: real

and nothing else, but right now it matches from the first "start: real" to the last "end: real".

I can see how to make the expression non-greedy, which would make it possible to capture only the first set, but I don't know how to capture only the last one.
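For reference, here is a minimal sketch of two ways the last set could be captured, assuming the header and footer are the literal strings "start: real" and "end: real" from my test data. The first puts a greedy .* in front of the capture so the engine backtracks from the end of the scalar, and uses a negative lookahead on every character of the body so the match cannot cross another "start: ". The second skips the regex entirely and searches backwards with rindex; last_block is a hypothetical helper name, not anything from a module. I have not benchmarked either against a real log.

```perl
use strict;
use warnings;

# Same test data as above, split across lines only for readability.
my $string = " start: end: test code start: real 1 end: real with start: real "
           . "repeating newlines and more than start: real one instance end: real "
           . "of the start: real desired string end: real start: end:";

# Approach 1: the greedy .* (with /s so . also matches newlines) pushes the
# capture as far right as possible. The body (?:(?!start: ).)*? consumes one
# character at a time and refuses to step onto another "start: ".
my $regex_result;
if ($string =~ /.*(start: real(?:(?!start: ).)*?end: real)/s) {
    $regex_result = $1;
}

# Approach 2 (hypothetical helper): find the last footer, then the nearest
# header at or before it, and cut out the text in between.
sub last_block {
    my ($text, $header, $footer) = @_;
    my $end = rindex($text, $footer);
    return if $end < 0;                      # no footer at all
    my $start = rindex($text, $header, $end);
    return if $start < 0;                    # no header before the footer
    return substr($text, $start, $end + length($footer) - $start);
}
my $rindex_result = last_block($string, 'start: real', 'end: real');

print "$regex_result\n"  if defined $regex_result;
print "$rindex_result\n" if defined $rindex_result;
```

On the test string both should print "start: real desired string end: real". The rindex version never touches the regex engine, so it may be the safer bet for a very large scalar, but it only works when the header and footer are fixed strings.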


In reply to Regex to find the last matching set in a long scalar by superwombat
