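If your file is too large to slurp, and performance is an issue (when is it ever not? :), you might consider this technique: read the file in fixed-size chunks and run the multi-line regex against a buffer that carries any unmatched remainder over to the next read.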

Demo

#! perl -slw
use strict;

$|++;  # Disable buffering on STDOUT for demo

my $re_multi = qr[      # Match 3 lines
    ^name\s*\n          # First starts with "name" (and maybe some whitespace)
    ^-+\n               # Second consists entirely of '-'s
    (^.+)\n             # Third has the stuff we want, so capture to $1
]mx;                    # /m: let ^ match at every line start within the buffer.
                        # /x: ignore incidental whitespace

my $buffer = '';        # Init our buffer to empty

# Grab a manageable chunk of data.
# 16 is silly, for demo only.
# My tests show that 32k/64k seems about right on my system.
# The length() call ensures the residual is retained.
while( sysread( DATA, $buffer, 16, length $buffer ) ) {

    # Find all occurrences in the buffer.
    while( $buffer =~ m[$re_multi]g ) {
        print $1;       # Do something with them

        # Stop the buffer growing bigger than necessary
        # by discarding everything up to the end of the last match.
        $buffer = substr( $buffer, pos($buffer) );
    }

    # This line defends against the buffer growing very large if two occurrences
    # of the pattern are a very long way apart in the file.
    # It works (in this case) by discarding stuff that cannot be part
    # of what we are looking for (and could probably be improved upon).
    # This must be tailored on a case-by-case basis.
    $buffer = substr( $buffer, pos($buffer) - 1 ) if $buffer =~ m[\n(?!name)]gc;
}

__DATA__
Loads'a junk a garbage and irrelevent crap
Loads'a junk a garbage and irrelevent crap Loads'a junk a garbage and irrelevent crap
name
----------------------------------------
1 23 4.5 678e9
Loads'a junk a garbage and irrelevent crap
Loads'a junk a garbage and irrelevent crap Loads'a junk a garbage and irrelevent crap
Loads'a junk a garbage and irrelevent crap
Loads'a junk a garbage and irrelevent crap Loads'a junk a garbage and irrelevent crap
name
----------------------------------------
1 23 4.5 678e9
Loads'a junk a garbage and irrelevent crap Loads'a junk a garbage and irrelevent crap
Loads'a junk a garbage and irrelevent crap
Loads'a junk a garbage and irrelevent crap Loads'a junk a garbage and irrelevent crap
name
----------------------------------------
1 23 4.5 678e9
Loads'a junk a garbage and irrelevent crap
Loads'a junk a garbage and irrelevent crap Loads'a junk a garbage and irrelevent crap

Output

D:\Perl\test>258244
1 23 4.5 678e9
1 23 4.5 678e9
1 23 4.5 678e9
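The defensive trim in the demo is tuned to that particular data, and it can throw away a "name" line if that line happens to follow the first newline left in the buffer. One possible generalisation, offered only as a sketch: since the pattern spans exactly three lines, any match still to come can start no earlier than the last two complete lines in the buffer, so everything before them can be dropped. The names `scan_records` and `trim_to_tail` are hypothetical (they are not in the demo), the sketch uses `read` rather than `sysread` so that it also works on an in-memory filehandle, and it assumes `[ \t]*` is an acceptable stand-in for `\s*` so that the first line cannot swallow newlines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: keep only the last $keep complete lines, plus any
# trailing partial line, and discard everything before them.
sub trim_to_tail {
    my( $buf_ref, $keep ) = @_;
    my $cut = length $$buf_ref;
    for ( 1 .. $keep + 1 ) {
        return if $cut <= 0;                        # ran out of buffer
        $cut = rindex( $$buf_ref, "\n", $cut - 1 ); # previous newline
        return if $cut < 0;                         # too few lines to trim
    }
    substr( $$buf_ref, 0, $cut + 1, '' );           # drop through that newline
}

# Hypothetical driver: scan $fh in $chunk-byte reads, return all captures.
sub scan_records {
    my( $fh, $chunk ) = @_;
    my $re = qr[
        ^name[ \t]*\n       # line 1: "name", optional trailing blanks
        ^-+\n               # line 2: all dashes
        (^.+)\n             # line 3: the payload, captured to $1
    ]mx;

    my $buffer = '';
    my @found;
    # Append each chunk after whatever was retained from the previous pass.
    while( read( $fh, $buffer, $chunk, length $buffer ) ) {
        my $last_end = 0;
        while( $buffer =~ m[$re]g ) {
            push @found, $1;
            $last_end = pos $buffer;                # remember end of match
        }
        # Discard everything up to the end of the last match ...
        substr( $buffer, 0, $last_end, '' ) if $last_end;
        # ... then everything that can no longer start a 3-line match.
        trim_to_tail( \$buffer, 2 );
    }
    return @found;
}

# Usage: an in-memory filehandle stands in for a large file.
my $sample = "junk\nname\n----\n1 23 4.5 678e9\njunk\n";
open my $fh, '<', \$sample or die "open: $!";
print "$_\n" for scan_records( $fh, 16 );
```

Because both trims cut immediately after a newline, the buffer always begins at a real line start, so `^` cannot match in the middle of a line. For a pattern that spans N lines, pass N-1 to `trim_to_tail`.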

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller

In reply to Re: RegEx on more than one line by BrowserUk
in thread RegEx on more than one line by jmaya
