mrras25:

If you really want to step back a few lines, you can just keep a buffer of the last few lines read. However, I'd suggest parsing out the elements as you find them and inserting them only once you determine the record is "interesting". If the record turns out not to be interesting, clear your list of elements and keep going. Something like this (see Note 1):

use strict;
use warnings;

my %largerHash;   # Place to accumulate interesting records
my %elements;     # Place to accumulate data into records

# Reads the sample data below; swap DATA for your own filehandle
while (<DATA>) {
    if (/^(yabba|dabba|doo)\s+(.*)/) {
        # We only care about some of the fields
        $elements{$1} = $2;
    }
    elsif (/End of record ID:\s+(.*)/) {
        my $id = $1;   # Save the ID before the next match resets $1
        if ($id =~ /^foo/) {
            # Interesting record (ID starts with foo), so store a copy
            $largerHash{$id} = { %elements };
        }
        # Since we found end of record, clear our workspace
        %elements = ();
    }
}
__DATA__
Record 1
scooby 7
dooby 8
yabba Fred
dabba Wilma
End of record ID: cupcake
Record 2
doo not fold spindle staple or mutilate
dabba Barney
yabba Dino
End of record ID: foobar

In this example, we collect a couple of fields in record 1, but at the end of the record we find that its ID (cupcake) isn't interesting, so we discard the elements we collected. Then we collect more items, and at the end of record 2 we find that its ID (foobar) is interesting, so we add the elements to the larger hash that you want to process after parsing the data.
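
For completeness, here's a rough sketch of the other approach I mentioned, keeping a buffer of the last few lines read. The helper name last_lines_before and its arguments are made up for illustration, and the sample lines are borrowed from record 2 above, so treat it as a starting point rather than a finished solution:

```perl
use strict;
use warnings;

# Hypothetical helper: walk the lines in order and return the (at most)
# $keep lines that came immediately before the first line matching $pattern.
sub last_lines_before {
    my ($lines, $pattern, $keep) = @_;
    my @buffer;
    for my $line (@$lines) {
        return @buffer if $line =~ $pattern;
        push @buffer, $line;
        shift @buffer if @buffer > $keep;   # Drop the oldest line
    }
    return;   # Pattern never matched
}

my @lines = (
    'Record 2',
    'doo not fold spindle staple or mutilate',
    'dabba Barney',
    'yabba Dino',
    'End of record ID: foobar',
);

# Keep the two lines preceding the interesting end-of-record marker
my @context = last_lines_before(\@lines, qr/^End of record ID:\s+foo/, 2);
print "$_\n" for @context;
```

The buffer never holds more than $keep lines, so memory stays flat however large the file is. You'd still have to re-parse whatever you pull back out of the buffer, which is why I'd lean toward the accumulate-then-decide version above.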

Note 1: Untested and quite possibly bad syntax, as I've been wrestling a bunch of .Net and C++ code for the last couple of weeks.

...roboticus

Insert witty banter here.


In reply to Re^3: Parsing a Large file with no reason by roboticus
in thread Parsing a Large file with no reason by mrras25
