Greetings, esteemed monks! Allow this humble pony to drink the sweet nectar of knowledge from the font of your collective wisdom. (Or alternatively, how 'bout some hard cider?)

I need to read a number of files. In each file, each line holds a piece of data, or a marker indicating the beginning or end of a section; I'm interested only in data in a specific section. Normally, I'd do something like this:

    foreach my $HANDLE (@HANDLES) {
        while(<$HANDLE>) {
            chomp;
            next unless /^PP_START$/ .. /^PP_END$/;
            # process line
        }
    }

However, it turns out that in these log files, the section end marker may be omitted if there is no following section; in that case, the end of the file itself marks the end of the section.

This wreaks havoc with the above logic, as the flip-flop operator, not having seen the marker, still evaluates to true when the outer loop moves on to the next file, and wrongly causes lines before the start marker in that file to be processed.

Of course it would be trivial to add a flag indicating whether I'm in the right section, and reset that for each file. But doing that would essentially manually emulate the flip-flop operator, which strikes me as less than elegant. So I'm wondering -- is there a way to "reset" the flip-flop operator, as it were, so that it starts returning false again at the beginning of each new file?

(I know that working sample code/data is appreciated. Give me a moment and I'll cook something up.) Here's a sample script:

    #!/usr/bin/perl
    use Modern::Perl '2014';

    my @HANDLES = map {
        open my $HANDLE, "<", $_ or die "Could not open $_: $!\n";
        $HANDLE;
    } @ARGV;

    foreach my $HANDLE (@HANDLES) {
        while(<$HANDLE>) {
            chomp;
            next unless /^PP_START$/ .. /^PP_END$/;
            say;
        }
    }

And two sample files (say log1.txt and log2.txt):

    uninteresting #1
    uninteresting #2
    uninteresting #3
    TX_START
    uninteresting #4
    uninteresting #5
    TX_END
    PP_START
    interesting #1
    interesting #2

And:

    uninteresting #1
    uninteresting #2
    uninteresting #3
    TX_START
    uninteresting #4
    TX_END
    PP_START
    interesting #1
    interesting #2
    interesting #3
    PP_END
    uninteresting #5
    uninteresting #6
    uninteresting #7

If you pass these to the script in this order, you'll get:

    PP_START
    interesting #1
    interesting #2
    uninteresting #1
    uninteresting #2
    uninteresting #3
    TX_START
    uninteresting #4
    TX_END
    PP_START
    interesting #1
    interesting #2
    interesting #3
    PP_END

And as you can see, the uninteresting lines from before the PP section in the second file get included in the output.
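
One direction that might avoid a hand-rolled flag is to let end-of-file count as an end marker, so the flip-flop has already switched off by the time the next file starts. The following is only a sketch under assumptions, not a confirmed fix for the real logs: the `@LOGS` strings are hypothetical, abbreviated in-memory stand-ins for log1.txt and log2.txt, output is collected in a made-up `@OUTPUT` array, and plain `strict`/`warnings` replace Modern::Perl so the sketch runs on a stock perl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, abbreviated in-memory stand-ins for log1.txt and log2.txt.
# The first one has no PP_END marker; the file simply ends.
my @LOGS = (
    "uninteresting #1\nPP_START\ninteresting #1\ninteresting #2\n",
    "uninteresting #1\nPP_START\ninteresting #1\nPP_END\nuninteresting #5\n",
);

my @OUTPUT;
foreach my $LOG (@LOGS) {
    # Read the string as if it were a file.
    open my $HANDLE, "<", \$LOG or die "Could not open in-memory file: $!\n";
    while (<$HANDLE>) {
        chomp;
        # The right-hand operand is true on PP_END or on the last line of
        # the file, so the flip-flop is false again before the next file.
        next unless /^PP_START$/ .. (/^PP_END$/ || eof($HANDLE));
        push @OUTPUT, $_;
    }
    close $HANDLE;
}

print "$_\n" for @OUTPUT;
```

The two-dot form matters here: with `..`, the right operand is tested on the same line that turns the operator on, so a file whose very last line is PP_START still resets the state. The three-dot form `...` would wait until the next evaluation, which in this loop would be the first line of the following file.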


In reply to Resetting a flip-flop operator by AppleFritter
