**sigh**

It seems we learn something new about your data every day... as opposed to seeing it described concisely at the outset -- even a humanities guy should be able to manage that (I used to be one myself).

So if your file has:

    LABEL O 1
    blah
    STOP
    LABEL C A
    blah
    STOP
    LABEL O 3
    blah
    STOP

What are you supposed to do with that? Put "LABEL O 2" before "LABEL C A"? After it? Instead of it? Don't put it in at all? If you get "LABEL C A" and then "LABEL C C", are you supposed to fill in a "LABEL C B" as well? I suppose you probably have "LABEL X (hex number)" also, and you need to invert their order if the file contains the string "goober"...

Whatever the next wrinkle may be, the answer is most likely "no, you don't need more than one loop". You just need to provide enough "if ... elsif ... else ..." conditions in the single "while" loop over data blocks to cover all the possible scenarios.

(And of course, you need to be able to describe these extra conditions clearly and without ambiguity; if you can't state them coherently so a human can understand them, you won't be able to write code to do it, either. My best advice: document the algorithm first, then code it.)
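To make that concrete, here is a minimal sketch of "document first, then code" for a single loop. Every rule in it is an assumption made up for illustration, not something known about the real data: blocks are assumed to end with "STOP\n", missing "LABEL O" blocks are filled in just before the next "LABEL O" block that turns up, and "LABEL C" blocks pass through untouched. The name fill_blocks and the PLACEHOLDER line are hypothetical too.

```perl
use strict;
use warnings;

# Documented algorithm (every rule here is an assumption for illustration):
#   1. Read the input one block at a time; a block ends with "STOP\n".
#   2. "LABEL O <octal>" blocks are numbered; if numbers were skipped,
#      emit placeholder blocks for them before the block just read.
#   3. "LABEL C <letter>" blocks pass through and do not affect the count.
#   4. Anything else passes through with a warning.
sub fill_blocks {
    my ( $in, $out ) = @_;
    local $/ = "STOP\n";    # read whole blocks, not single lines
    my $expected_id = 1;
    while ( my $block = <$in> ) {
        if ( $block =~ /^LABEL O ([0-7]+)\n/ ) {
            my $found_id = oct($1);    # label digits are octal
            while ( $expected_id < $found_id ) {
                printf {$out} "LABEL O %o\nPLACEHOLDER\nSTOP\n", $expected_id++;
            }
            $expected_id = $found_id + 1;
            print {$out} $block;
        }
        elsif ( $block =~ /^LABEL C [A-Z]\n/ ) {
            print {$out} $block;
        }
        else {
            warn "unrecognized block at input record $.\n";
            print {$out} $block;
        }
    }
}
```

Calling fill_blocks(\*STDIN, \*STDOUT) would turn this into a filter. If the real rule is "put the missing block somewhere else" or "fill in LABEL C gaps as well", that is one more branch or a changed branch in the same loop, not a second loop.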

As for handling the octal stuff, try altering the top of the while loop like this:

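    while (<IN>) {
        # octal string for the label we expect next
        my $exp_idstr = sprintf( "%o", $expected_id );
        # plain (unescaped) parens, so that $1 really captures the label
        if ( s/(LABEL O $exp_idstr)\n/$1 NEWSTUFF/ ) {
            ...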
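In case the octal conversion is the unclear part, here is a small standalone sketch of it. The helper name octal_label is made up for illustration; sprintf with "%o" and oct() are the standard Perl builtins. The counter stays an ordinary number and is only turned into octal digits when the label text is needed.

```perl
use strict;
use warnings;

# Hypothetical helper: the label text for a given block counter.
sub octal_label {
    my ($id) = @_;
    return sprintf( "LABEL O %o", $id );
}

# Counting 7, 8, 9 in decimal gives the octal labels 7, 10, 11.
print octal_label($_), "\n" for 7 .. 9;

# oct() goes the other way: octal digits read from the file become a number.
my $next_id = oct("17") + 1;          # 15 + 1
print octal_label($next_id), "\n";    # LABEL O 20
```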

Update: sorry about the flame... and I wanted to add that there could be situations where a second (and maybe even third) pass over the data would simplify the process a lot -- e.g. on one pass, you handle all insertions of missing data blocks; on another pass, you make sure the data blocks are properly sorted; then maybe yet another pass (now that all blocks are present and in order) to add specific new lines of data to specific blocks.

In this case, I would actually recommend that each stage/pass be written as a separate script: keep each script as simple, clear and reliable as possible, in order to do just one thing and do it right. Then run the scripts in succession over the data. (That's what pipeline commands are for: cat input | pass1 | pass2 | pass3 > output.)
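As a rough sketch of what one such single-purpose script might look like, here is a hypothetical "pass3" that only adds lines to specific blocks. It assumes the earlier passes have already supplied and ordered the blocks, that blocks end with "STOP\n", and that the new line goes right after the label line; the %extra_line table and the name add_lines are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical third pass: by now every block is present and in order,
# so the only job here is to add a line to the blocks that need one.
# The table of additions is made up for illustration.
my %extra_line = (
    'LABEL O 2' => "NEWSTUFF for block 2\n",
);

sub add_lines {
    my ( $in, $out ) = @_;
    local $/ = "STOP\n";    # assumed block terminator
    while ( my $block = <$in> ) {
        my ($label) = $block =~ /^(LABEL \S+ \S+)\n/;
        if ( defined $label and exists $extra_line{$label} ) {
            # put the new line directly after the label line
            $block =~ s/^\Q$label\E\n/$label\n$extra_line{$label}/;
        }
        print {$out} $block;
    }
}

# To use as a filter, finish the script with:
#   add_lines( \*STDIN, \*STDOUT );
```

Because the script reads STDIN and writes STDOUT, it can sit anywhere in the pipeline, and it can be tested on its own with a small sample file before the other passes exist.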


In reply to Re^7: adding lines at specific addresses by graff
in thread adding lines at specific addresses by pindar
