An alternative approach could be to treat the parsing of your file as a state machine: initialize $status=0 and then, for each line of the file, determine the "type" of the line (title line, name, kegg, function evidence, process evidence, component evidence, other).

Switch on the line type and do as follows:
title line: if $status>0 call the output function (see later); then extract locus tag and name to two vars, initialize $kegg, $function, $process, $component as "unknown", set $status=1.
kegg: strip the "KEGG pathway:" portion of the line and put the remainder in $kegg, set $status=2.
function evidence: set $function='' and $status=3.
process evidence: set $process='' and $status=4.
component evidence: set $component='' and $status=5.
other: depending on the value of $status (between 2 and 5), append the line to the corresponding var. If $status<2, do nothing.

At end of file, if $status>0, call the output function again (this is needed to output the last block).

The output function should take the values stored in the 6 vars and print them to the output file.
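
The steps above could be sketched roughly as follows. This is only a minimal sketch, not tested against your real data: the sample input, the LOCUS title-line layout, the regexes for the evidence headers, the "; " separator and the helper names output_record and append_to are all assumptions of mine; only the "KEGG pathway:" prefix and the $status values come from the description above. Records are collected in @records so the example stays self-contained; in a real script you would print to your output file instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# HYPOTHETICAL sample input: the real file layout is not known here,
# so every pattern below must be adapted to the actual data.
my $sample = <<'END';
LOCUS ABC_0001 dnaA
KEGG pathway: Two-component system
Function evidence:
DNA binding
ATP binding
Process evidence:
DNA replication initiation
LOCUS ABC_0002 dnaN
Component evidence:
cytoplasm
END

my @records;        # stand-in for the output file
my $status = 0;     # 0 = nothing seen yet, 1..5 = states described above
my ($locus, $name) = ('unknown') x 2;
my ($kegg, $function, $process, $component) = ('unknown') x 4;

# Output function: joins the six vars into one tab-separated row.
sub output_record {
    push @records,
        join("\t", $locus, $name, $kegg, $function, $process, $component);
}

# Append a continuation line to a field ("; " separator is an assumption).
sub append_to {
    my ($ref, $text) = @_;
    $$ref .= ($$ref eq '' ? '' : '; ') . $text;
}

# Read the sample through an in-memory filehandle.
open my $in, '<', \$sample or die "Cannot open sample: $!";
while (my $line = <$in>) {
    chomp $line;
    if ($line =~ /^LOCUS\s+(\S+)\s+(\S+)/) {          # title line (assumed layout)
        my ($new_locus, $new_name) = ($1, $2);
        output_record() if $status > 0;               # flush the previous block
        ($locus, $name) = ($new_locus, $new_name);
        ($kegg, $function, $process, $component) = ('unknown') x 4;
        $status = 1;
    }
    elsif ($line =~ /^KEGG pathway:\s*(.*)$/) {       # kegg
        $kegg   = $1;
        $status = 2;
    }
    elsif ($line =~ /^Function evidence/i)  { $function  = ''; $status = 3; }
    elsif ($line =~ /^Process evidence/i)   { $process   = ''; $status = 4; }
    elsif ($line =~ /^Component evidence/i) { $component = ''; $status = 5; }
    elsif ($line =~ /\S/) {                           # other (non-blank) line
        if    ($status == 2) { append_to(\$kegg,      $line) }
        elsif ($status == 3) { append_to(\$function,  $line) }
        elsif ($status == 4) { append_to(\$process,   $line) }
        elsif ($status == 5) { append_to(\$component, $line) }
        # $status < 2: do nothing
    }
}
close $in;

output_record() if $status > 0;    # needed to output the last block

print "$_\n" for @records;
```

Fields that never appear in a block stay "unknown", and the final call after the loop is what emits the last record.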

Rule One: Do not act incautiously when confronting a little bald wrinkly smiling man.


In reply to Re: parsing multiple lines by psini
in thread parsing multiple lines by sm2004
