If I understand the OP description and sample data correctly, I think toolic's solution with split would need to change slightly -- either here:
my @tokens = split /\t+/; # split on one or more consecutive tabs
or here:
elsif ((scalar @tokens) == 6) { # if there are 6 fields (and 3 are empty)
But of course you wouldn't want to make both changes, because that wouldn't work.
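To make the mismatch concrete, here is a minimal sketch. The sample line is invented for illustration: I am assuming the second line of a pair has two repeated fields, three empty fields and a "number=NN" field, as I read the OP's data.

```perl
use strict;
use warnings;

# Hypothetical second line of a pair (values invented for illustration):
# two repeated fields, three empty fields, then "number=NN".
my $line = "chr1\t100\t\t\t\tnumber=42";

my @collapsed = split /\t+/, $line;  # a run of tabs acts as ONE delimiter
my @kept      = split /\t/,  $line;  # every tab delimits; empty fields survive

print scalar(@collapsed), " fields with /\\t+/\n";
print scalar(@kept),      " fields with /\\t/\n";
```

With /\t+/ the empty fields vanish, so a test for 6 fields never matches. Either split on a single tab and keep the test for 6, or keep /\t+/ and test for 3. Doing both just moves the mismatch.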

I think it can be risky to base a solution on just two lines of sample data. In a case like this, we can hope that data lines always come in pairs, that each pair always has the same values in the first two columns, that the first of each pair always has 5 adjacent non-empty fields, that the second always has the 2 "repeated" fields, 3 empty fields and "number=\d+" in a 6th field, that there aren't extra spaces next to any of the field-delimiting tabs, and so on. Wouldn't that be nice...

The question is, what sorts of "deviations" from those patterns do you need to worry about, and what should the script do when those sorts of things pop up (as they almost certainly will)? Just guessing:

use strict;
use warnings;

my @comp;
# open FH in some suitable way...
while (<FH>) {
    s/^\s+//;
    s/\s+$//;
    my @flds = split( / *\t */ );  # tabs might have spaces around them
    if ( @flds == 5 ) {  # presumably first line of pair
        warn "Input line $. replaces previous first-line data: @comp\n"
            if ( @comp );
        @comp = @flds;
    }
    elsif ( @comp and  # skip the comparison if no first line is pending
            @flds == 6 and $flds[5] =~ /number=(\d+)/ and
            $flds[0].$flds[1] eq $comp[0].$comp[1] ) {
        push @comp, $1;
        print join( "\t", @comp ), "\n";
        @comp = ();
    }
    else {
        warn "Input line $. ignored: $_\n";
    }
}
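If you want to try the pairing logic without a real input file, one option is to wrap it in a sub and feed it a few lines. This is only a sketch: the name merge_pairs, the returned structure and the sample lines are all invented here, and the real data may well deviate in ways this doesn't cover.

```perl
use strict;
use warnings;

# Hypothetical helper: the same pairing logic as the loop above, but it
# takes a list of lines and returns (\@merged, \@complaints) instead of
# printing and warning, which makes it easy to check.
sub merge_pairs {
    my @lines = @_;
    my ( @comp, @merged, @complaints );
    my $n = 0;
    for my $line (@lines) {
        $n++;
        ( my $trimmed = $line ) =~ s/^\s+|\s+$//g;   # strip both ends
        my @flds = split / *\t */, $trimmed;         # tolerate spaces near tabs
        if ( @flds == 5 ) {                          # first line of a pair
            push @complaints, "line $n replaces unmatched first line" if @comp;
            @comp = @flds;
        }
        elsif ( @comp and @flds == 6
            and $flds[5] =~ /number=(\d+)/
            and $flds[0] . $flds[1] eq $comp[0] . $comp[1] )
        {
            push @merged, join( "\t", @comp, $1 );   # append the number
            @comp = ();
        }
        else {
            push @complaints, "line $n ignored";
        }
    }
    return ( \@merged, \@complaints );
}

# Invented sample data: two good pairs and one stray 3-field line.
my ( $merged, $complaints ) = merge_pairs(
    "a\tb\tc\td\te",
    "a\tb\t\t\t\tnumber=7",
    "x\ty\tz",
    "p\tq\tr\ts\tt",
    "p \t q\t\t\t\tnumber=12",
);
print "$_\n" for @$merged;
warn "$_\n"  for @$complaints;
```

Collecting the complaints rather than warning right away also makes it easier to count how often each kind of deviation shows up before deciding what the script should do about it.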

In reply to Re: Formatting clue by graff
in thread Formatting clue by cowboy007
