For all you monks out there, this question might seem trivial, but it has stumped my colleagues and me for a while. We have just received a collection of data back and need to reformat it for use with another program. The data is formatted as a 1540132 X 5 matrix. There are 142 samples and 10846 marker measurements for each sample, so 142 X 10846 = 1540132 lines. The lines are arranged in 10846 groups, one per marker, and each group lists the samples from 1 to 142. Column 1 is the sample ID, column 2 is the marker ID, column 3 is unimportant, and columns 4 and 5 are the two observations for each sample at that marker. Thus, it looks like:

JL0001 Cpn_1054417303 420864 C C
JL0002 Cpn_1054417303 420864 C C
JL0003 Cpn_1054417303 420864 C C
JL0004 Cpn_1054417303 420864 C C
JL0005 Cpn_1054417303 420864 C C
JL0006 Cpn_1054417303 420864 C C
JL0007 Cpn_1054417303 420864 C C
JL0008 Cpn_1054417303 420864 C C
JL0009 Cpn_1054417303 420864 C C
JL0010 Cpn_1054417303 420864 C C
JL0011 Cpn_1054417303 420864 C C
JL0012 Cpn_1054417303 420864 C C
JL0013 Cpn_1054417303 420864 C C
JL0014 Cpn_1054417303 420864 C C
JL0015 Cpn_1054417303 420864 C C
JL0016 Cpn_1054417303 420864 C C

What we wish to do is collapse the file so that each sample occupies a single row: keep column 1 (the sample ID) once, then append that sample's column 4 and column 5 values from each successive group of 142 lines, one marker after another. Columns 2 and 3 are dropped. The final matrix should be 142 X 21693 (samples X (markers*2 + 1)):

JL0001 C C C C C C C C ...
JL0002 C C C C C C C C ...
JL0003 C C C C C C C C ...
JL0004 C C C C C C C C ...
JL0005 C C C C C C C C ...
JL0006 C C C C C C C C ...
JL0007 C C C C C C C C ...
JL0008 C C C C C C C C ...
JL0009 C C C C C C C C ...

I'd greatly appreciate anyone's help, as you would be doing a great deed for a group in need.
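
For illustration, the reshaping described above could be sketched roughly as follows. This is only a minimal sketch in Python, not a tested solution for the real file: it assumes the columns are whitespace-separated, that every line has exactly five fields, and that the marker groups appear in the same order for every sample. The function names `reshape_long_to_wide` and `write_wide` are hypothetical.

```python
def reshape_long_to_wide(lines):
    """Collapse 'sample marker position obs1 obs2' records into one row per sample.

    Assumes whitespace-separated fields and that markers appear in the same
    order for every sample, as in the layout described in the post.
    """
    rows = {}  # sample ID -> output row; dicts keep insertion order (Python 3.7+)
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 5:
            raise ValueError(f"line {line_number}: expected 5 fields, got {len(fields)}")
        sample, _marker, _unused, obs1, obs2 = fields
        # Start the row with the sample ID the first time it is seen,
        # then append the two observations for the current marker.
        rows.setdefault(sample, [sample]).extend((obs1, obs2))
    return list(rows.values())


def write_wide(rows, out):
    """Write one space-separated line per sample to a file-like object."""
    for row in rows:
        out.write(" ".join(row) + "\n")
```

With hypothetical file names, it might be used as `with open("long.txt") as src, open("wide.txt", "w") as dst: write_wide(reshape_long_to_wide(src), dst)`. Everything is held in memory, which should be manageable for a 142 X 21693 matrix of short strings. If the input matches the description, each output row has 1 + 2*10846 = 21693 fields.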

In reply to High Density Data Aid - swapping specific combination of lines/columns repeatedly by Renyulb28
