Hello monks, I have a file containing lines such as the following:
Q3KIL4_PSEPF ONE 134 380 1 252 216.3 6.3e-64
Q3M236_ANAVT TWO 107 563 1 468 203.2 5.3e-60
Q3M236_ANAVT THREE 250 494 1 277 219.1 8.6e-65
Q3M5F5_ANAVT FOUR 296 608 1 355 166.2 7.4e-49
Q3M5F5_ANAVT FIVE 299 584 1 304 188.2 1.7e-55
Q3M7Z1_ANAVT SIX 51 181 1 140 99.0 1.2e-28
Q3MAD2_ANAVT SEVEN 107 508 1 468 350.1 3.3e-104
Q3MAD2_ANAVT EIGHT 230 457 1 277 201.1 2.3e-59
Q3MBT3_ANAVT NINE 203 606 1 468 102.5 1.1e-29
Q3MBT3_ANAVT TEN 326 559 1 277 221.6 1.6e-65
Q3MBT3_ANAVT ELEVEN 134 333 1 234 -334.1 2.7e-44
Q3MD63_ANAVT TWELVE 173 491 1 355 248.5 1.2e-73

I will use the name ID for the first column, i.e. the following:
Q3KIL4_PSEPF Q3M236_ANAVT Q3M236_ANAVT Q3M5F5_ANAVT Q3M5F5_ANAVT Q3M7Z1_ANAVT Q3MAD2_ANAVT Q3MAD2_ANAVT Q3MBT3_ANAVT Q3MBT3_ANAVT Q3MBT3_ANAVT Q3MD63_ANAVT

and the name EVALUE for the last column, i.e. the following:
6.30E-064 5.30E-060 8.60E-065 7.40E-049 1.70E-055 1.20E-028 3.30E-104 2.30E-059 1.10E-029 1.60E-065 2.70E-044 1.20E-073
So we are mainly interested in the pair ID->EVALUE.
All the fields are separated by tabs.
I want to do 2 things:
1) Create a file in which every line begins with a different ID, i.e. combine the lines that begin with the same ID, to get:
Q3KIL4_PSEPF ONE 134 380 1 252 216.3 6.3e-64
Q3M236_ANAVT TWO 107 563 1 468 203.2 5.3e-60 THREE 250 494 1 277 219.1 8.6e-65
Q3M5F5_ANAVT FOUR 296 608 1 355 166.2 7.4e-49 FIVE 299 584 1 304 188.2 1.7e-55
Q3M7Z1_ANAVT SIX 51 181 1 140 99.0 1.2e-28
Q3MAD2_ANAVT SEVEN 107 508 1 468 350.1 3.3e-104 EIGHT 230 457 1 277 201.1 2.3e-59
Q3MBT3_ANAVT NINE 203 606 1 468 102.5 1.1e-29 TEN 326 559 1 277 221.6 1.6e-65 ELEVEN 134 333 1 234 -334.1 2.7e-44
Q3MD63_ANAVT TWELVE 173 491 1 355 248.5 1.2e-73

2) Create a file that holds only one line per ID and, if more than one line begins with the same ID, keeps the line with the smallest EVALUE, i.e. get the file:
Q3KIL4_PSEPF ONE 134 380 1 252 216.3 6.3e-64
Q3M236_ANAVT THREE 250 494 1 277 219.1 8.6e-65
Q3M5F5_ANAVT FIVE 299 584 1 304 188.2 1.7e-55
Q3M7Z1_ANAVT SIX 51 181 1 140 99.0 1.2e-28
Q3MAD2_ANAVT SEVEN 107 508 1 468 350.1 3.3e-104
Q3MBT3_ANAVT TEN 326 559 1 277 221.6 1.6e-65
Q3MD63_ANAVT TWELVE 173 491 1 355 248.5 1.2e-73
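
For what it is worth, the two tasks above could probably be sketched in Perl along the following lines. This is only a rough starting point, not a definitive solution: it assumes the ID is the first tab-separated field and the EVALUE the last one, and the sub names (group_by_id, combined_lines, best_lines) and the output file names (combined.txt, best.txt) are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch only. Assumptions: fields are tab-separated, the ID is the
# first field and the e-value is the last field of every record.

# Group records by ID, remembering the order in which IDs first appear.
sub group_by_id {
    my @lines = @_;
    my (@order, %records);
    for my $line (@lines) {
        chomp $line;
        next unless $line =~ /\S/;                 # skip blank lines
        my ($id, @rest) = split /\t/, $line;
        push @order, $id unless exists $records{$id};
        push @{ $records{$id} }, \@rest;           # keep the other fields
    }
    return (\@order, \%records);
}

# Task 1: one line per ID, with all of its records appended after the ID.
sub combined_lines {
    my ($order, $records) = @_;
    my @out;
    for my $id (@$order) {
        my @fields = map { @$_ } @{ $records->{$id} };
        push @out, join "\t", $id, @fields;
    }
    return @out;
}

# Task 2: one line per ID, keeping the record with the smallest e-value.
sub best_lines {
    my ($order, $records) = @_;
    my @out;
    for my $id (@$order) {
        # <=> compares numerically; "6.3e-64" is converted to a number
        my ($best) = sort { $a->[-1] <=> $b->[-1] } @{ $records->{$id} };
        push @out, join "\t", $id, @$best;
    }
    return @out;
}

# Only runs when an input file is given on the command line.
if (@ARGV) {
    my ($order, $records) = group_by_id(<>);

    open my $combined, '>', 'combined.txt' or die "combined.txt: $!";
    print {$combined} "$_\n" for combined_lines($order, $records);
    close $combined or die "combined.txt: $!";

    open my $best, '>', 'best.txt' or die "best.txt: $!";
    print {$best} "$_\n" for best_lines($order, $records);
    close $best or die "best.txt: $!";
}
```

It would be run as something like "perl combine.pl input.txt" (both names hypothetical). A hash of arrays keyed on the ID is used so that the file only has to be read once; the separate @order array is there because hash keys come back in no particular order. One caveat: e-values far below about 1e-308 cannot be represented as ordinary Perl numbers, so the numeric comparison would need a different approach for such data.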

Please give me some help on how to begin and what to use; I am a newbie in Perl...
Thank you all in advance!

In reply to how to combine? by Anonymous Monk
